--- Quote Start ---
Say you have a design with a pipeline depth of 10 that runs at 100 MHz. You put your data in at time = 0, and at time = 10 × (1/100,000,000) s = 100 ns you receive your output result. From then on, a new result is produced every 1/100,000,000 s (10 ns).
In another version of the design, the pipeline depth is only 1 and it runs at only 10 MHz. You put your data in at time = 0, and at time = 1 × (1/10,000,000) s = 100 ns you receive your result. In this case, a new result is produced every 1/10,000,000 s (100 ns).
So the 100 MHz design has 10× the throughput and the same "real-world" latency (100 ns, though it is 10 actual clock cycles) as the 10 MHz design, at the expense of roughly 10× the register usage.
--- Quote End ---
Thanks for explaining in great detail. I think I finally got it!
So with the 100 MHz design, I would get 10 results in the same amount of time, instead of only one result in the 10 MHz case?
Thanks to vjalter as well; that really helped. I guess I was searching for the wrong terms: I kept googling "output latency vs clock frequency" and getting irrelevant results.
Thanks, everyone. It makes much more sense now.
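In case it helps anyone who finds this thread later, here is a minimal Python sketch of the arithmetic from the quoted example. The depths and clock rates are taken straight from the quote; the function and variable names are just my own illustration, not from any real design:

```python
# Compare the two designs from the quote: a 10-stage pipeline at 100 MHz
# versus a 1-stage design at 10 MHz (illustrative numbers only).

PIPELINED = {"depth": 10, "clock_hz": 100_000_000}
SINGLE    = {"depth": 1,  "clock_hz": 10_000_000}

def latency_s(design):
    # Input-to-output latency: pipeline depth * clock period.
    return design["depth"] / design["clock_hz"]

def throughput_per_s(design):
    # Once the pipeline is full, one result completes every clock cycle.
    return design["clock_hz"]

for name, d in [("100 MHz / depth 10", PIPELINED), ("10 MHz / depth 1", SINGLE)]:
    print(f"{name}: latency = {latency_s(d) * 1e9:.0f} ns, "
          f"throughput = {throughput_per_s(d) / 1e6:.0f} M results/s")
```

Running it prints the same 100 ns latency for both designs, but 100 M results/s versus 10 M results/s. So, answering my own question above: the pipelined design doesn't deliver 10 results at once; once the pipeline is full, it delivers one result every 10 ns clock cycle.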
Best regards,
Chris