Hello,
maybe a configuration option could be added that reduces fmax but increases memory throughput? Just from reading the above explanations, a few optimizations might be possible:
1) Save the 1 tick for the cache miss - if I use ldwio and friends, the access bypasses the cache, so it is known in advance to be a miss
2) Combine "prepare read / read signals asserted" to one tick
3) Squeeze one tick out of SDRAM controller ("SDRAM-controller needs 3 clocks to assert CAS after chip-selected internally")
4) Let the SDRAM controller read a few bytes in advance if its "read job queue" is empty ("speculative prereading"). This would at least accelerate memcpy and restoring context from the stack (but of course not random reads).
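To illustrate what I mean in 4), here is a toy model in C. It is only a sketch: the cycle counts, the line size and all names (MISS_CYCLES, HIT_CYCLES, LINE_BYTES, prefetch_t, read_cost, total_cycles) are made up for illustration and are not taken from the real SDRAM controller. The model assumes the controller is idle between reads, so after every read it holds the line just accessed plus the following one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical costs and sizes, chosen only for illustration. */
enum { MISS_CYCLES = 10, HIT_CYCLES = 2, LINE_BYTES = 16 };

typedef struct {
    bool     valid;  /* true once the controller has buffered something */
    uint32_t base;   /* start of the buffered window: current line + next line */
} prefetch_t;

/* Cost in cycles of one read at addr; updates the buffered window. */
unsigned read_cost(prefetch_t *p, uint32_t addr)
{
    /* Unsigned subtraction wraps for addr < base, so one compare covers both bounds. */
    bool hit = p->valid && (uint32_t)(addr - p->base) < 2u * LINE_BYTES;

    /* Assumption: the idle controller then prereads the line after the one just used. */
    p->base  = addr & ~(uint32_t)(LINE_BYTES - 1);
    p->valid = true;

    return hit ? HIT_CYCLES : MISS_CYCLES;
}

/* Total cycles for a sequence of read addresses, starting with an empty buffer. */
unsigned total_cycles(const uint32_t *addrs, size_t n)
{
    assert(addrs != NULL || n == 0);

    prefetch_t p = { false, 0 };
    unsigned cycles = 0;
    for (size_t i = 0; i < n; i++) {
        cycles += read_cost(&p, addrs[i]);
    }
    return cycles;
}
```

With these made-up numbers, eight sequential word reads starting at address 0 cost 24 cycles in the model (one miss, seven hits), while four scattered reads cost 40 cycles (all misses). That is the memcpy versus random read difference I have in mind.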
I'd guess the clock tick overhead in 2) and 3) is only there to achieve a higher fmax, but without having the NIOS source code and taking a closer look at it, this is of course just a wild guess... But if you can get three ticks out of there, you get close to that "X-brand" processor ;)
Depending on how much fmax has to be reduced for that (e.g. I only need 60-70MHz), this would be a helpful optimization.
Are you discussing any optimizations along these lines internally?
Dirk