Forum Discussion
Altera_Forum
Honored Contributor
13 years ago
Both the instruction and data cache Avalon cycles are fed through the same Avalon master interface as uncached data cycles. What you are talking about would probably require 2 or 3 separate Avalon master interfaces (since you wouldn't want a 64-bit bus for the uncached data accesses).
You'd then need to bridge the 32-bit Avalon bus to the 64-bit one in order to allow uncached data accesses to memory. This is getting more and more logic, and that will slow things down further.
More useful would be internal logic, e.g.:
- don't stall the cpu on a memory read until the value is needed.
- post Avalon writes.
- post cache writes waiting for a cache line read.
- predictive instruction cache line reads.
- predictive data cache line reads.
- read the target address for branches (dual port instruction memory into the cpu) in case the branch is taken.
I actually suspect there is very little development of the Nios cpu going on now; it is quite a few years old, and FPGAs are quite a lot larger than they were when it was designed. There have been some changes for running things like Linux (especially to the gcc config), but nothing for small code systems (which require a different set of optimisations).
I have some code that has hard real-time constraints (it has to get around a loop in under 194 clocks). This means I had to minimise the worst case code paths, not speed up the common ones. I know where every cycle stall is!
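To make the worst-case point concrete, here is a minimal sketch in C of the kind of budget check this implies. It is not the actual code: the function names and the idea of tabulating per-path clock counts are assumptions for illustration, and only the 194 clock limit comes from the post. The deadline is decided by the slowest path through the loop, so the average cost does not matter.

```c
#include <stddef.h>

/* Budget from the post: one trip around the loop must take under 194 clocks. */
#define LOOP_BUDGET_CLOCKS 194u

/* Hypothetical helper: return the largest per-path clock count, i.e. the
 * worst case over all code paths through one loop iteration. Each entry is
 * assumed to already include every stall cycle on that path. */
static unsigned worst_case_clocks(const unsigned *path_clocks, size_t n_paths)
{
    unsigned worst = 0;
    for (size_t i = 0; i < n_paths; i++) {
        if (path_clocks[i] > worst)
            worst = path_clocks[i];
    }
    return worst;
}

/* Hypothetical helper: non-zero only if every path is strictly under budget.
 * Speeding up a common path does not help if a rare path still overruns. */
static int meets_loop_budget(const unsigned *path_clocks, size_t n_paths)
{
    return worst_case_clocks(path_clocks, n_paths) < LOOP_BUDGET_CLOCKS;
}
```

Fed a table of hand-counted clocks per path (made-up numbers such as 120, 151 and 187), `meets_loop_budget` passes. Raise any single path to 194 or more and it fails, however cheap the other paths are.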