Altera_Forum
Honored Contributor
21 years ago

Hi Camelot,
You may not know that your sdram reads are probably taking 11 or more clocks *per read*. Even from onchip sram a read is no better than 5 clocks. Writes to ram/sdram are fast, so do whatever you can to avoid reads, but don't worry about writes.

Our solution was to create a Custom Instruction that bypasses the Avalon bus and all of its registered states. So now we read image data coming in from an external fifo in 2 clocks per word, do our operations on it, and then store the results in sdram buffers. These buffers are later dma'd out of the usb port. (dma is always fast, but not too useful if you need to operate on the data.)

The other killer for us was a heavy reliance on bit shifting registers to manipulate image data. It turns out that on the Cyclone these shifts take *one clock per bit*! To solve that we upgraded to the Stratix I. (The lowest end Stratix I barely squeezed into our BOM budget.) I think there are other issues in play as well, but our image processing code runs much, much better on the Stratix Nios II than on the Cyclone Nios II.

Another thing: by using the /S core instead of the /F core you avoid spending one clock checking the data cache on reads.

In your example, if you really only need to check for xf7 at, say, the end of a row, you could dma all but the last word of each row (row-1) and then do a normal read for the tag.

Hope this helps.

Ken
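For anyone following along, the row-at-a-time idea at the end of the post could be sketched roughly like this in C. This is only an illustration under assumptions: `bulk_copy` is a hypothetical stand-in (plain `memcpy` here) for whatever DMA transfer the real system uses, `ROW_TAG` assumes the "xf7" in the post means the byte 0xF7, and `copy_row_and_check_tag` is a made-up name. The point is just that the CPU does a single slow read per row instead of one per pixel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed tag value; the post writes it as "xf7". */
#define ROW_TAG 0xF7u

/* Hypothetical stand-in for a DMA transfer. On real hardware this
 * would queue a DMA job instead of copying with the CPU. */
static void bulk_copy(uint8_t *dst, const uint8_t *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Moves the first row_len-1 bytes of a row in bulk, then does one
 * ordinary CPU read of the last byte to check for the tag.
 * Returns 1 if the last byte equals ROW_TAG, otherwise 0. */
static int copy_row_and_check_tag(uint8_t *dst, const uint8_t *src,
                                  size_t row_len)
{
    if (row_len == 0) {
        return 0;                      /* nothing to copy or check */
    }

    bulk_copy(dst, src, row_len - 1);  /* bulk path: no per-byte CPU reads */

    uint8_t last = src[row_len - 1];   /* the only CPU read of source data */
    dst[row_len - 1] = last;           /* writes are cheap, so store it too */

    return last == ROW_TAG;
}
```

Called once per row, this keeps the expensive reads down to one per row; whether that wins overall depends on how long the rows are and how much setup each DMA transfer costs on your system.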