Hi Guys,
I think I know what your problem is here. Do you have your code stored in the same SDRAM as the data? If so, the SDRAM controller opens a bank and reads the data, then opens another bank and reads the next bit of code, so every access pays the cost of opening a bank. You can fix this in several ways:
1. Put your code somewhere else.
2. If you have an instruction cache and your code is in a loop, this should be OK on every pass through the loop after the first.
3. The SDRAM can have multiple banks open at once, so if the data and code are in different banks you should still get fast performance. Unfortunately the SDRAM controller from Altera does not support this: it always closes the bank, rather than leaving it open, when the new address is in another bank. The SDRAM controller needs to be quite a bit more complex to take care of this. We wrote one, but not for Avalon. You could write your own; it took us about 2 months. I can't distribute ours as it is the property of my old company.
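If you want to check whether your code and data actually land in different banks, you need to know how the controller maps address bits onto bank bits. Here is a rough sketch in Python. The geometry (column width, number of banks, bus width) and the row:bank:column ordering are assumptions I made up for illustration, so substitute the values from your own SDRAM and controller settings. The names bank_of and same_bank are hypothetical helpers.

```python
# Hypothetical geometry: change these to match your SDRAM and controller.
COL_BITS = 8    # assumed column address width
BANK_BITS = 2   # assumed 4 banks
BYTE_BITS = 2   # assumed 32-bit data bus (4 bytes per word)


def bank_of(byte_addr):
    """Bank index for a byte address, assuming row:bank:column bit order."""
    word_addr = byte_addr >> BYTE_BITS
    return (word_addr >> COL_BITS) & ((1 << BANK_BITS) - 1)


def same_bank(code_addr, data_addr):
    """True if both addresses fall in the same bank under the assumed map."""
    return bank_of(code_addr) == bank_of(data_addr)
```

If your controller puts the bank bits at the top of the address instead (bank:row:column), the shift in bank_of changes accordingly, so check the mapping before relying on this.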
I could be wrong about this being the cause of your 12 cycles, but opening a bank and doing a read takes about 5 cycles, and the next read from the same open bank should take 1 cycle. Changing banks when the other bank is already open is, I think, 2 cycles. That is a saving of about 3 cycles per read (5 down to 2), so your 12 cycles would be reduced to about 6 (3 saved on the data read and 3 on the next instruction read).
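To make the arithmetic concrete, here is a small Python model of the two controller behaviours. It is only a sketch: the cycle counts are the approximate figures from this post rather than datasheet values, it assumes each bank stays on a single row, and it ignores refresh. The function names are my own.

```python
# Approximate cycle counts quoted in this post (not datasheet values).
OPEN_AND_READ = 5     # open a bank and do the first read
NEXT_READ = 1         # another read from the bank just accessed
SWITCH_OPEN_BANK = 2  # read from a different bank that is already open


def cycles_single_open_bank(accesses):
    """Controller that closes the current bank whenever the bank changes."""
    total = 0
    open_bank = None
    for bank in accesses:
        if bank == open_bank:
            total += NEXT_READ
        else:
            total += OPEN_AND_READ
            open_bank = bank
    return total


def cycles_multi_open_bank(accesses):
    """Controller that leaves each bank open after its first use."""
    total = 0
    opened = set()
    last_bank = None
    for bank in accesses:
        if bank not in opened:
            total += OPEN_AND_READ
            opened.add(bank)
        elif bank == last_bank:
            total += NEXT_READ
        else:
            total += SWITCH_OPEN_BANK
        last_bank = bank
    return total


# Alternating instruction fetch and data read, in different banks.
pattern = ["code", "data"] * 4
print(cycles_single_open_bank(pattern), cycles_multi_open_bank(pattern))
```

Once both banks are open, each code/data pair costs 4 cycles in the second model instead of 10 in the first, which is the 3 cycles per read mentioned above.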
Good Luck.