Altera_Forum
Honored Contributor
14 years ago

Strange Nios latency

I have an SOPC Nios system with onchip_mem. The program (which resides in onchip_mem) runs very slowly.

In SignalTap I see strange Nios latency.

The time between two simple writes is 48 Nios clock cycles! Why is that? I would expect it to take 4 cycles.

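Test example:

#include <io.h>  /* Nios II HAL header that provides IOWR(base, regnum, data) */

int main()
{
    while (1)
    {
        /* Two consecutive writes to registers 1 and 2 at base 0x00020000 */
        IOWR(0x00020000, 0x1, 0x1);
        IOWR(0x00020000, 0x2, 0x2);
    }
}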
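
For readers unfamiliar with IOWR: its second argument is a register index, not a byte offset. The sketch below is a host-side model of that address calculation, assuming a 32-bit system bus. The helper name iowr_address is hypothetical and is not part of the HAL.

```c
#include <stdint.h>

/* Hypothetical helper (not a HAL function): byte address targeted by
 * IOWR(base, regnum, data), assuming a 32-bit system bus so that each
 * register index advances the address by 4 bytes. */
uint32_t iowr_address(uint32_t base, uint32_t regnum)
{
    return base + regnum * 4u;
}
```

With the values from the test example, iowr_address(0x00020000, 1) is 0x00020004 and iowr_address(0x00020000, 2) is 0x00020008, which matches the addi r3,r3,4 and addi r3,r3,8 instructions in the disassembly posted later in the thread.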

9 Replies

  • Altera_Forum

    What is there at that address? Is it a component? Does it freeze the CPU with a wait_request signal? Is that component or the on-chip memory shared with another master? Is that master active?

  • Altera_Forum

    Also, if the slave is in a different clock domain than the master, the handshake can take some time too.

  • Altera_Forum

    Also check the read latency setting in the onchip_mem properties. It defaults to 1 cycle, but maybe you inadvertently changed it.

  • Altera_Forum

    There is only one master, the Nios, and the clock is the same.

    0x00020000 is onchip_mem, and it doesn't have a wait_request signal.

    How can I tell whether the Nios is stalled or is doing something else? :confused:
  • Altera_Forum

    Read latency is 1.

    In onchip_mem.v I see: the_altsyncram.outdata_reg_a = "UNREGISTERED"

    while the documentation says "On-chip memory components use synchronous, pipelined Avalon-MM slaves".

    Which is correct?
  • Altera_Forum

    Is the memory block 32 bits wide?

    Also look at the generated code: how many instructions are there between the memory cycles?
  • Altera_Forum

    The on-chip memory is 32 bits wide.

    One IOWR (Nios IDE > Disassembly window) is 4 instructions, so the time between the actual writes (stwio) should be 5 cycles. I measure 55 cycles for this code, which means 11 cycles per instruction??? :mad:

    IOWR(0x00020000, 0x1, 0x1);
    0x0001007c <main+32>: movhi r3,2
    0x00010080 <main+36>: addi r3,r3,4
    0x00010084 <main+40>: movi r2,1
    0x00010088 <main+44>: stwio r2,0(r3)

    IOWR(0x00020000, 0x2, 0x2);
    0x0001008c <main+48>: movhi r3,2
    0x00010090 <main+52>: addi r3,r3,8
    0x00010094 <main+56>: movi r2,2
    0x00010098 <main+60>: stwio r2,0(r3)
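
The arithmetic in this reply can be written out as a tiny helper. This is only a sketch of the poster's reasoning: it assumes one instruction per clock as the baseline, and the function name slowdown_factor is made up for illustration.

```c
/* Ratio of the measured interval to the expected one.
 * If the CPU is expected to retire one instruction per clock, this ratio
 * is also the apparent number of clocks spent per instruction.
 * Returns 0 when expected_cycles is 0 to avoid dividing by zero. */
unsigned slowdown_factor(unsigned measured_cycles, unsigned expected_cycles)
{
    if (expected_cycles == 0)
        return 0;
    return measured_cycles / expected_cycles;
}
```

With the figures quoted above, slowdown_factor(55, 5) is 11, which is where the "11 cycles per instruction" comes from.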
  • Altera_Forum

    Found it! The Nios adds 8 cycles to every instruction when you run in debug mode. I compiled the program as Release and started it with Run. In SignalTap I now see 2 cycles per IOWR.

    Thanks, all! ;)
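
A possible software cross-check of the SignalTap numbers is to time the same pair of writes with the HAL timestamp API (alt_timestamp_start and alt_timestamp from sys/alt_timestamp.h). This is a sketch only: it assumes a timestamp timer is configured in the system library, it reuses the 0x00020000 address from the thread, and the function name and the host-side stand-ins under #else are hypothetical, there only so the code compiles off-target.

```c
#include <stdint.h>

#ifdef __NIOS2__
/* Target build: real HAL headers (needs a timestamp timer in the system). */
#include <io.h>
#include <sys/alt_timestamp.h>
#define REG_BASE 0x00020000             /* address used in the thread; adjust as needed */
#define WRITE_REG(n, v) IOWR(REG_BASE, (n), (v))
#define TS_START()      alt_timestamp_start()
#define TS_NOW()        ((uint32_t)alt_timestamp())
#else
/* Host build: hypothetical stand-ins. Each write just bumps a fake tick
 * counter, so the result says nothing about real hardware timing. */
static volatile uint32_t fake_regs[4];
static uint32_t fake_ticks;
#define WRITE_REG(n, v) do { fake_regs[(n)] = (v); fake_ticks++; } while (0)
#define TS_START()      (fake_ticks = 0, 0)
#define TS_NOW()        (fake_ticks)
#endif

/* Average timer ticks per loop iteration (two writes plus loop overhead).
 * Returns -1 if iterations is 0 or no timestamp timer is available. */
long avg_ticks_per_write_pair(uint32_t iterations)
{
    uint32_t i, t0, t1;

    if (iterations == 0 || TS_START() < 0)
        return -1;

    t0 = TS_NOW();
    for (i = 0; i < iterations; i++) {
        WRITE_REG(1, 0x1);
        WRITE_REG(2, 0x2);
    }
    t1 = TS_NOW();

    return (long)((t1 - t0) / iterations);
}
```

Running it once from a Debug session and once from a Release build started with Run should show whether the per-instruction overhead reported above is visible from software as well. The figure includes loop overhead, so it will not match the SignalTap interval exactly.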
  • Altera_Forum

    You haven't said which CPU you are using (/f, /s or /e).

    Are you executing the code out of tightly coupled memory, or via the instruction cache?

    If you are using the instruction cache and are looking at the first 2 writes, then there may be instruction cache fills taking extra time.

    In that case, the times between the later writes would be faster.

    The /f CPU will execute 1 instruction every clock (from tightly coupled instruction memory or instruction cache). 11 clocks is somewhere near the value for a clock-crossing bridge (10 clocks??).

    I've measured 3 clocks for Avalon-MM transfers to a local PIO (1 clock delay configured for reads and writes); on-chip memory might do 0-clock-delay writes.