Forum Discussion

Altera_Forum
10 years ago

How many cycles does each instruction take on Nios II?

In my project, I wrote a loop to blink an LED:

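/* Headers assumed from the standard Nios II HAL; the original post omitted them. */
#include "system.h"                    /* QD_PIO_0_BASE */
#include "alt_types.h"                 /* alt_u32 */
#include "altera_avalon_pio_regs.h"    /* IOWR_ALTERA_AVALON_PIO_DATA */

void delay(void);   /* prototype, so main() can call delay() before its definition */

int main(void)
{
    int i = 0;
    for (i = 0; i < 100; i++)
    {
        IOWR_ALTERA_AVALON_PIO_DATA(QD_PIO_0_BASE, 0xff);   /* LED pins high */
        delay();
        IOWR_ALTERA_AVALON_PIO_DATA(QD_PIO_0_BASE, 0x0);    /* LED pins low */
        delay();
        // printf("Hello NIOS II! %d\n", i);
    }
    return 0;
}

/* Busy-wait for 100000 loop iterations. */
void delay(void)
{
    alt_u32 i = 0;
    while (i < 100000)
    {
        i++;
    }
}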

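The output waveform has a 30 ms period and my clock frequency is 100 MHz. Each period contains two delay() calls of 100000 iterations each,

so it looks like each "i++" iteration takes about 15 clock cycles. Is that right?

How can I find out how many cycles each instruction takes in my project?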
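
The 15-cycle figure can be checked with a little arithmetic: a 30 ms period at 100 MHz is 3,000,000 cycles, each of the two delay() calls gets half of that, and dividing by 100000 iterations gives 15. A minimal sketch in plain C follows. The function name and parameters are made up for illustration, and the result is the cost of a whole loop iteration (load, compare, branch, increment, store), not of the i++ statement alone. It also ignores the PIO writes and the call overhead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: average clock cycles per delay-loop iteration,
 * derived from a measured output period.
 * Assumes the clock is a whole number of MHz and the period is given in us. */
uint64_t cycles_per_iteration(uint64_t clock_hz, uint64_t period_us,
                              uint64_t delays_per_period, uint64_t iterations)
{
    assert(clock_hz >= 1000000u && delays_per_period != 0 && iterations != 0);

    uint64_t cycles_per_period = (clock_hz / 1000000u) * period_us;
    uint64_t cycles_per_delay  = cycles_per_period / delays_per_period;
    return cycles_per_delay / iterations;
}
```

Calling cycles_per_iteration(100000000, 30000, 2, 100000) with the numbers from the post returns 15.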

3 Replies

  • Altera_Forum

    If you are worried about how long Nios II instructions take, be aware that Nios II is a very slow processor. The free one is staggeringly slow and inefficient. If this is a concern, use one of the SoC chips with a built-in ARM processor, or write your algorithm in Verilog or VHDL. Almost any external microcontroller will be faster than Nios II as well.

  • Altera_Forum

    Thanks for your reply.

    I am not worried about it; I just want to know how long Nios II instructions take, since that may be helpful.

    I read the objdump file, which looks like assembly language. For the delay loop with the i++, it shows:

    void delay(void)
    {
    alt_u32 i =0;
    while(i < 100000)
      80031c:   e0ffff17    ldw r3,-4(fp)
      800320:   008000b4    movhi r2,2
      800324:   10a1a7c4    addi r2,r2,-31073
      800328:   10fff92e    bgeu r2,r3,800310 <__reset+0xff7f8310>
    {
    i++;
    }
    }
      80032c:   e037883a    mov sp,fp
      800330:   df000017    ldw fp,0(sp)
      800334:   dec00104    addi sp,sp,4
      800338:   f800283a    ret

    Does each line cost one clock cycle?
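
Before counting cycles, it helps to read what the four instructions in the while() test do. The movhi/addi pair builds the constant 99999 (2 << 16 = 131072, minus 31073), and bgeu branches back to 800310 while 99999 >= i, which is the same test as i < 100000. The sketch below models those three instructions in plain C so the arithmetic can be checked off-target. The function names are made up for illustration, and the instruction semantics are as I understand them from the Nios II instruction set reference.

```c
#include <assert.h>
#include <stdint.h>

/* movhi rB, imm16: rB = imm16 << 16 (lower 16 bits are zero). */
uint32_t nios2_movhi(uint16_t imm16)
{
    return (uint32_t)imm16 << 16;
}

/* addi rB, rA, imm16: rB = rA + sign-extended 16-bit immediate. */
uint32_t nios2_addi(uint32_t ra, int16_t imm16)
{
    return ra + (uint32_t)(int32_t)imm16;
}

/* bgeu rA, rB, label: branch is taken when rA >= rB (unsigned compare). */
int nios2_bgeu_taken(uint32_t ra, uint32_t rb)
{
    return ra >= rb;
}

/* Reproduces the constant load at 800320 and 800324 in the listing. */
uint32_t listing_limit(void)
{
    uint32_t r2 = nios2_movhi(2);      /* 131072 */
    r2 = nios2_addi(r2, -31073);       /* 131072 - 31073 */
    assert(r2 == 99999u);
    return r2;
}
```

With r2 = listing_limit(), nios2_bgeu_taken(r2, i) is true for i from 0 to 99999 and false at i = 100000, so the loop body runs 100000 times.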
  • Altera_Forum

    https://www.altera.com/content/dam/altera-www/global/en_us/pdfs/literature/hb/nios2/n2cpu_nii5v1.pdf

    See "Instruction Performance" on page 5-11, 5-19, or 5-21 depending on what core you're using.

    Your question was about instruction performance, but if you really just care about higher-level C function or loop execution times, AN391 is a good read: https://www.altera.com/content/dam/altera-www/global/en_us/pdfs/literature/an/an391.pdf The Performance Counter IP block in particular is very useful.
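
The general measuring pattern is: read a free-running cycle counter, run the code, read the counter again, and subtract the cost of the reads themselves. The sketch below shows only that pattern. It is not the Performance Counter API from AN391, and every name in it is made up for illustration. On a Nios II HAL system, the counter callback could wrap alt_timestamp() from sys/alt_timestamp.h after a call to alt_timestamp_start(), assuming a timestamp timer is configured in the BSP. The simulated counter is there only so the helper can be tried on a PC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counter source: returns a free-running cycle count. */
typedef uint64_t (*cycle_counter_fn)(void);

/* Cycles spent in work(), with the cost of the counter reads subtracted. */
uint64_t measure_cycles(cycle_counter_fn now, void (*work)(void))
{
    assert(now != 0 && work != 0);

    /* Calibrate: two back-to-back reads with nothing in between. */
    uint64_t c0 = now();
    uint64_t c1 = now();
    uint64_t overhead = c1 - c0;

    uint64_t start = now();
    work();
    uint64_t stop = now();

    uint64_t elapsed = stop - start;
    return elapsed > overhead ? elapsed - overhead : 0;
}

/* Simulated counter for off-target use: every read costs 2 "cycles"
 * and fake_work() advances the count by 1500000. */
uint64_t fake_cycles = 0;
uint64_t fake_now(void) { fake_cycles += 2; return fake_cycles; }
void fake_work(void)    { fake_cycles += 1500000u; }
```

With the simulated counter, measure_cycles(fake_now, fake_work) returns 1500000. On hardware, dividing the measured count for delay() by 100000 gives the average cycles per loop iteration.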

    Many things can be done in a single cycle. But getting the compiler to emit the best code, and constructing optimized hardware, can each become a small research project.

    For example, if you rewrote your delay() in a form that GCC likes a little better, it looks like it would average about 3 cycles per loop iteration on an "F" core.

    
    void delay(void)
    {
      register int i =0;
      const register int limit = 100000;
      for(i=0; i < limit; i++) {
      }
    }
    

    And the assembly (gcc -S foo.c), where .L3 is the loop counter increment, followed at .L2 by the "blt" compare against 100000:

    
    delay:
            addi    sp, sp, -12
            stw     fp, 8(sp)
            stw     r17, 4(sp)
            stw     r16, 0(sp)
            addi    fp, sp, 8
            mov     r17, zero
            movhi   r16, 2
            addi    r16, r16, -31072
            mov     r17, zero
            br      .L2
    .L3:
            addi    r17, r17, 1
    .L2:
            blt     r17, r16, .L3
            addi    sp, fp, -8
            ldw     fp, 8(sp)
            ldw     r17, 4(sp)
            ldw     r16, 0(sp)
            addi    sp, sp, 12
            ret
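
The 3-cycle estimate can be reproduced by adding up the two instructions that run on each pass through the loop: the addi at .L3 and the blt at .L2. The sketch below assumes 1 cycle for addi and 2 cycles for a correctly predicted taken branch. Both costs are my assumptions for an "F" core and should be checked against the "Instruction Performance" table for your core. The names are made up for illustration, and the function prologue and epilogue and the final not-taken branch are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed per-instruction costs; replace with the handbook figures
 * for the core you are actually using. */
enum {
    CYCLES_ADDI         = 1,  /* assumption: simple ALU instruction */
    CYCLES_BRANCH_TAKEN = 2   /* assumption: correctly predicted taken branch */
};

/* Estimated cycles for the .L3/.L2 loop body (addi + blt), run n times. */
uint64_t loop_cycles(uint64_t iterations)
{
    return iterations * (uint64_t)(CYCLES_ADDI + CYCLES_BRANCH_TAKEN);
}

/* Converts a cycle count to microseconds for a clock given in MHz. */
uint64_t cycles_to_us(uint64_t cycles, uint64_t clock_mhz)
{
    assert(clock_mhz != 0);
    return cycles / clock_mhz;
}
```

Under these assumptions, loop_cycles(100000) is 300000 cycles and cycles_to_us(300000, 100) is 3000 us, so each delay() would last about 3 ms at 100 MHz, compared with the 15 ms implied by the 30 ms period of the original loop.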