I didn't use any cycle-count comparisons; instead I connected a standard digital output to a logic analyzer. I set the output at the start of the function and reset it at the end. The function runs inside a large loop, so the set/reset overhead is negligible. The only code difference between the two targets is how the digital line is set/reset. Here is the floating-point code that was run on each processor:
In the main funct:
    pio_data |= set_mask[2];                              /* raise the timing line */
    IOWR_ALTERA_AVALON_PIO_DATA(USER_PIO_BASE, pio_data);
    fsum = TestFloatMult();
    pio_data &= reset_mask[2];                            /* drop the timing line */
    IOWR_ALTERA_AVALON_PIO_DATA(USER_PIO_BASE, pio_data);
Float-multiply benchmark function (a simple multiply/accumulate):
float TestFloatMult(void)
{
    register float *ptr1, *ptr2, sum;
    int i;

    sum = 0;
    ptr1 = &gFloatArray1[0];            /* gFloatArray1 is a randomly generated float array */
    ptr2 = &gFloatArray2[0];            /* gFloatArray2 is a randomly generated float array */
    for (i = 0; i < ARRAY_SIZE; i++)    /* ARRAY_SIZE = 1000 */
    {
        sum += *ptr1++ * *ptr2++;
    }
    return sum;
}
What I found is that the ARM executed it in 940 us and the NIOS in 7.0 ms. I went to great lengths (inspecting in mixed source/assembly mode) to make sure the ARM compiler did not optimize the loop away. Everything looked normal.
Rick