Forum Discussion

Altera_Forum
16 years ago

floating point operations with and without Custom instruction floating point hardware

Hi

I want to compare the execution time of floating point multiplication on Nios II with and without the custom instruction floating point hardware.

I thought that floating point additions and multiplications would not work on Nios II without the custom instruction floating point hardware.

Surprisingly, they do work. Does this mean they are being computed in software?

I have written sample code that uses the timer to test with and without the floating point hardware:

#include <stdio.h>
#include <system.h>
#include "altera_avalon_timer_regs.h"
#include "sys/alt_timestamp.h"

int main()
{
    float A = 32.432;
    float B = 23.33333;
    float C_MUL1,C_MUL2,C_MUL3,C_MUL4,C_MUL5,C_MUL6,C_MUL7,C_MUL8,C_MUL9,C_MUL10 = 0;
    alt_u32 num_ticks = 0;
    alt_u32 time1, time2,time_diff;

    if (alt_timestamp_start() < 0)
    {
        printf("Timer init failed \n");
    }
    else
    {
        printf("Hello from Nios II!\n\n");

        time1 = alt_timestamp();

        C_MUL1 = A*B;
        C_MUL2 = A*B;
        C_MUL3 = A*B;
        C_MUL4 = A*B;
        C_MUL5 = A*B;
        C_MUL6 = A*B;
        C_MUL7 = A*B;
        C_MUL8 = A*B;
        C_MUL9 = A*B;
        C_MUL10 = A*B;

        C_MUL1 = A*B;
        C_MUL2 = A*B;
        C_MUL3 = A*B;
        C_MUL4 = A*B;
        C_MUL5 = A*B;
        C_MUL6 = A*B;
        C_MUL7 = A*B;
        C_MUL8 = A*B;
        C_MUL9 = A*B;
        C_MUL10 = A*B;

        C_MUL1 = A*B;
        C_MUL2 = A*B;
        C_MUL3 = A*B;
        C_MUL4 = A*B;
        C_MUL5 = A*B;
        C_MUL6 = A*B;
        C_MUL7 = A*B;
        C_MUL8 = A*B;
        C_MUL9 = A*B;
        C_MUL10 = A*B;

        time2 = alt_timestamp();

        printf("time1 is %u\n",time1);
        printf("time2 is %u\n",time2);
        printf("time2-time1 = %u\n\n",time2-time1);
        printf("C_MUL1 = %f",C_MUL1);
    }

    return 0;
}

But the results are almost the same in both cases.

I am confused about whether the Nios II processor is using the floating point hardware or not.

How can I find out?

regards

M Kalyansrinivas

5 Replies

  • Altera_Forum

    You'll need to pass in compiler flags so that it knows that the hardware is present. To determine if it's working I normally look at the objdump file to see if software libraries are being replaced with the 'custom' mnemonic. Here is an old FPU custom instruction you can take a look at to see how to make the compiler aware of the hardware: http://www.nioswiki.com/custom_floating_point_unit

  • Altera_Forum

    Also, if you declare constants and want to use the single precision hardware, put 'f' after the value; otherwise the compiler will treat it as a double precision value (and the hardware will not be used).

    For example, write a = 0.0f instead of a = 0.0. Otherwise you would get this behavior even with a hardware single precision multiplier:

    y = 2.0 * 3.0; // compiler will treat this as double precision and not use hardware

    y = 2.0f * 3.0f; // compiler will use single precision hardware
  • Altera_Forum

    As all your multiplications are with constant values, the compiler may be performing them at compile time and optimizing away the actual multiplication.

    If that is the case, try declaring your variables "volatile" which should prevent the compiler from optimizing away the multiplications.
  • Altera_Forum

    Hi

    Instead of adding Verilog files to create the floating point custom instruction hardware, I added it through SOPC Builder: I clicked on the CPU (in SOPC Builder) and added the floating point hardware in the Custom Instructions tab.

    But the system.h file does not show any indication that the floating point hardware was added, whereas if I write Verilog code and add it as a custom instruction, it does show up in system.h.

    I tried adding the f at the end of the floating point values, but I didn't see any performance change.

    I do see one major difference, in logic utilization:

    35% of the FPGA is occupied when no floating point hardware is added,

    and 53% of the FPGA is occupied when the floating point hardware is added.

    But the performance is the same.

    regards

    M Kalyansrinivas
  • Altera_Forum

    Sorry, I thought you were connecting your own FPU to the processor. I didn't notice it at first, but I think Kevin is right. The compiler is probably performing the calculation at compile time, which would explain why the run times are identical. You'll need to declare your variables as volatile to make sure they are not optimized away, but I don't think that will prevent the compiler from performing the multiplications at compile time. If you assign A and B to variables and then perform the calculation on those variables (using the volatile keyword as well), the multiplies should occur at run time.
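
The advice in the replies above (volatile operands, 'f' suffixed constants) can be sketched in plain C. This is only a minimal illustration under stated assumptions, not code from any of the posters: the helper names multiply_at_runtime, suffixed_literal_keeps_float and unsuffixed_literal_promotes_to_double are hypothetical, and comparing sizeof results is used as a stand-in for checking the type of an expression, which assumes float and double have different sizes on the target. On Nios II the call to multiply_at_runtime would presumably sit between the two alt_timestamp() reads from the original post.

```c
#include <assert.h>

/* Hypothetical helper (not from the thread): multiplies a_in by b_in
   `count` times. The volatile copies force every operand read and every
   result write to happen at run time, so the compiler cannot fold the
   product into a constant or drop the repeated multiplies. */
float multiply_at_runtime(float a_in, float b_in, int count)
{
    volatile float a = a_in;
    volatile float b = b_in;
    volatile float product = 0.0f;

    assert(count > 0);
    for (int i = 0; i < count; i++) {
        product = a * b;   /* float * float: a single precision multiply */
    }
    return product;
}

/* Returns 1 if float * (f-suffixed literal) still has the size of float. */
int suffixed_literal_keeps_float(void)
{
    float x = 2.0f;
    return sizeof(x * 3.0f) == sizeof(float);
}

/* Returns 1 if float * (unsuffixed literal) has the size of double,
   i.e. the float operand was promoted to double precision. */
int unsuffixed_literal_promotes_to_double(void)
{
    float x = 2.0f;
    return sizeof(x * 3.0) == sizeof(double);
}
```

Calling multiply_at_runtime(32.432f, 23.33333f, 30) mirrors the thirty multiplies in the question. Whether those multiplies end up as custom instructions or as calls into the software floating point library still has to be confirmed in the objdump output, as the first reply suggests.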