HW Multiplier

Honored Contributor

21 years ago

Another try:

Implementing the simple sequence above as a custom instruction already saves a lot of time, for only 280 LEs in a Cyclone.

I have measured the time for this loop:

int r;
volatile int* pr = &r; // volatile store: prevents the compiler from optimizing the multiply away
for (int a = -1000; a < 1000; a++)
   for (int b = 1000; b > -1000; b--)  // count backwards: prevents further optimizations
      *pr = a*b;

Without the custom instruction: this takes about 22.4 seconds at 50 MHz with the standard Nios II.

With the custom instruction in the mulsi function: 4.3 seconds (without the frame pointer overhead).

With the custom instruction inlined (without the additional function call and ret instruction): 2.4 seconds.
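For anyone who wants to try the inlined variant, here is a minimal sketch of how it could look in C. It assumes the GCC Nios II builtin __builtin_custom_inii and a hypothetical opcode index MUL_CI_N of 0; the real index depends on how the custom instruction was added to the CPU. The names mul_ci and run_benchmark are only illustrative. On any other target it falls back to a plain C multiply so the code can still be compiled and checked.

```c
#include <stdint.h>

/* Hypothetical opcode index of the multiplier custom instruction.
   The real value is assigned when the instruction is added to the CPU. */
#define MUL_CI_N 0

/* Multiply through the custom instruction on Nios II. On other targets,
   fall back to a plain C multiply so the sketch still builds and runs. */
static inline int32_t mul_ci(int32_t a, int32_t b)
{
#if defined(__nios2__)
    return __builtin_custom_inii(MUL_CI_N, a, b);
#else
    /* Multiply as unsigned to get wrap-around behavior without overflow UB. */
    return (int32_t)((uint32_t)a * (uint32_t)b);
#endif
}

/* The measurement loop from the post, using the inlined multiply.
   Returns the last value stored, which is 999 * -999. */
static int32_t run_benchmark(void)
{
    int32_t r = 0;
    volatile int32_t *pr = &r;   /* volatile store keeps the loop alive */
    for (int a = -1000; a < 1000; a++)
        for (int b = 1000; b > -1000; b--)
            *pr = mul_ci(a, b);
    return *pr;
}
```

Because mul_ci is a static inline function, the compiler can place the custom instruction directly in the loop body, which is what removes the call and ret overhead.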

This is a very big advantage for only 280 LEs, I think (22.4 seconds down to 4.3 seconds for 4,000,000 multiplies plus the loop overhead).
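To put the timings in perspective, they can be converted to clock cycles per loop iteration. The sketch below only uses the numbers from this post (50 MHz clock, 2000 x 2000 iterations); the function names are illustrative. It works out to roughly 280 cycles per iteration without the custom instruction, about 54 with it inside the mulsi function, and 30 when inlined. All three figures include the loop and store overhead, not just the multiply.

```c
/* Figures taken from the post: 50 MHz clock, 2000 x 2000 loop iterations. */
#define CLOCK_HZ   50000000.0
#define ITERATIONS (2000.0 * 2000.0)

/* Average clock cycles spent per loop iteration for a measured run time. */
static double cycles_per_iteration(double seconds)
{
    return seconds * CLOCK_HZ / ITERATIONS;
}

/* Speedup of an improved run relative to the baseline run. */
static double speedup(double baseline_s, double improved_s)
{
    return baseline_s / improved_s;
}
```

With these helpers, speedup(22.4, 4.3) is about 5.2 and speedup(22.4, 2.4) is about 9.3.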

But I think the compiler still assumes that a multiply is very expensive, so it tries to replace multiplies with shifts and adds wherever possible. This can reduce the benefit of the custom instruction a lot.
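As an illustration of what such a shift-and-add rewrite looks like, here is a hand-written sketch in C. It is not actual compiler output, and the function names are made up for this example. The first function shows a multiply by the constant 10 rewritten as two shifts and an add; the second is a generic shift-and-add multiply of the kind a software routine would use when no hardware multiplier is available.

```c
#include <stdint.h>

/* x * 10 rewritten as x * 8 + x * 2, using two shifts and one add. */
static uint32_t times10_shift_add(uint32_t x)
{
    return (x << 3) + (x << 1);
}

/* Generic shift-and-add multiply: for every set bit in b, add the
   correspondingly shifted copy of a. Results wrap modulo 2^32. */
static uint32_t mul_shift_add(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    while (b != 0) {
        if (b & 1u)
            result += a;
        a <<= 1;
        b >>= 1;
    }
    return result;
}
```

For a constant with few set bits the rewritten form is only a handful of instructions, so once a fast multiply instruction is available it is not obvious which version wins.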

If anyone is interested, I'll post the Verilog code for the custom instruction.
