Forum Discussion

Altera_Forum (Honored Contributor)
14 years ago

What is the difference between * and an IP core multiplier?

Hi everyone, I have a question about multipliers.

I am new to FPGAs, and I am wondering about the difference between a multiplier constructed with * in HDL and one generated by an IP core. What is the most important difference between these two kinds of multipliers? Is it speed, or something else? Thanks.

12 Replies

  • Altera_Forum (Honored Contributor)

    By any chance, is the module you are testing this with at the top level? If so, I suspect your input and output registers are being packed into the I/O. Either assign those inputs and outputs to virtual pins using the Assignment Editor, or just add a bunch of pipeline stages before and after the multiplication in your HDL file. This will make sure you isolate the multiplier from the I/O. In other words, do this:

    Register --> register --> register --> register --> multiply --> register --> register --> register --> register

    If this causes your timing problems to go away, then don't worry: you won't need that kind of pipelining once you feed the multiplication with on-chip inputs and outputs (and if you do, that means the surrounding logic could use some pipelining).
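
    A minimal Verilog sketch of the register chain above, for timing experiments only. The module name `mult_pipe_wrap`, the port names, and the `WIDTH` and `STAGES` parameters are all hypothetical; the 32-bit signed operands and the four stages on each side are assumptions taken from the thread and the diagram, not anyone's actual design. It uses an inferred `*`; an IP core multiplier could be instantiated in its place.

    ```verilog
    // Hypothetical test wrapper: names, widths and stage count are illustrative.
    module mult_pipe_wrap #(
        parameter WIDTH  = 32,  // operand width (assumed from the 32x32 case)
        parameter STAGES = 4    // registers before and after the multiply
    ) (
        input  wire                      clk,
        input  wire signed [WIDTH-1:0]   a,
        input  wire signed [WIDTH-1:0]   b,
        output wire signed [2*WIDTH-1:0] p
    );
        // Register chains that separate the multiplier from the pins.
        reg signed [WIDTH-1:0]   a_r [0:STAGES-1];
        reg signed [WIDTH-1:0]   b_r [0:STAGES-1];
        reg signed [2*WIDTH-1:0] p_r [0:STAGES-1];
        integer i;

        always @(posedge clk) begin
            // Input side: register --> register --> ...
            a_r[0] <= a;
            b_r[0] <= b;
            for (i = 1; i < STAGES; i = i + 1) begin
                a_r[i] <= a_r[i-1];
                b_r[i] <= b_r[i-1];
            end

            // Inferred signed multiply, captured by the first output register.
            p_r[0] <= a_r[STAGES-1] * b_r[STAGES-1];

            // Output side: ... --> register --> register
            for (i = 1; i < STAGES; i = i + 1)
                p_r[i] <= p_r[i-1];
        end

        assign p = p_r[STAGES-1];
    endmodule
    ```

    With these assumptions the result appears 2*STAGES clock cycles after the inputs are sampled. The synthesis tool is free to retime or repack these registers, so check the fitter report to see where they actually ended up.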
  • Altera_Forum (Honored Contributor)

    --- Quote Start ---

    By any chance, is the module you are testing this with at the top level? If so, I suspect your input and output registers are being packed into the I/O. Either assign those inputs and outputs to virtual pins using the Assignment Editor, or just add a bunch of pipeline stages before and after the multiplication in your HDL file. This will make sure you isolate the multiplier from the I/O. In other words, do this:

    Register --> register --> register --> register --> multiply --> register --> register --> register --> register

    If this causes your timing problems to go away, then don't worry: you won't need that kind of pipelining once you feed the multiplication with on-chip inputs and outputs (and if you do, that means the surrounding logic could use some pipelining).

    --- Quote End ---

    The full history is that I had a design with a signed 32x32 multiply working fine. That multiply was not at the top level.

    Then, Quartus updated to 11.0 and that existing design suddenly failed timing by a huge margin. The failing paths were mostly through the 32x32 multiply. Fmax dropped from 100 MHz to 60 MHz. I tried rebuilding the design in 10.1 and it went back to 100 MHz. Long story short: Altera said they found the problem and it's a bug in 11.0.

    They didn't give a workaround, though, so I've been experimenting in the hopes that I can get code that will always meet timing while still being portable. Sadly, no amount of extra pipelining has brought the design back to the Quartus 10.1 fmax in Quartus 11.0.

    The fmax numbers from my last post were all based around top-level modules, though. I will try adding the extra registers like you mentioned to see if the numbers change.

    Update: Adding three additional series registers to both inputs and three additional series registers to the output increased fmax for both "*" and the Quartus II template code, but only by about 4 MHz.

    So now it's up to ~180 MHz for lpm_mult vs ~94 MHz for the HDL code.