Forum Discussion

Altera_Forum
16 years ago

72-bit version of LPM_DIVIDE

Dear All,

I need to implement a division where the numerator is 72 bits wide and the denominator is 16 bits wide. Of course, it does not synthesize, because the numerator is limited to 64 bits. Any ideas on how to solve this?

Thanks a lot,

David

9 Replies

  • Altera_Forum

    Try using the numeric_std package to do this:

    
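    -- requires: library ieee; use ieee.numeric_std.all;
    signal a : unsigned(71 downto 0);  -- quotient
    signal b : unsigned(71 downto 0);  -- numerator
    signal c : unsigned(15 downto 0);  -- denominator
    ..
    a <= b / c;  -- numeric_std "/" returns a result as wide as the left operand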
    

    Have lots of redundant pipeline stage registers before the divide and turn register retiming on.

    From a discussion in another thread, it appears it will work, but not be as efficient as the megafunction.
  • Altera_Forum

    Thanks, I know this one. I forgot to mention that it cannot be a pipelined version, or, if it is pipelined, the latency must be known so that I can resynchronize the data. (Of course, we are talking about a synthesizable design.)

    d.
  • Altera_Forum

    Then remove the pipelining registers and use the code above.

    But if you're doing a massive divide like this without pipelining, it's not going to run very fast (depending on the device, no more than 2 MHz or so).
  • Altera_Forum

    --- Quote Start ---

    Have lots of redundant pipeline stage registers before the divide and turn register retiming on.

    From a discussion in another thread, it appears it will work, but not be as efficient as the megafunction.

    --- Quote End ---

    My tests indicated that only one set of registers gets moved into the internal logic by register retiming. In that case it would be good to have two sets of input registers and one set of output registers. Three sets did not improve fmax.
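
    The register arrangement described here can be sketched roughly as follows. This is only an illustration, not code from the tests mentioned: the entity name div_retimed, the generics and the port names are made up, the default numerator width is kept at the 64-bit limit discussed in this thread, and whether the synthesis tool actually moves a register set into the divider depends on the tool and on the retiming setting.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    -- Hypothetical wrapper: names and widths are assumptions for illustration.
    entity div_retimed is
      generic (
        N_BITS : positive := 64;  -- numerator width (assumed, at the 64-bit limit)
        D_BITS : positive := 16   -- denominator width
      );
      port (
        clk      : in  std_logic;
        numer    : in  unsigned(N_BITS - 1 downto 0);
        denom    : in  unsigned(D_BITS - 1 downto 0);  -- must be non-zero
        quotient : out unsigned(N_BITS - 1 downto 0)
      );
    end entity div_retimed;

    architecture rtl of div_retimed is
      signal n1, n2 : unsigned(N_BITS - 1 downto 0);
      signal d1, d2 : unsigned(D_BITS - 1 downto 0);
    begin
      process (clk)
      begin
        if rising_edge(clk) then
          -- Two sets of input registers; retiming may push one into the divide.
          n1 <= numer;
          d1 <= denom;
          n2 <= n1;
          d2 <= d1;
          -- One set of output registers after the combinational divide.
          quotient <= n2 / d2;
        end if;
      end process;
    end architecture rtl;
    ```

    As written, the quotient appears three clock cycles after the operands are applied, so the latency is fixed and known.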
  • Altera_Forum

    If you don't care about latency, you could implement this as a very small and fast design using serial division. If the latency is too high, you can shift to a partially parallel division implementation, which will be larger and slower in terms of fmax. Both are based on long division (you know, that stuff you learned in school and never thought you would have to use later in life :)).

  • Altera_Forum

    Besides the interesting discussion about register retiming, it should be mentioned that inferred lpm_divide megafunctions are limited to a 64-bit width as well.

  • Altera_Forum

    Dear all,

    Thanks for the fruitful hints. I decided to go for a fully serial divider. I've found that the 64-bit single-clock version works at only 8 MHz (1s20 device); as I need at least 40 MHz, there is definitely no way to achieve that with a single-clock-cycle algorithm. This means reworking part of the design to queue the data stream.

    d.
  • Altera_Forum

    If you want an example of a fully serial divider, let me know and I can look around for the Verilog prototype I made, which I later ported to C code for C2H.
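
For readers who want a starting point, here is a minimal sketch of a fully serial divider of the kind discussed in this thread. It is not the prototype mentioned above. It assumes a plain restoring (shift-and-subtract) long division that produces one quotient bit per clock, and the entity name serial_div, the generics and the start/done handshake are all hypothetical. Division by zero is not handled.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical bit-serial restoring divider: one quotient bit per clock.
entity serial_div is
  generic (
    N_BITS : positive := 72;  -- numerator width
    D_BITS : positive := 16   -- denominator width
  );
  port (
    clk       : in  std_logic;
    start     : in  std_logic;  -- one-cycle pulse, loads the operands
    numer     : in  unsigned(N_BITS - 1 downto 0);
    denom     : in  unsigned(D_BITS - 1 downto 0);  -- must be non-zero
    quotient  : out unsigned(N_BITS - 1 downto 0);
    remainder : out unsigned(D_BITS - 1 downto 0);
    done      : out std_logic   -- high for one cycle when results are valid
  );
end entity serial_div;

architecture rtl of serial_div is
  -- quo starts as the numerator; numerator bits shift out at the top
  -- while quotient bits shift in at the bottom.
  signal quo   : unsigned(N_BITS - 1 downto 0) := (others => '0');
  signal rem_r : unsigned(D_BITS - 1 downto 0) := (others => '0');
  signal den   : unsigned(D_BITS - 1 downto 0) := (others => '0');
  signal count : natural range 0 to N_BITS := 0;
  signal busy  : std_logic := '0';
begin
  process (clk)
    variable trial : unsigned(D_BITS downto 0);  -- one bit wider than rem_r
  begin
    if rising_edge(clk) then
      done <= '0';
      if start = '1' then
        quo   <= numer;
        den   <= denom;
        rem_r <= (others => '0');
        count <= N_BITS;
        busy  <= '1';
      elsif busy = '1' then
        -- Shift the partial remainder left and bring in the next
        -- numerator bit, most significant bit first.
        trial := rem_r & quo(N_BITS - 1);
        if trial >= den then
          trial := trial - den;                   -- subtraction fits
          quo   <= quo(N_BITS - 2 downto 0) & '1';
        else
          quo   <= quo(N_BITS - 2 downto 0) & '0';
        end if;
        rem_r <= trial(D_BITS - 1 downto 0);
        if count = 1 then                         -- last bit processed
          busy <= '0';
          done <= '1';
        end if;
        count <= count - 1;
      end if;
    end if;
  end process;

  quotient  <= quo;
  remainder <= rem_r;
end architecture rtl;
```

With these assumptions the latency is fixed: one load cycle plus N_BITS iteration cycles, so 73 clock edges for a 72-bit numerator, or roughly 1.8 µs at 40 MHz. The only arithmetic in the critical path is a 17-bit compare and subtract, which is why this style tends to be small and fast compared with a single-cycle divide.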