Forum Discussion
9 Replies
- Altera_Forum
Honored Contributor
Try using the numeric_std package to do this:
signal a : unsigned(71 downto 0);
signal b : unsigned(71 downto 0);
signal c : unsigned(15 downto 0);
..
a <= b/c;
Have lots of redundant pipeline stage registers before the divide and turn register retiming on. From a discussion in another thread, it appears it will work, but not be as efficient as the megafunction.
- Altera_Forum
Honored Contributor
Thanks, I know this one. I forgot to mention that it cannot be a pipelined version, or if it is, the latency must be known so that I can resynchronize the data. (Of course, we are talking about a synthesizable design.)
d.
- Altera_Forum
Honored Contributor
Then remove the pipelining registers and use the above code.
But if you're doing a massive divide like this without pipelining, it's not going to run very fast (depending on the device, no more than 2 MHz or so).
- Altera_Forum
Honored Contributor
--- Quote Start ---
Have lots of redundant pipeline stage registers before the divide and turn register retiming on. From a discussion in another thread, it appears it will work, but not be as efficient as the megafunction.
--- Quote End ---
My tests indicated that only one set of registers gets moved into the internal logic with register retiming. In that case it would be good to have two sets of input registers and one set of output registers. Three sets did not improve fmax.
- Altera_Forum
Honored Contributor
If you don't care about latency, you could implement this to be very small and fast through serial division. If the latency is too high, you can shift to a partially parallel division implementation, which will be larger and slower in terms of fmax. Both are based on long division (you know, that stuff you learned in school and never thought you would have to use later in life :))
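To make the serial option concrete, here is a minimal sketch of a bit-serial (restoring) long divider that produces one quotient bit per clock. The entity name serial_div, the generics, and all port names are made up for illustration, the widths are only defaults, and a zero divisor is not handled. Treat it as a starting point, not a tested design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical bit-serial restoring divider: one quotient bit per clock.
entity serial_div is
  generic (
    N_WIDTH : positive := 64;  -- numerator width (assumed, must be >= 2)
    D_WIDTH : positive := 16   -- denominator width (assumed)
  );
  port (
    clk       : in  std_logic;
    start     : in  std_logic;  -- one-cycle pulse, ignored while busy
    numer     : in  unsigned(N_WIDTH-1 downto 0);
    denom     : in  unsigned(D_WIDTH-1 downto 0);  -- must be non-zero
    busy      : out std_logic;
    done      : out std_logic;  -- high for one cycle when results are valid
    quotient  : out unsigned(N_WIDTH-1 downto 0);
    remainder : out unsigned(D_WIDTH-1 downto 0)
  );
end entity serial_div;

architecture rtl of serial_div is
  -- q_reg starts as the numerator; its MSBs shift out while quotient bits shift in.
  signal q_reg   : unsigned(N_WIDTH-1 downto 0) := (others => '0');
  -- Partial remainder, one bit wider than the denominator.
  signal r_reg   : unsigned(D_WIDTH downto 0) := (others => '0');
  signal d_reg   : unsigned(D_WIDTH-1 downto 0) := (others => '0');
  signal count   : natural range 0 to N_WIDTH := 0;
  signal running : std_logic := '0';
begin
  process (clk)
    variable r_shift : unsigned(D_WIDTH downto 0);
  begin
    if rising_edge(clk) then
      done <= '0';
      if start = '1' and running = '0' then
        -- Load the operands and clear the partial remainder.
        q_reg   <= numer;
        d_reg   <= denom;
        r_reg   <= (others => '0');
        count   <= N_WIDTH;
        running <= '1';
      elsif running = '1' then
        -- Bring down the next numerator bit, as in long division.
        r_shift := r_reg(D_WIDTH-1 downto 0) & q_reg(N_WIDTH-1);
        if r_shift >= d_reg then
          r_reg <= r_shift - d_reg;                  -- subtract succeeds: bit is 1
          q_reg <= q_reg(N_WIDTH-2 downto 0) & '1';
        else
          r_reg <= r_shift;                          -- keep remainder: bit is 0
          q_reg <= q_reg(N_WIDTH-2 downto 0) & '0';
        end if;
        if count = 1 then
          running <= '0';
          done    <= '1';
        end if;
        count <= count - 1;
      end if;
    end if;
  end process;

  busy      <= running;
  quotient  <= q_reg;
  remainder <= r_reg(D_WIDTH-1 downto 0);
end architecture rtl;
```

With this structure, done goes high N_WIDTH + 1 clock edges after start is sampled, so the latency is fixed and known. The critical path is a single compare and subtract of D_WIDTH + 1 bits, which is why the serial form can be clocked much faster than a combinational divide.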
- Altera_Forum
Honored Contributor
Besides the interesting discussion about register retiming, it should be mentioned that inferred lpm_divide megafunctions are limited to a width of 64 bits as well.
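For reference, the register arrangement discussed above (two input register stages and one output stage around an inferred divide) could be sketched as follows. The entity name div_retime, the generics, and the port names are made up for illustration, and the numerator defaults to 64 bits only to stay within the limit just mentioned. Whether the synthesis tool actually moves any of these registers into the divider depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical wrapper: inferred divide with spare registers for retiming.
entity div_retime is
  generic (
    N_WIDTH : positive := 64;  -- numerator width (assumed)
    D_WIDTH : positive := 16   -- denominator width (assumed)
  );
  port (
    clk      : in  std_logic;
    numer    : in  unsigned(N_WIDTH-1 downto 0);
    denom    : in  unsigned(D_WIDTH-1 downto 0);
    quotient : out unsigned(N_WIDTH-1 downto 0)
  );
end entity div_retime;

architecture rtl of div_retime is
  -- Two sets of input registers, as suggested by the retiming tests.
  signal n1, n2 : unsigned(N_WIDTH-1 downto 0);
  signal d1, d2 : unsigned(D_WIDTH-1 downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Input register stage 1
      n1 <= numer;
      d1 <= denom;
      -- Input register stage 2
      n2 <= n1;
      d2 <= d1;
      -- Inferred divide followed by one output register.
      -- numeric_std "/" returns a result as wide as the left operand.
      quotient <= n2 / d2;
    end if;
  end process;
end architecture rtl;
```

As written, the quotient appears three clock edges after the operands are sampled. Retiming only moves registers across logic, so that latency stays the same whether or not the tool rebalances the stages.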
- Altera_Forum
Honored Contributor
Dear all,
Thanks for the fruitful hints. I decided to go for a fully serial divider. I found that the 64-bit single-cycle divider works at only 8 MHz (1s20 device). As I need at least 40 MHz, there is definitely no way to achieve it with a single-clock-cycle algorithm. This means reworking part of the design to queue the data stream.
d.
- Altera_Forum
Honored Contributor
If you want an example of a fully serial divider, let me know and I can look around for the Verilog prototype I made that I shifted over into C code for C2H.
- Altera_Forum
Honored Contributor
That's OK, thanks. I'll just modify one I did before.