Forum Discussion
The figure in the datasheet is the best-case maximum speed, with no routing into or out of the parts. The Fmax your design can achieve depends on your code.
Six years later I have a similar question. I need a 24x32-bit multiplier, a huge one, and Quartus says it needs eight 9x9 multipliers and 84 LUTs.
But the subject of cascaded multipliers is not covered anywhere in detail. From the marketing perspective discussed here, everything is perfect, but it is not clear how well cascaded multipliers perform (I would guess that any decent DSP design needs multipliers bigger than 18x18, so this MUST be covered somewhere), and the circuit that connects them is also not clear.
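To make the cascading question concrete, here is a minimal behavioral sketch in Python of the usual schoolbook decomposition: each operand is split into chunks of at most 18 bits, every pair of chunks goes through one small multiplier, and the shifted partial products are summed. This is only an assumption about the general technique, not a description of what Quartus actually synthesizes. The names `split_chunks` and `cascaded_multiply` are hypothetical, and the operands are treated as unsigned for simplicity.

```python
def split_chunks(value, width, chunk):
    """Split an unsigned `width`-bit value into little-endian `chunk`-bit pieces."""
    mask = (1 << chunk) - 1
    return [(value >> shift) & mask for shift in range(0, width, chunk)]


def cascaded_multiply(a, b, a_width=24, b_width=32, chunk=18):
    """Multiply two unsigned values using only products of chunk-sized pieces.

    Returns (product, number_of_small_multiplies).
    Assumption: one small multiply stands in for one 18x18 hardware multiplier.
    """
    a_parts = split_chunks(a, a_width, chunk)
    b_parts = split_chunks(b, b_width, chunk)
    total = 0
    multiplies = 0
    for i, a_part in enumerate(a_parts):
        for j, b_part in enumerate(b_parts):
            # Each partial product is weighted by the position of its two chunks.
            total += (a_part * b_part) << (chunk * (i + j))
            multiplies += 1
    return total, multiplies


product, multiplies = cascaded_multiply(0xABCDEF, 0x12345678)
print(hex(product), multiplies)
```

With an 18-bit chunk, a 24x32 multiply splits into 2 x 2 = 4 partial products. If one 18x18 multiplier is counted as two 9x9 elements, that would line up with the eight 9x9 multipliers Quartus reports, and the adders that sum the shifted partial products would be a natural place for the extra LUTs to go. That mapping is a guess on my part and is not confirmed by any document in this thread.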
By the way, this material says:
>The embedded multipliers are also seamlessly integrated with the embedded memory blocks in Cyclone III FPGAs. This provides an efficient implementation of DSP algorithms that uses both multiplication and memory operations, such as FIR filters and video processing.
I am doing exactly this, an FIR design, and want to know how the cool Cyclone architecture helps me do it in the best way. Does it assume using the FIR IP core? I see its interface is serial, and its timing is not clear (learning how it works would take more effort and time than designing my own), so I would rather design my own.
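For reference, the "multiplication and memory" pattern from that quote is, as I understand it, the ordinary direct-form FIR: coefficients and a sample delay line sit in storage, and every output is a sum of multiply-accumulate steps over the taps. Below is a minimal behavioral sketch in Python, not an HDL implementation and not the FIR IP core's algorithm. The function name `fir_filter` is hypothetical, and the lists stand in for what would be memory blocks on the device.

```python
from collections import deque


def fir_filter(samples, coeffs):
    """Direct-form FIR: y[n] = sum over k of coeffs[k] * x[n - k]."""
    # Delay line holds the most recent len(coeffs) samples, newest first.
    delay_line = deque([0] * len(coeffs), maxlen=len(coeffs))
    outputs = []
    for x in samples:
        # With maxlen set, pushing on the left drops the oldest sample on the right.
        delay_line.appendleft(x)
        acc = 0
        for coeff, delayed in zip(coeffs, delay_line):
            acc += coeff * delayed  # one multiply-accumulate per tap
        outputs.append(acc)
    return outputs


# An impulse input reproduces the coefficients, a quick sanity check.
print(fir_filter([1, 0, 0, 0], [1, 2, 3]))
```

In hardware the inner loop is either unrolled (one multiplier per tap) or run serially on a single multiplier with the coefficients read from memory one per clock. Which of the two a self-made design should use depends on the sample rate and on how many multipliers are available.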