HRZ
Frequent Contributor
6 years ago
In my experience, the OpenCL compiler generates math IP cores based on the bit width of the result, not of the operands. If your result is 20 bits wide, you will not get one FMA (fused multiply-add) per DSP. I think the result needs to be less than 18 bits (or be exactly a 32-bit float) to get an FMA, since each DSP contains two 18x19 multipliers. Also, if you are using anything older than Quartus/AOC v18.1, do not expect good mapping of math operations to DSPs.
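
To make the result-width point concrete, here is a rough sketch in plain C (not kernel code) that estimates how many bits an unsigned multiply-add result can need. The helper names `product_bits`, `fma_result_bits_upper` and `within_width` are made up for illustration, and the limit of 18 is only the rule of thumb from my post above, not a documented compiler threshold.

```c
#include <assert.h>

/* Bits needed to hold the product of two unsigned operands
   that are a_bits and b_bits wide. */
static unsigned product_bits(unsigned a_bits, unsigned b_bits) {
    assert(a_bits > 0 && b_bits > 0);
    return a_bits + b_bits;
}

/* Safe upper bound on the bits needed for a*b + c (all unsigned):
   the wider of the product and the addend, plus one carry bit.
   The true requirement can be one bit smaller in some cases. */
static unsigned fma_result_bits_upper(unsigned a_bits, unsigned b_bits,
                                      unsigned c_bits) {
    unsigned p = product_bits(a_bits, b_bits);
    unsigned widest = (p > c_bits) ? p : c_bits;
    return widest + 1;
}

/* Hypothetical check: is the result narrower than a given limit?
   The limit (for example 18) is an assumption taken from the post,
   not a value confirmed by the tool documentation. */
static int within_width(unsigned result_bits, unsigned limit) {
    return result_bits < limit;
}
```

For example, two 10-bit operands already give a 20-bit product, so `within_width(product_bits(10, 10), 18)` is false even though both operands would fit an 18x19 multiplier comfortably. Two 8-bit operands with a 16-bit addend stay at 17 bits at most and pass the same check. Whether the compiler then actually packs one FMA per DSP still depends on the tool version, so treat this only as a quick sanity check on your result widths.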