Forum Discussion
Altera_Forum
Honored Contributor
7 years ago
Oh, that paper... That is not really an OpenCL design. They coded pretty much everything in SystemVerilog and then packaged it into an OpenCL kernel as an HDL library. They could do two 16-bit MULs per DSP because they were describing their computation in a low-level language, and that is also how they managed to achieve such a high operating frequency. The paper from Intel, however, claims to describe the design purely in OpenCL.
And yes, the OpenCL report claims it is implementing a 16-bit x 16-bit MUL, but as you saw yourself, the compiler still does not actually pack two such MULs into one DSP; after placement and routing, you still get only one MUL per DSP. That is why I think achieving such behavior in OpenCL might require bit masking.
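To make the bit masking idea concrete, here is a rough sketch of what I have in mind, written as plain C so the arithmetic can be checked on a host. The helper names `mul16_masked` and `mul16_pair` are made up for illustration. The assumption is that masking both operands to their low 16 bits makes the narrow width explicit to the compiler; whether the OpenCL compiler then actually packs the two multiplications into one DSP is not something I have verified.

```c
#include <stdint.h>

/* Hypothetical helper: multiply only the low 16 bits of each operand.
 * The masks make it explicit that the upper 16 bits are unused, so the
 * multiplication is a 16-bit x 16-bit one with a 32-bit result. */
static uint32_t mul16_masked(uint32_t a, uint32_t b)
{
    return (a & 0xFFFFu) * (b & 0xFFFFu);
}

/* Hypothetical helper: two independent masked products side by side.
 * This is the pair one would hope to see mapped onto a single DSP;
 * that mapping is an assumption, not a guaranteed compiler behavior. */
static void mul16_pair(uint32_t a0, uint32_t b0,
                       uint32_t a1, uint32_t b1,
                       uint32_t out[2])
{
    out[0] = mul16_masked(a0, b0);
    out[1] = mul16_masked(a1, b1);
}
```

In a kernel, the same masked expressions would go directly in the kernel body. The only way to tell whether it helped is to check the DSP count after placement and routing, since the report alone can be misleading.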