Forum Discussion
HRZ
Frequent Contributor
7 years ago
The 1366 GFLOP/s figure is actually at 450 MHz. What you guys are missing is that each DSP on Arria 10 can perform one single-precision floating-point FMA (fused multiply-add) operation per clock, which counts as two FLOPs: 450 MHz * 2 * 1518 DSPs = 1366 GFLOP/s. The peak Fmax of the DSPs on Arria 10 is around 450-500 MHz depending on the speed grade. Realistically, even in the best-optimized designs, the achievable throughput on this device is around 700-800 GFLOP/s. Even Intel's highly optimized matrix multiplication library can hardly achieve over 900 GFLOP/s, based on their own paper.
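
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the peak calculation. The DSP count (1518), the 450 MHz clock, and the 900 GFLOP/s library figure are taken from the post above; the function and variable names are made up for illustration and do not come from any Intel tool.

```python
def peak_gflops(num_dsps: int, fmax_mhz: float, flops_per_dsp_per_cycle: int = 2) -> float:
    """Theoretical peak single-precision throughput in GFLOP/s.

    Assumes every DSP issues one FMA per clock (counted as 2 FLOPs),
    which is an upper bound, not what a real design sustains.
    """
    # DSPs * FLOPs per cycle * MHz gives MFLOP/s; divide by 1000 for GFLOP/s.
    return num_dsps * flops_per_dsp_per_cycle * fmax_mhz / 1000.0


ARRIA10_DSPS = 1518        # DSP count quoted in the post
FMAX_MHZ = 450             # DSP clock quoted in the post

peak = peak_gflops(ARRIA10_DSPS, FMAX_MHZ)
print(f"Peak: {peak:.1f} GFLOP/s")

# Fraction of peak for the ~900 GFLOP/s library figure quoted in the post.
library_gflops = 900
efficiency = library_gflops / peak
print(f"Library efficiency: {efficiency:.1%}")
```

Plugging in 500 MHz instead gives the upper end of the theoretical range for the fastest speed grade, still assuming every DSP is busy on every cycle.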