Altera_Forum
Honored Contributor
8 years ago
The FLOPS numbers you are reporting do not make sense; no current FPGA can get even remotely close to 1.5 or 4.3 TFLOPS. Are you sure you are timing your kernels and calculating the FLOPS correctly?
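To make the sanity check concrete, here is a minimal sketch of how I would compute the achieved FLOPS and compare it against a rough peak for the device. The function names and the figures in the usage note are hypothetical, not taken from your kernel or from any datasheet; plug in your own operation count, kernel time, and device numbers.

```c
#include <assert.h>

/* Achieved throughput in GFLOPS.
 * flop_count:      floating-point operations the kernel actually performs
 * elapsed_seconds: kernel execution time only (no host setup or transfers) */
double achieved_gflops(double flop_count, double elapsed_seconds) {
    assert(elapsed_seconds > 0.0);
    return flop_count / elapsed_seconds / 1e9;
}

/* Rough upper bound in GFLOPS: every floating-point unit busy on every cycle.
 * All three inputs are assumptions you must fill in for your own device. */
double peak_gflops(double fp_units, double flops_per_unit_per_cycle,
                   double clock_mhz) {
    return fp_units * flops_per_unit_per_cycle * clock_mhz / 1e3;
}

/* Returns 1 if the measurement is at or below the bound, 0 otherwise. */
int is_plausible(double measured_gflops, double bound_gflops) {
    return measured_gflops <= bound_gflops;
}
```

For example, with a made-up device of 1000 floating-point units doing 2 FLOP per cycle at 300 MHz, peak_gflops(1000.0, 2.0, 300.0) gives 600 GFLOPS, so a measured 1500 GFLOPS fails is_plausible and points to a timing or counting mistake. For the time itself, one option is event profiling: create the queue with CL_QUEUE_PROFILING_ENABLE and read CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END (both in nanoseconds) through clGetEventProfilingInfo.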
Apart from this, since the operation inside your loop does not depend on i, chances are that during synthesis the circuit gets heavily simplified and both turn into something that contains no for loop at all, only an equivalent operation. After all, your loop is equivalent to temp = pow(rands[-1], 256) * temp. Have you compared the area estimate in the OpenCL compiler's report against the final area usage to see how much they differ?
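A small sketch of that equivalence, written as plain C rather than as your actual kernel. The float type, the 256-iteration count taken from the pow expression above, and the function names are my assumptions; r stands for the loop-invariant value read from rands. The folded version computes the 256th power with eight squarings (256 = 2^8), which is one way a long multiply chain can collapse into a handful of multipliers.

```c
#define N_ITER 256

/* Loop as described: the multiplier r is the same on every iteration,
 * so nothing in the body depends on i. */
float loop_version(float temp, float r) {
    for (int i = 0; i < N_ITER; i++) {
        temp = r * temp;
    }
    return temp;
}

/* Equivalent closed form: temp * r^256, with r^256 built by
 * squaring r eight times instead of multiplying 256 times. */
float folded_version(float temp, float r) {
    float p = r;
    for (int k = 0; k < 8; k++) {
        p = p * p;   /* r^2, r^4, ..., r^256 */
    }
    return p * temp;
}
```

The two functions agree mathematically but not necessarily bit for bit, because floating-point multiplication is not associative. Whether the tools actually perform this kind of rewrite depends on the compiler and its floating-point settings, which is why comparing the compiler report with the final area usage is worth doing.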