Forum Discussion
The numbers you see in the area report are only a rough estimate of resource utilization, meant to give you an idea of whether your kernel is going to fit on the FPGA or not. In fact, Quartus v17.1 added a new message to the compiler's output that says so explicitly. The numbers in acl_quartus_report.txt, in contrast, are the final post-place-and-route area and timing results, which are 100% accurate. The early estimate, which is based on a model Altera has developed, exists because compilation and place-and-route can take a couple of hours, and people generally cannot wait that long just to find out whether their kernel fits on the device. Its accuracy is low because modelling area utilization on FPGAs is not easy.
The one exception is DSP utilization, which is accurate in the area report because it is easy to model. The only possible source of discrepancy is a BSP that reserves some DSPs without using them: those DSPs are counted in the area report but not in the final report. In that case, the number of DSPs actually used equals the number of DSPs in the area report minus the number of DSPs reserved by the BSP.
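As a quick illustration of the DSP correction described above, here is a minimal Python sketch. The function name and the numbers are made up for the example; you would read the real values from the area report and from your BSP's documentation.

```python
def actual_dsp_usage(area_report_dsps: int, bsp_reserved_unused_dsps: int) -> int:
    """Return the DSP count expected in the final report.

    area_report_dsps:          DSP count shown in the early area report.
    bsp_reserved_unused_dsps:  DSPs the BSP reserves but does not use
                               (hypothetical input; depends on your board).
    """
    if area_report_dsps < 0 or bsp_reserved_unused_dsps < 0:
        raise ValueError("DSP counts cannot be negative")
    if bsp_reserved_unused_dsps > area_report_dsps:
        raise ValueError("reserved DSPs cannot exceed the area report total")
    # Area report counts the reserved DSPs; the final report does not.
    return area_report_dsps - bsp_reserved_unused_dsps


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(actual_dsp_usage(300, 44))
```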
The operating frequency you see in acl_quartus_report.txt is the final, real Fmax. Fmax cannot be easily estimated, which is why the area report has no estimate for it (early versions of AOC did include one, but it was removed because it was extremely inaccurate). Regarding latency, if you add up the latencies of the different blocks in your kernel, you get the minimum latency of the pipeline. The actual latency is always higher because of stalls from external memory and channel operations. Furthermore, the latency of each block is for one loop iteration (single work-item kernels) or one work-item (NDRange kernels), so to estimate the minimum run time you should multiply the latency of each block by its loop trip count or by the number of work-items that traverse it.
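The latency arithmetic above can be sketched in a few lines of Python. This is only a rough lower-bound calculator under stated assumptions: the block latencies are taken to be in clock cycles, the conversion to seconds uses the final Fmax from acl_quartus_report.txt, and all function names and numbers are hypothetical. Stalls from external memory and channels are not modelled, so the real run time will be higher.

```python
from typing import Sequence, Tuple

# Each block is (latency_in_cycles, count), where count is the loop trip
# count (single work-item) or the number of work-items (NDRange).
Block = Tuple[int, int]


def min_pipeline_latency(blocks: Sequence[Block]) -> int:
    """Minimum pipeline latency: the sum of the block latencies."""
    return sum(latency for latency, _ in blocks)


def min_runtime_cycles(blocks: Sequence[Block]) -> int:
    """Minimum run time in cycles: each block's latency times its count."""
    return sum(latency * count for latency, count in blocks)


def cycles_to_seconds(cycles: int, fmax_mhz: float) -> float:
    """Convert cycles to seconds (assumes latencies are in clock cycles)."""
    if fmax_mhz <= 0:
        raise ValueError("Fmax must be positive")
    return cycles / (fmax_mhz * 1e6)


if __name__ == "__main__":
    # Made-up kernel: setup block, a loop body run 1000 times, exit block.
    blocks = [(10, 1), (250, 1000), (12, 1)]
    print(min_pipeline_latency(blocks))  # 272
    print(min_runtime_cycles(blocks))    # 250022
    print(cycles_to_seconds(min_runtime_cycles(blocks), fmax_mhz=250.0))
```

Treat the result as a floor to compare against measured run time; a large gap usually points to the stalls mentioned above.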