Forum Discussion

Occasional Contributor

6 years ago

How to add the number of work items in flight for the NDRange kernel?

Hi, since the NDRange kernel is implemented as a work-item-based pipeline on the FPGA, if I understand it correctly, the maximum number of work items in flight should be determined by the complexity (or stage) o...

hiratz

Occasional Contributor

6 years ago

Thank you, HRZ.

Actually, I did not compile this example code. I only read the description in Intel's "Best Practices Guide" of how hardware pipeline stages are generated for a given kernel. The guide provides many similar, simple examples to help people understand how pipeline parallelism is obtained.

I'm still curious why the single statement "c[gid] = a[gid]+b[gid];" alone leads the compiler to generate a pipeline on the order of 50 to 200 stages deep. The guide does not seem to mention such implicit stages. Could you provide more details?

HRZ

Frequent Contributor

6 years ago

Most operations on the FPGA have a latency of more than one cycle so that the design can reach a reasonable operating frequency. For external memory accesses in particular, the latency is on the order of a few hundred cycles. Generally, the compiler generates a pipeline deep enough to absorb the majority of the external memory stalls while also accommodating all the necessary operations at the targeted operating frequency (240 MHz by default). If you check the "System viewer" tab of the HTML report, you can find the latency of each block in your code and calculate the total pipeline depth by adding up all the latency values.
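To make the "add up the latencies" step concrete, here is a minimal sketch in C. The block names and cycle counts in `EXAMPLE_BLOCKS` are hypothetical placeholders, not values taken from any real report; you would replace them with the numbers shown for your own kernel. The helper names (`total_pipeline_depth`, `fill_time_ns`) are also made up for this illustration. The only figure taken from the discussion is the 240 MHz default target.

```c
#include <assert.h>
#include <stddef.h>

/* One entry per block listed in the report's system viewer. */
typedef struct {
    const char *name;        /* label of the block (hypothetical here) */
    unsigned latency_cycles; /* latency reported for that block */
} block_latency;

/* HYPOTHETICAL numbers for illustration only; read the real ones
 * from the HTML report of your own kernel. */
static const block_latency EXAMPLE_BLOCKS[] = {
    {"load a[gid]",  80},
    {"load b[gid]",  80},
    {"add",           4},
    {"store c[gid]", 36},
};
static const size_t EXAMPLE_BLOCK_COUNT =
    sizeof(EXAMPLE_BLOCKS) / sizeof(EXAMPLE_BLOCKS[0]);

/* Total pipeline depth = sum of the per-block latencies. */
unsigned total_pipeline_depth(const block_latency *blocks, size_t count) {
    unsigned depth = 0;
    for (size_t i = 0; i < count; ++i) {
        depth += blocks[i].latency_cycles;
    }
    return depth;
}

/* Time for the first work item to traverse a pipeline of the given
 * depth, assuming one stage per cycle and no stalls.
 * A frequency in MHz gives a cycle time of 1000 / f nanoseconds. */
double fill_time_ns(unsigned depth, double fmax_mhz) {
    assert(fmax_mhz > 0.0);
    return depth * 1000.0 / fmax_mhz;
}
```

With the placeholder values above, `total_pipeline_depth(EXAMPLE_BLOCKS, EXAMPLE_BLOCK_COUNT)` sums to 200 cycles. Assuming the pipeline accepts one new work item per cycle, that depth is also a rough upper bound on how many work items can be in flight at once, which is why a one-line kernel can still end up with a deep pipeline.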
