for loop pipelined with NDRange

Honored Contributor

8 years ago

So, by default, the number of compute units is one. Does that mean that when I use a local work-group size of 64x64, the FPGA loads 64x64 work-items but does not execute them at the same time? Are they executed one after another, similar to a for loop, but in random order?

Does the pipeline start a work-item only after the previous work-item has finished? Or do work-items execute partially overlapped, depending on the algorithm, just like a pipelined for loop?
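To make the two alternatives in this question concrete, here is a toy timing model in plain C. It is only a sketch of the arithmetic, not a description of what the compiler actually generates: the pipeline depth of 100 cycles and the initiation interval `ii` are hypothetical numbers I picked for illustration, and the function names are my own.

```c
#include <assert.h>
#include <stdint.h>

/* Case A: each work-item must finish before the next one starts,
   so every work-item pays the full pipeline depth. */
uint64_t cycles_sequential(uint64_t items, uint64_t depth) {
    return items * depth;
}

/* Case B: work-items overlap like iterations of a pipelined for loop.
   A new work-item enters every `ii` cycles while earlier ones are
   still in flight; only the first one pays the full depth. */
uint64_t cycles_pipelined(uint64_t items, uint64_t depth, uint64_t ii) {
    assert(ii > 0); /* an initiation interval of 0 is meaningless here */
    if (items == 0) {
        return 0;
    }
    return depth + (items - 1) * ii;
}
```

For a 64x64 work-group (4096 work-items) and the assumed depth of 100 cycles, case A would take 409600 cycles, while case B with `ii = 1` would take 4195 cycles. Which case applies, and what `ii` is achievable for a given kernel, is exactly what I am asking.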

I am also confused about SIMD and compute units, even though I have read the Best Practices Guide.

I know that when I set the number of compute units to 2, the compiler duplicates the kernel, so hardware memory usage doubles. So if I want to increase parallelism, I can keep duplicating the kernel up to the resource limit of the FPGA.

What is the difference between SIMD and compute units? Does SIMD also increase hardware memory usage? What happens if I increase SIMD too much?
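As a way of framing the SIMD versus compute unit question, here is an idealized throughput model in plain C. It assumes, hypothetically, that each compute unit accepts `simd` work-items per cycle and that all compute units run in parallel with no memory stalls. The function name, the depth value, and that assumption are mine, so treat it as a sketch rather than how the hardware is guaranteed to behave.

```c
#include <assert.h>
#include <stdint.h>

/* Idealized cycle count for `items` work-items when `compute_units`
   copies of the pipeline each accept `simd` work-items per cycle.
   `depth` is a hypothetical pipeline latency in cycles. */
uint64_t ideal_cycles(uint64_t items, uint64_t depth,
                      uint64_t compute_units, uint64_t simd) {
    assert(compute_units > 0 && simd > 0);
    if (items == 0) {
        return 0;
    }
    uint64_t per_cycle = compute_units * simd;
    /* Number of issue slots needed, rounded up. */
    uint64_t slots = (items + per_cycle - 1) / per_cycle;
    return depth + (slots - 1);
}
```

With 4096 work-items and an assumed depth of 100, this gives 4195 cycles for one compute unit without SIMD, 2147 cycles for either 2 compute units or SIMD 2, and 611 cycles for 2 compute units with SIMD 4. In this model the two options look identical, so the difference I am asking about must be in the hardware cost and in the limits on each setting, which the model does not capture.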
