The report always shows the number of compute units for NDRange kernels as 1. That is a bug I reported to Intel a long time ago, and they confirmed it. I don't think they have fixed it yet, though.
With respect to CL_DEVICE_MAX_COMPUTE_UNITS, that attribute reflects the number of physical compute units on the target device, which will always be 1 in the case of an FPGA. The compute units created using num_compute_units are logical compute units.
Finally, your code does not use any work-groups (there is no get_local_id()/get_group_id() in it) and hence it will not benefit from compute unit replication. This feature allows multiple work-groups to run in parallel, but your code only uses one work-group.
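To make the point about work-groups concrete, here is a rough sketch in plain C (not OpenCL, and not how the vendor's scheduler is actually implemented). It only models the fact that a work-group is dispatched whole to one compute unit, so the number of replicas that can receive work is capped by the number of work-groups. The function names `work_group_count` and `busy_compute_units` are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of work-groups produced by a 1-D NDRange launch.
   Assumes global_size is an exact multiple of local_size. */
size_t work_group_count(size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size % local_size == 0);
    return global_size / local_size;
}

/* Toy model (an assumption, not the real hardware scheduler):
   a work-group is never split across compute units, so at most
   min(groups, compute_units) replicas can be given work. */
size_t busy_compute_units(size_t groups, size_t compute_units)
{
    return groups < compute_units ? groups : compute_units;
}
```

Under this model, a launch with global size 1024 and local size 1024 is a single work-group, so even with num_compute_units(4) only one replica gets work. With local size 256 the same launch has four work-groups, and all four replicas could be used.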
Hi,
Is there any way by which we can get/request more than one physical compute unit on the underlying FPGA chip (say S10PAC)?
Thanks
- HRZ (Frequent Contributor), 4 years ago
What exactly are you trying to achieve by that? An FPGA design is not fixed, and the underlying FPGA architecture does not have any notion of a "compute unit"; "compute unit" is simply an OpenCL term which doesn't necessarily map to anything meaningful on an FPGA.
You can always compile and synthesize multiple kernels into one bitstream and run them in parallel in different queues, if that is what you are trying to achieve. There are also ways to automatically create/duplicate compute units in both Single Work-item and NDRange kernels.