Forum Discussion

New Contributor

7 years ago

[FPGA SDK for OpenCL] Problem with setting multiple compute units

I have recently been trying to compile an NDRange kernel with 4 compute units (by using the num_compute_units attribute); however, when I view the report, it says that the number of compute units is 1....

wwood10

New Contributor

7 years ago

Thanks for the help.

I tried using get_local_id()/get_group_id() in a new design (I have attached an image of its report); however, it still performs the same.

One strange thing I have noticed is that CL_DEVICE_MAX_WORK_ITEM_SIZES returns (0,17,52) and CL_DEVICE_MAX_WORK_GROUP_SIZE returns 2147483647. These numbers seem a bit strange to me.

For context, I run the kernel with clEnqueueNDRangeKernel(queue_, kernel_, 1, NULL, gSize_, wgSize_, 0, NULL, NULL); where wgSize_[3] = {WORK_ITEM_SIZE, 1, 1} and gSize_[3] = {BUFFER_SIZE, 1, 1}. I assume I do not need to enqueue a command for each work-group, right?
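As a rough sketch of how the launch parameters above translate into work-groups: with a one-dimensional NDRange, the runtime splits the global size into groups of the local size, so the number of work-groups is simply their ratio. The helper names below (num_work_groups, round_up_global_size) are hypothetical and not part of OpenCL. The sketch assumes OpenCL 1.x behaviour, where the global size must be an exact multiple of the local size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of work-groups in one NDRange dimension.
 * Assumes OpenCL 1.x rules, i.e. global_size is a multiple of local_size. */
size_t num_work_groups(size_t global_size, size_t local_size)
{
    assert(local_size > 0);
    assert(global_size % local_size == 0);
    return global_size / local_size;
}

/* Hypothetical helper: round a problem size up to the next multiple of
 * local_size so it can be used as a valid global size. The kernel would
 * then need a bounds check for the padded work-items. */
size_t round_up_global_size(size_t n, size_t local_size)
{
    assert(local_size > 0);
    return ((n + local_size - 1) / local_size) * local_size;
}
```

For example, if BUFFER_SIZE were 1024 and WORK_ITEM_SIZE were 64 (illustrative values, not taken from the post), a single clEnqueueNDRangeKernel call would launch 16 work-groups.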

HRZ

Frequent Contributor

7 years ago

No, you don't need a separate queue for each work-group; everything is handled automatically. How many work-groups are you using? The guide recommends at least 3x more work-groups than compute units to see a reasonable performance benefit. Furthermore, if your application is memory unfriendly (e.g., random memory accesses) or one compute unit already saturates the memory bandwidth, you are not going to see any performance benefit from using multiple compute units.
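The work-group recommendation can be checked on the host before launching. The following is a minimal sketch, reading "3x more" as "at least three times as many"; the function name enough_work_groups is hypothetical, and the sizes in the usage note are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when a 1D launch produces at least
 * three work-groups per compute unit, the rule of thumb quoted above. */
bool enough_work_groups(size_t global_size, size_t local_size,
                        size_t compute_units)
{
    assert(local_size > 0);
    size_t groups = global_size / local_size; /* work-groups in the launch */
    return groups >= 3 * compute_units;
}
```

With 4 compute units, a global size of 1024 and a local size of 64 give 16 work-groups, which meets the threshold of 12. A global size of 256 gives only 4 work-groups, so little benefit from the extra compute units would be expected. Passing this check does not help if the kernel is already limited by memory bandwidth.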
