HRZ
Frequent Contributor
3 years ago
Work-items in the same work-group do not run in parallel; they are pipelined. To get work-item-level parallelism, you need to use the SIMD attribute, num_simd_work_items().
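As a rough sketch of the SIMD attribute, assuming the Intel FPGA SDK for OpenCL attribute syntax: the kernel name, its arguments, the work-group size of 64, and the SIMD width of 4 are all made up for illustration. As far as I know, num_simd_work_items() has to be paired with reqd_work_group_size(), and the SIMD width has to divide the work-group size evenly.

```c
// Hypothetical vector-add kernel; names and sizes are illustrative only.
// reqd_work_group_size fixes the work-group size at compile time, which
// the compiler needs before it can vectorize across work-items.
__attribute__((reqd_work_group_size(64, 1, 1)))
// Ask for 4 work-items to be processed in parallel (SIMD) per pipeline stage.
__attribute__((num_simd_work_items(4)))
__kernel void vec_add_simd(__global const float *restrict a,
                           __global const float *restrict b,
                           __global float *restrict c)
{
    size_t gid = get_global_id(0);
    c[gid] = a[gid] + b[gid];
}
```

The host would then need to launch this with a local size of 64 and a global size that is a multiple of 64.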
Multiple work-groups are also automatically pipelined one after the other inside the same compute unit, and the compiler will replicate local memory buffers inside your kernel to accommodate this. If you want work-group-level parallelism, you need to use the num_compute_units() attribute.
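A similar sketch for num_compute_units(), again assuming the Intel FPGA SDK for OpenCL attribute syntax, with a made-up kernel and an arbitrary count of 2. Each compute unit is a full copy of the kernel pipeline, so area usage grows roughly with the count, and different work-groups can be dispatched to different copies.

```c
// Hypothetical kernel; the name, arguments, and the value 2 are illustrative.
// Request two copies of the whole kernel pipeline so that two work-groups
// can be in flight in parallel, one per compute unit.
__attribute__((num_compute_units(2)))
__kernel void vec_add_cu(__global const float *restrict a,
                         __global const float *restrict b,
                         __global float *restrict c)
{
    size_t gid = get_global_id(0);
    c[gid] = a[gid] + b[gid];
}
```

This only helps if the launch actually has more than one work-group, so the global size should be at least twice the local size here.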