Forum Discussion
Altera_Forum
Honored Contributor
8 years ago
Increasing the work-group size will not have much effect on performance because it does not increase parallelism; at best it brings the pipeline closer to its peak performance. To increase parallelism, use the SIMD attribute (num_simd_work_items). The number of compute units is one by default; you can raise it with the num_compute_units attribute. All of these attributes are described in Altera's OpenCL SDK documents.
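As a rough illustration of how these attributes are attached to a kernel, here is a minimal sketch. The kernel name vec_add, its arguments, and the values 64, 4 and 2 are assumptions picked for the example and do not come from the original question. The plain-C branch at the top is only a hypothetical stand-in so the kernel body can be sanity-checked with an ordinary host compiler; the OpenCL compiler skips it.

```c
/* Minimal sketch; kernel name, arguments and attribute values are
   illustrative assumptions, not taken from the original question. */
#ifndef __OPENCL_VERSION__
/* Plain-C stand-ins (hypothetical) so the kernel body can be checked on a
   host compiler. An OpenCL compiler defines __OPENCL_VERSION__ and skips this. */
#include <stddef.h>
#define __kernel
#define __global
#define KERNEL_ATTRIBUTES
static size_t host_gid = 0;  /* emulated work-item id, set by the caller */
static size_t get_global_id(unsigned dim) { (void)dim; return host_gid; }
#else
/* num_simd_work_items needs a fixed work-group size that it divides evenly,
   hence reqd_work_group_size. num_compute_units replicates the whole pipeline. */
#define KERNEL_ATTRIBUTES                                \
    __attribute__((reqd_work_group_size(64, 1, 1)))      \
    __attribute__((num_simd_work_items(4)))              \
    __attribute__((num_compute_units(2)))
#endif

KERNEL_ATTRIBUTES
__kernel void vec_add(__global const float *a,
                      __global const float *b,
                      __global float *c)
{
    /* One element per work-item; with SIMD the compiler widens this datapath. */
    size_t i = get_global_id(0);
    c[i] = a[i] + b[i];
}
```

With reqd_work_group_size set, the host has to launch the kernel with a matching local work size (64 here) and a global size that is a multiple of it. Both SIMD and extra compute units cost FPGA area, so the values are worth tuning against the resource report.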
The two kernel snippets you have used should be implemented in a very similar fashion by the compiler and give very similar performance. If the performance is 100x different, I cannot explain it.