Forum Discussion
Altera_Forum
Honored Contributor
8 years ago1) Alright, I totally understand the difference between num_simd_work_items and num_compute_units. What I don't understand is, how SIMDs are being implemented to achieve parallelism and low resource consumption. By low I mean really really low. I barely see increase or decrease in area by playing with the value of num_simd_work_items. That's why I came up with the conclusion that num_compute_units achieves real parallelism and num_simd_work_items just interleaves work item one after the other.
2) Can you elaborate more on Barrier implementation? I believe even after all workitems in the workgroup hit the barrier, then they should do their write operations one after the other. I doubt after hitting barrier all 256 workitems in the workgroup can execute their write instruction in the OpenCL code.