Forum Discussion
Altera_Forum
Honored Contributor
8 years agoIn NDRange, threads are pipelined, in single work-item, loop iterations are pipelined. The former still has a scheduler and could issue threads out-of-order, the latter does not.
In the normal case, there will be only one pipeline in an NDRange kernel, with all the threads from all work-groups being pipeline based on the order decided by the scheduler, and two threads will NEVER be issued at the same clock, even though multiple threads can be active in different stages of the pipeline at each clock. Still, it is also possible to achieve "thread-level parallelism" by employing the SIMD or kernel pipeline replication attributes; these two attributes will allow two or more threads from the same or different work-groups to be issued onto different copies of the pipeline at the same clock. Disclaimer: This is my personal understanding of how the compiler works, the reality could actually be different.