Forum Discussion
Altera_Forum
8 years ago
To put it simply (and perhaps somewhat inaccurately), there is no thread-level parallelism on the FPGA unless you use SIMD. In other words, as you said, there will be only one adder on the FPGA, with work-items "looping" over that adder. The adder is pipelined, of course, so multiple work-items (threads) can occupy different stages of it at the same time; the speed-up comes from pipelining rather than from thread-level parallelism. With SIMD applied to the kernel, however, there will be multiple adders, and multiple work-items will actually run in parallel. I recommend reading the first section of the "Intel® FPGA SDK for OpenCL Best Practices Guide" for more information on this specific subject.
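To make the difference between pipelining and SIMD concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch under idealized assumptions: one group of work-items enters the pipeline every clock cycle, and there are no stalls and no memory bottlenecks. The function names, the pipeline depth of 10 and the SIMD width of 4 are made up for illustration and do not come from the SDK. In the Intel SDK, SIMD is requested through the `num_simd_work_items` kernel attribute.

```python
def unpipelined_cycles(num_work_items: int, pipeline_depth: int) -> int:
    """Cycles if each work-item had to finish before the next one starts."""
    return num_work_items * pipeline_depth


def pipelined_cycles(num_work_items: int, pipeline_depth: int,
                     simd_width: int = 1) -> int:
    """Idealized cycle count for a pipelined kernel (hypothetical model).

    Assumes one group of `simd_width` work-items enters the pipeline per
    clock cycle and that the pipeline never stalls.
    """
    if pipeline_depth < 1 or simd_width < 1:
        raise ValueError("pipeline_depth and simd_width must be >= 1")
    if num_work_items <= 0:
        return 0
    # Number of groups that have to be issued, rounded up.
    issue_slots = (num_work_items + simd_width - 1) // simd_width
    # The first group takes `pipeline_depth` cycles to come out; each
    # following group finishes one cycle after the previous one.
    return pipeline_depth + issue_slots - 1


if __name__ == "__main__":
    n, depth = 1024, 10  # illustrative values only
    print("no pipelining:    ", unpipelined_cycles(n, depth))
    print("pipelined, SIMD 1:", pipelined_cycles(n, depth))
    print("pipelined, SIMD 4:", pipelined_cycles(n, depth, simd_width=4))
```

In this model a single pipelined adder already brings the cost close to one cycle per work-item, and SIMD then divides the number of issue slots by the vector width, at the price of replicating the adder. Real kernels will usually fall short of this because of memory bandwidth and stalls.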