Altera_Forum
8 years ago
Thanks for pointing out what to read.
So if I use the vector_add example with, e.g., 1024 elements and put the following attributes on the kernel: __attribute__((num_simd_work_items(4))) __attribute__((reqd_work_group_size(256,1,1))) __kernel void vector_add ...
would there be 4 kernels running in parallel, with 256 work-items processed in each of them? Does the SIMD factor * work_group_size need to equal the number of elements? In the Intel video guide "Writing OpenCL Programs for Intel FPGAs", the execution model is divided into single work-item execution (with loop pipelining) and NDRange kernels. So if I want to use data-parallel NDRange execution, do I have to work with clEnqueueNDRangeKernel + SIMD?
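To make the numbers in the size question concrete, here is a small host-side sketch in plain C (not OpenCL API code). It assumes, as I understand the programming guide, that the global size equals the number of elements, that the local size passed by the host must match reqd_work_group_size, and that the SIMD factor must evenly divide the work-group size. The helper names num_work_groups and simd_factor_fits are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: number of work-groups launched when global_size
   work-items are split into work-groups of local_size work-items.
   Returns 0 if global_size is not a multiple of local_size. */
static size_t num_work_groups(size_t global_size, size_t local_size)
{
    if (local_size == 0 || global_size % local_size != 0)
        return 0;
    return global_size / local_size;
}

/* Hypothetical helper: returns 1 if the SIMD factor evenly divides the
   work-group size (the assumption stated in the lead-in), else 0. */
static int simd_factor_fits(size_t local_size, size_t simd)
{
    return simd != 0 && local_size % simd == 0;
}
```

With 1024 elements and a work-group size of 256, num_work_groups(1024, 256) gives 4 work-groups, and simd_factor_fits(256, 4) holds. So under these assumptions the relation would be global size = elements and work-group size divisible by the SIMD factor, rather than SIMD factor * work_group_size = elements.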
How can I "tell" the kernel that I want to use it as a single work-item execution? (The video says with an NDRange of (1,1,1), but I didn't manage to get it to work that way.) What's the difference between the single work-item execution, which is set with an NDRange of (1,1,1), and the execution from the vector_add example? Sorry for asking again, I tried to figure it out the entire day but couldn't quite succeed :/ Thanks :)