Forum Discussion
Altera_Forum
9 years ago
Thanks.
It seems there is some basic behavior that I don't understand. At compile time, the compiler does not know how many instances of a kernel I will launch, since that is declared only in the call to clEnqueueNDRangeKernel(), through the global and local sizes. The only clue the compiler has is the kernel attribute __attribute__((reqd_work_group_size(X, Y, Z))), and that is not mandatory. I therefore understood that the actual wiring takes place when clEnqueueNDRangeKernel() is called.

I am working on an image processing project, and I am concerned about how long the wiring/launching takes (fractions of a second?) and whether the wiring will be repeated on every iteration of a loop like this:

for (i = ....) {
    CPU writes image to InputBuffer
    clSetKernelArg(...&InputBuffer)
    clSetKernelArg(...&OutputBuffer)
    clEnqueueNDRangeKernel(... global size, local size, ...)
    clFinish()
    clEnqueueMapBuffer(...OutputBuffer)
    CPU reads processed image
}

In such a scenario, when will the actual wiring occur, and will it occur only once? Is there a better program structure for my needs?

Thanks.