Kernel execution time significantly higher than kernel profile time
I am seeing unexpected behavior when using clWaitForEvents or clFinish. The following is the structure of my host code during kernel launch:

    for (i = 0; i < 10; i++) {
        Set kernel Args ... ....