Altera_Forum
8 years ago
I have previously used clGetEventProfilingInfo() and compared the timings it reported against a high-precision timer running on the host, and the values matched. However, I used clFinish() for synchronization in that case.
I have also used clWaitForEvents() to synchronize kernels running in separate queues, and it seems to work fine for that purpose. You can try replacing clWaitForEvents() with clFinish() and see whether you get different results.

Note that there is one common mistake when using OpenCL's standard profiler: if you enqueue a kernel in a for loop and do not use a separate event for each kernel call, the event gets overwritten on every iteration, and the timing you read at the end reflects only the last kernel execution.