Kernel execution time
In my program, the reported execution time of the kernel is greater than the execution time of the entire program.
In the output (attached screenshot), 'Execution time of kernel' is measured using 'clGetEventProfilingInfo',
and 'Execution time' is measured using clock_t, similar to this link; it is meant to cover the total time taken by the main function.
This issue occurs only when I run my code on DevCloud. When I run it on my PC, Execution time > Kernel execution time, as expected.
Why is this happening?
int main() {
    . . .
    start = clock();
    . .
    err = clEnqueueNDRangeKernel(queue, multiply_ker, 1, NULL, &global, &local, 0, NULL, &event);
    clWaitForEvents(1, &event);
    clFinish(queue);
    . . .
    end = clock();
}
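It is possible that clock() is not giving you a reliable elapsed time on DevCloud, for example if the OS is reading the CPU clock incorrectly or the value of "CLOCKS_PER_SEC" does not match the clock actually being read. Note also that clock() reports the processor time consumed by your process rather than wall-clock time, so time the host spends blocked while the kernel runs on the device may not be counted at all. You can try the high-precision "clock_gettime" function to see if it makes a difference. You can find the function information here: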
https://linux.die.net/man/3/clock_gettime
And an example implementation here:
https://github.com/zohourih/FPGAMemBench/blob/master/common/timer.h
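As a rough sketch of that suggestion (not taken from the linked timer.h), the helpers below read wall-clock time with clock_gettime and CPU time with clock() so the two can be compared. The names wall_seconds, cpu_seconds and blocking_wait_ms are made up for illustration, and blocking_wait_ms is only a stand-in for a blocking call such as clWaitForEvents, assuming the host thread sleeps while it waits.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* expose clock_gettime and nanosleep */
#endif
#include <assert.h>
#include <time.h>

/* Wall-clock seconds read from a monotonic clock (hypothetical helper). */
static double wall_seconds(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);              /* CLOCK_MONOTONIC should be available */
    (void)rc;
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* CPU seconds consumed by this process, which is what clock() reports. */
static double cpu_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

/* Stand-in for a blocking wait (assumption: the host thread sleeps
   instead of spinning, so it uses almost no CPU time while waiting). */
static void blocking_wait_ms(long ms)
{
    struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&req, NULL);
}
```

In the program from the question, this would mean calling wall_seconds() where start = clock() and end = clock() are now and printing end - start. If the wall-clock figure comes out larger than the kernel time while the clock() figure stays smaller, the difference is most likely time the host spent waiting rather than computing.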