Different Kernel Executions times

Question

Hi!

I'm getting a strange thing when executing a kernel multiples times (inside a loop). The first execution always take a long time compared to the others..

Example:

Number of calls: 10

1st call: 3,2 seconds

other calls: around 0,013 seconds

The buffers and its size are always the same..

What can be happening?

altera_forum · Answer

Please post the section of your host code that measures the kernel execution times.

altera_forum · Answer

--- Quote Start ---  Please post the section of your host code that measures the kernel execution times.  --- Quote End ---    Sorry HRZ, here it is: #include "timing.h"# include &lt;Windows.h&gt;
double get_wall_time(){
    LARGE_INTEGER time,freq;
    if (!QueryPerformanceFrequency(&amp;freq)){
        //  Handle error
        return 0;
    }
    if (!QueryPerformanceCounter(&amp;time)){
        //  Handle error
        return 0;
    }
    return (double)time.QuadPart / freq.QuadPart;
}
-----------------------------------------------------------------------------
runKernerl(...){
/* Set Kernel Arguments */
	for(i=0; i &lt; num_arguments;i++)
	status = clSetKernelArg(kernel, i, sizeof(cl_mem), &amp;buffer);
	/* Run kernel the kernel */
	status = clEnqueueTask(cmdqueue,kernel,0,NULL,NULL);
	checkError(status, "Failed to launch kernel");
	/* Wait for command queue to complete pending events */
	status = clFinish(cmdqueue);
	/* Read the device output buffer to the host output array */
	checkError(status, "Failed to finish");
}
-----------------------------------------------------------------------------
ini_kernel_bi = get_wall_time();
runKernel(context, cluster_kernel, cmd_queue, 6, 0, NULL, buffers, NULL , NULL);
end_kernel_bi = get_wall_time();
printf("Time:%f", end_kernel_bi - ini_kernel_bi);

altera_forum · Answer

Try moving clSetKernelArg and checkError outside of the timing region and only time clEnqueueTask and clFinish.

You can also use OpenCL's built-in profiler that allows you to accurately measure kernel execution time, and see if you would still see any variance in the run time.

altera_forum · Answer

--- Quote Start ---

Try moving clSetKernelArg and checkError outside of the timing region and only time clEnqueueTask and clFinish.

You can also use OpenCL's built-in profiler that allows you to accurately measure kernel execution time, and see if you would still see any variance in the run time.

--- Quote End ---

Do you have any profiling tool to MS VS2012? or any reliable function to measure the executions times, i said this because im not confident about the function i encounter to measure the times.

altera_forum · Answer

The function you are using is a high-precision timer. I personally also use the same function on Windows. This function provides accurate time measurement up to a few microseconds or maybe even less.

The documentation for OpenCL's built-in profiler is here:

https://www.khronos.org/registry/opencl/sdk/1.0/docs/man/xhtml/clgeteventprofilinginfo.html

Forum Discussion

Different Kernel Executions times

7 Replies

Recent Discussions

Need a license for Encrypting - Quartus Prime Lite

Agilex 5 – Critical HSSI Error in JESD204B Example Design

recovery timing issue

Once again about CTRL+L

timing signoff