Intel FPGA SDK for OpenCL: Issue while launching the same OpenCL kernel multiple times
Hi,
I am using the Intel FPGA SDK for OpenCL to perform matrix multiplication on a DE1-SoC board. I need to perform this multiplication multiple times, so I enqueue the kernel in a loop. The first kernel launch completes successfully, but the second one gets stuck in the CL_RUNNING state indefinitely. To narrow down the problem, I simplified my code and removed all computations from the kernel, as below:
__kernel void multiplication()
{
//Empty kernel
}
Instead of using a loop, I am enqueueing the kernel twice, as below:
cl_int err;
size_t global_work_size[] = {static_cast<size_t>(1)};
size_t local_work_size[] = {static_cast<size_t>(1)};
cl_event kernel_event1;
// Enqueue the kernel for execution
std::cout << "started enqueue" << std::endl;
err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, global_work_size, local_work_size, 0, NULL, &kernel_event1);
if (err != CL_SUCCESS) {
    std::cout << "Failed to enqueue" << std::endl;
} else {
    std::cout << "Done enqueue" << std::endl;
}
err = clWaitForEvents(1, &kernel_event1);
if (err != CL_SUCCESS) {
    std::cerr << "Error waiting for kernel event." << std::endl;
} else {
    std::cout << "done executing the kernel" << std::endl;
    clReleaseEvent(kernel_event1);
}
// Second kernel execution
cl_event kernel_event2;
// Enqueue the kernel for execution
std::cout << "started enqueue" << std::endl;
err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, global_work_size, local_work_size, 0, NULL, &kernel_event2);
if (err != CL_SUCCESS) {
    std::cout << "Failed to enqueue" << std::endl;
} else {
    std::cout << "Done enqueue" << std::endl;
}
cl_int event_status;
clGetEventInfo(kernel_event2, CL_EVENT_COMMAND_EXECUTION_STATUS, sizeof(event_status), &event_status, NULL);
if (event_status == CL_QUEUED) {
    printf("Kernel is queued.\n");
} else if (event_status == CL_SUBMITTED) {
    printf("Kernel is submitted.\n");
} else if (event_status == CL_RUNNING) {
    printf("Kernel is running.\n");
} else if (event_status == CL_COMPLETE) {
    printf("Kernel has completed.\n");
} else {
    printf("Unknown status.\n");
}
err = clWaitForEvents(1, &kernel_event2);
if (err != CL_SUCCESS) {
    std::cerr << "Error waiting for kernel event." << std::endl;
} else {
    std::cout << "done executing the kernel" << std::endl;
    clReleaseEvent(kernel_event2);
}
In the simplified code above, the kernel performs no computation; I am only trying to launch the same kernel a second time after the first launch has completed successfully. During execution, the debug output shows that the first execution completes and that the second launch is enqueued successfully. However, the status of the second launch is printed as CL_RUNNING, and the program then waits indefinitely at clWaitForEvents without returning any error for the wait.
I would highly appreciate it if someone could help me understand this issue.
Thank you.
Hi @chandrasekhar92,
I noticed that a custom BSP seems to be involved. Is it correct to say that you are not following any reference design and that the sample code is self-written?
Having gone through the situation: from previous experience, using clWaitForEvents to synchronize kernels in separate queues does work. I would recommend trying clFinish(), which serves a similar purpose, to see if that works.
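If the second launch never completes, clFinish() may block in the same way clWaitForEvents does, so while experimenting it can help to poll the event status with a deadline and report a hang instead of waiting forever. Below is a minimal sketch of that idea, not code from the SDK. The names wait_with_timeout, WaitResult and get_status are hypothetical; in real host code get_status would wrap clGetEventInfo(kernel_event2, CL_EVENT_COMMAND_EXECUTION_STATUS, ...). The sketch relies only on the OpenCL convention that CL_COMPLETE is 0 and that negative status values indicate an error.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Outcome of a bounded wait on an event.
enum class WaitResult { Completed, Failed, TimedOut };

// Polls get_status until it reports completion, an error, or the timeout expires.
// get_status is a hypothetical callable; in host code it would wrap
// clGetEventInfo(event, CL_EVENT_COMMAND_EXECUTION_STATUS, ...).
// OpenCL convention: CL_COMPLETE == 0, negative values are errors,
// positive values (CL_RUNNING, CL_SUBMITTED, CL_QUEUED) mean "not done yet".
inline WaitResult wait_with_timeout(
    const std::function<int()>& get_status,
    std::chrono::milliseconds timeout,
    std::chrono::milliseconds poll_interval = std::chrono::milliseconds(1)) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        const int status = get_status();
        if (status == 0) return WaitResult::Completed;  // CL_COMPLETE
        if (status < 0) return WaitResult::Failed;      // abnormal termination
        if (std::chrono::steady_clock::now() >= deadline) return WaitResult::TimedOut;
        std::this_thread::sleep_for(poll_interval);
    }
}
```

In the posted code this would stand in for the second clWaitForEvents call, with a lambda that calls clGetEventInfo and returns event_status. It is probably worth calling clFlush(queue) before polling, since querying an event does not by itself submit queued commands. A TimedOut result would at least confirm that the kernel is stuck rather than slow.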
Also, in the situation described, it seems that the desired execution flow is out of order; the links below explain that:
Note: the linked examples use a GPU as the hardware; however, the execution concept is the same.
Best Wishes
BB