Forum Discussion
Altera_Forum
Honored Contributor
8 years agoHi,
I think clEnqueueNDRangeKernel() is a non-blocking function. But I met a more specific problem. I want to implement collaborative computing on CPU-FPGA using OpenCL. For RANSAC algorithem I used data partitioning and the process of CPU and FPGA is independent. Now I want CPU and FPGA to execute RANSAC in parallel. How can I realize it? If I use: ...... clStatus = clEnqueueNDRangeKernel(clCommandQueue,......); clFinish(clCommandQueue); cpu_thread.join(); ...... then I think, it will be blocking. Is it right? Because clFinish dose not return until all queued commands in clCommandQueue have been processed and completed. But I want CPU and FPGA to execute RANSAC in parallel, then I tried to remove the clFinish, but I got a more longer total execution time of CPU and FPGA (than using clFinish). And I also tried to use clFlush instead of clFinish, and I also got a more longer total execution time. I mean, I got the shortest execution time using clFinish but I do not know why. And How can I realize paralleled processing? Who can help me? Thanks in advance. ps: The execution time is just refers to the time of processing RANSAC, not including other time such as transmission time. The results of all the three implementations mentioned above are right and now I just focus on the time.