Altera_Forum
Honored Contributor
10 years ago
Pinned Memory and Host-Device Communication
Hello,
I'm experimenting with OpenCL 1.2 on a Bittware S5PHQ board. I have a task-based kernel that operates on a full input dataset (a sequence of discrete sub-datasets, if you will), provided via a 'global' memory buffer. The kernel also receives a 'global' results buffer to populate. The data has to be processed sequentially, and the result for each sub-dataset is of interest to the host. For efficiency reasons I can't start and stop a kernel instance for each sub-dataset, hence a single kernel that runs through the full dataset.

I have two related scenarios that I'd appreciate some clarification on.

1) I want the host to see the results as they are generated rather than wait for the kernel to finish processing the full dataset. I looked at using CL_MEM_ALLOC_HOST_PTR with enqueueMapBuffer for this and allocated the output buffer with CL_MEM_ALLOC_HOST_PTR. The kernel writes each sub-dataset's result into this buffer. I expected the host to be able to map this pinned memory periodically to read the results. However, once the kernel has started, a host call to enqueueMapBuffer(cl_buf_output, CL_TRUE, ...) waits indefinitely, i.e. the mapping doesn't complete and the host never gets a chance to read the mapped memory.

The same code works on the CPU with Intel OpenCL support (i.e. the host detects the 'CPU' device and runs the kernel on the CPU). The kernel and the host share a 'last result index', and every time the kernel has processed a sub-dataset and generated a new result, the host can use this index to read the new result from the output buffer. So why am I not observing this behavior on the FPGA? I thought the host could map pinned memory (allocated with CL_MEM_ALLOC_HOST_PTR) at any point and expect the latest data to be transferred from the device. Short of waiting for the kernel to finish, how can the host 'see' the results, or any other changes to pinned memory, on the FPGA?

2) The second scenario is related to the first. I want the kernel to be able to see a change made by the host. At any point in time, I'd like to be able to shut down the kernel by setting a value in pinned memory, which the kernel checks every so often (like a shutdown flag). I expected the kernel to see changes made by the host to that pinned memory. However, my observation is the same as above: once the kernel is running, when the host tries to map the pinned memory to set the new value, the enqueueMapBuffer() call hangs indefinitely. Again, as in the first scenario, this works on the CPU with Intel OpenCL.

Why does this work on the CPU and not on the FPGA? Is what I'm looking for the same as shared virtual memory (an OpenCL 2.0 feature)?

Apologies for the long post. Thanks much in advance for any help.
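To make the handshake concrete, here is a minimal sketch of the protocol both scenarios rely on, written with plain C++ threads rather than OpenCL. It is only a model: `SharedState`, `fake_kernel` and `run_host` are hypothetical names, the doubling is a placeholder for the real processing, and the atomics stand in for memory that host and device both see coherently, which is exactly the property I'm unsure about on the FPGA.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the memory shared by host and kernel.
struct SharedState {
    std::vector<int> results;                // models the 'global' results buffer
    std::atomic<int> last_result_index{-1};  // newest valid result, -1 if none yet
    std::atomic<bool> shutdown{false};       // set by the host to stop the kernel
    explicit SharedState(std::size_t n) : results(n, 0) {}
};

// Models the single long-running task kernel: one result per sub-dataset.
void fake_kernel(SharedState& s, const std::vector<int>& input) {
    for (std::size_t i = 0; i < input.size(); ++i) {
        // Scenario 2: check the host's shutdown flag every iteration.
        if (s.shutdown.load(std::memory_order_acquire)) return;
        s.results[i] = input[i] * 2;  // placeholder for the real processing
        // Scenario 1: publish the result only after it has been written.
        s.last_result_index.store(static_cast<int>(i), std::memory_order_release);
    }
}

// Models the host: consumes results as they appear and returns those it saw.
// stop_after < 0 means "never request shutdown".
std::vector<int> run_host(const std::vector<int>& input, int stop_after) {
    SharedState s(input.size());
    std::thread kernel(fake_kernel, std::ref(s), std::cref(input));

    std::vector<int> seen;
    int next = 0;
    const int total = static_cast<int>(input.size());
    while (next < total) {
        const int last = s.last_result_index.load(std::memory_order_acquire);
        for (; next <= last; ++next) seen.push_back(s.results[next]);
        if (stop_after >= 0 && next >= stop_after) {
            s.shutdown.store(true, std::memory_order_release);
            break;
        }
        std::this_thread::yield();
    }
    kernel.join();  // returns once the kernel finished or observed the flag
    assert(seen.size() <= input.size());
    return seen;
}
```

Calling `run_host(input, -1)` reads every result while the "kernel" is still running, and `run_host(input, 2)` requests shutdown once at least two results have been seen. This is the behavior I get on the CPU device and would like to reproduce on the FPGA.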