Forum Discussion
Altera_Forum
Honored Contributor
8 years ago
This is a suggestion rather than a confirmed fix, and I am not sure it is the actual issue, but you could try CL_MEM_COPY_HOST_PTR instead of CL_MEM_USE_HOST_PTR. With CL_MEM_COPY_HOST_PTR, each cl_mem object gets its own copy of the input data, rather than all of them pointing to the same chunk of allocated memory. Also try adding the 'restrict' qualifier to the global pointer arguments of your kernel, which tells the compiler that no other pointer is used to access or modify the same data. Since the buffers are then no longer mapped to your host memory, you will need an enqueue read buffer call (clEnqueueReadBuffer) to get the data back out.
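To illustrate the 'restrict' part of the suggestion, here is a minimal sketch of a single work-item kernel with restrict-qualified global pointers. The kernel name `scale`, its arguments, and the computation are made up for illustration and are not from your code. The `#ifndef` block is only an assumption of mine so that the same source also builds as plain C99 for a quick check on the host.

```c
/* Illustration only: a hypothetical kernel, not the original poster's code.
 * When built as plain C99 (no OpenCL compiler), strip the OpenCL qualifiers
 * so the function body can be checked on the host. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#endif

/* 'restrict' promises the compiler that 'in' and 'out' never alias,
 * so it does not have to assume a dependency between loads and stores. */
__kernel void scale(__global const float *restrict in,
                    __global float *restrict out,
                    const int n)
{
    /* Single work-item style: one loop over all n elements. */
    for (int i = 0; i < n; ++i) {
        out[i] = 2.0f * in[i];
    }
}
```

Note that the promise is only valid if the two buffers really are separate, which is another reason to give each cl_mem object its own copy of the data.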
In my experience, mapped buffers seem to do the same thing as write buffers and read buffers, other than the timing of when the kernel reads/writes over PCIe, although mapping does seem to work well for utilizing pinned memory on GPUs. I am curious whether there is an inherent (but unintentional) memory dependency on the global memory.