Altera_Forum
Honored Contributor
10 years ago
Bad Performance of Host to OpenCL Memory Transfers
I'm just getting started with the Socrates II SoC Cyclone V board. After trying a few basic examples, I noticed very low performance for an integer division test: it was almost as slow as on the ARM CPU. On closer inspection, the kernel itself executes very quickly, and it's the transfer from host memory buffers to OpenCL buffers that takes almost all the time. This is especially confusing since the ARM cores and the FPGA share the same physical memory on the Socrates II. Here's some data:
Host buffer to OpenCL buffer transfer rate: ~30 MB/s
Memory usage/throughput of the division kernel (may not be memory bound): ~1.5 GB/s
Nominal memory speed: 3.2 GB/s

Has anyone else encountered this problem?

Secondly, is it possible to set up shared memory between the FPGA and the ARM cores? They share the same physical memory, but I don't know of any OpenCL feature for doing such a thing. Maybe some Altera extension?

Thanks in advance for any help.