Forum Discussion
To answer the original question, this feature is very useful for "out-of-core processing", i.e., processing data that is too big to fit in the FPGA's external memory but can fit in host memory. There is a large body of work in HPC and Big Data using GPUs where overlapping/pipelining of compute and PCIe transfers is implemented using double buffering in GPU memory. For applications that can be "streamed", host channels on FPGAs can be used to implement out-of-core processing efficiently without the need for double buffering. However, for applications that cannot be streamed, this feature is not applicable and double buffering has to be used, as is done on GPUs.
Could you provide a link to an example of this double buffering mechanism?
- HRZ (Frequent Contributor), 6 years ago
I do not know of any such example that you can directly use right now. You might be able to find something if you search on Google, especially if you look for CUDA code used for out-of-core processing.