Data level parallelism on FPGA with kernel replication using oneAPI

Hi, I'm playing with kernel replication on FPGA using oneAPI. There is a tutorial on kernel replication here, but it is exploiting pipeline parallelism whereas I want to exploit data-level parallel...

aikeu
3 years ago
Hi asenjo,

Sorry for the late reply. I consulted a member of my team about your question. In the code you posted, the buffers go out of scope at the end of the VectorAdd() function, so each call has to wait for its kernel to finish before it returns, and the kernels are serialized instead of running in parallel.

The main() would look something like this:
// Create the buffers in main() so that they outlive both kernel launches.
buffer a_buf1{a_vector.begin()+begin1, a_vector.begin()+end1};
buffer b_buf1{b_vector.begin()+begin1, b_vector.begin()+end1};
buffer sum_buf1{sum_parallel.begin()+begin1, sum_parallel.begin()+end1};

// Second range. begin2 is assumed to mark the start of this range;
// the snippet as first posted repeated begin1 here.
buffer a_buf2{a_vector.begin()+begin2, a_vector.begin()+end2};
buffer b_buf2{b_vector.begin()+begin2, b_vector.begin()+end2};
buffer sum_buf2{sum_parallel.begin()+begin2, sum_parallel.begin()+end2};

// Submit both kernels before waiting on either of them.
auto e0 = VectorAdd<true,0,2,4>(q, a_buf1, b_buf1, sum_buf1);
auto e1 = VectorAdd<true,1,2,4>(q, a_buf2, b_buf2, sum_buf2);
q.wait();
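
The scoping effect described above can be reproduced on the host without an FPGA toolchain. The sketch below is only an analogy in standard C++, not SYCL code: a std::future returned by std::async waits in its destructor, which stands in here for a buffer that goes out of scope inside VectorAdd(). All names (launch_add, add_scoped, add_caller_owned, event_log) are hypothetical and do not come from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <mutex>
#include <string>
#include <vector>

// Host-only analogy (NOT SYCL): a std::future obtained from std::async blocks
// in its destructor until the task finishes. All names here are hypothetical.

std::mutex log_mutex;
std::vector<std::string> event_log;  // order in which tasks start and finish

void record(const std::string& what) {
    std::lock_guard<std::mutex> lock(log_mutex);
    event_log.push_back(what);
}

// Launch sum[i] = a[i] + b[i] for i in [begin, end) on a worker thread.
std::future<void> launch_add(int id, const std::vector<int>& a,
                             const std::vector<int>& b, std::vector<int>& sum,
                             std::size_t begin, std::size_t end) {
    return std::async(std::launch::async, [id, &a, &b, &sum, begin, end] {
        record("start" + std::to_string(id));
        for (std::size_t i = begin; i < end; ++i) sum[i] = a[i] + b[i];
        record("end" + std::to_string(id));
    });
}

// Serialized variant: the handle is local, so its destructor waits for the
// task before the function returns (the role played by buffers that are
// created inside VectorAdd()).
void add_scoped(int id, const std::vector<int>& a, const std::vector<int>& b,
                std::vector<int>& sum, std::size_t begin, std::size_t end) {
    auto handle = launch_add(id, a, b, sum, begin, end);
}  // <- blocks here until the task is done

// Overlapping variant: the caller owns both handles, so both tasks are
// launched before anything waits (the role played by buffers in main()).
void add_caller_owned(const std::vector<int>& a, const std::vector<int>& b,
                      std::vector<int>& sum) {
    const std::size_t mid = a.size() / 2;
    auto h0 = launch_add(0, a, b, sum, 0, mid);
    auto h1 = launch_add(1, a, b, sum, mid, a.size());
    h0.wait();
    h1.wait();
}
```

Calling add_scoped() once per half always finishes the first task before the second one starts, whereas add_caller_owned() lets the two halves overlap. Each task writes to a disjoint index range of sum, so the two workers do not race.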

Another option is to use sub-buffers, which are buffers created from a range of an existing buffer:
https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sub.section.memmodel.app:~:text=A%20buffer%20created%20from%20a%20range%20of%20an%20existing%20buffer%20is%20called%20a

Another option is to use USM, but then the user is responsible for copying the data back and forth themselves:
https://www.intel.com/content/www/us/en/developer/articles/code-sample/vector-add.html

Thanks.
Regards,
Aik Eu
