Forum Discussion
Hi @Christoph9,
Thank you for posting in the Intel community forum. I hope all is well, and apologies for the delayed response.
If I understand the situation correctly, what you want to achieve is passing data between kernels while they execute concurrently.
In that case, yes: pipes support non-blocking reads and writes.
This prevents the kernel from stalling while it waits for the FIFO buffer to become available.
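To illustrate the difference in semantics, here is a minimal host-side sketch in plain C++. It is only a model and not the Intel pipe API: `ModelPipe` is a hypothetical stand-in class. As far as I know, the real `sycl::ext::intel::pipe` exposes its non-blocking variants as overloads that take a `bool &success` argument, and the model mirrors that shape: the call returns immediately and reports through the flag whether data was actually transferred.

```cpp
#include <array>
#include <cstddef>

// Hypothetical host-side model of a fixed-capacity pipe (NOT the Intel API).
// It only demonstrates non-blocking semantics: calls never wait, they
// report through 'success' whether the transfer happened.
template <typename T, std::size_t Capacity>
class ModelPipe {
public:
    // Non-blocking write: if the FIFO is full, set success=false and return.
    void write(const T& value, bool& success) {
        if (count_ == Capacity) {
            success = false;
            return;
        }
        buf_[(head_ + count_) % Capacity] = value;
        ++count_;
        success = true;
    }

    // Non-blocking read: if the FIFO is empty, set success=false and
    // return a default-constructed value that the caller must ignore.
    T read(bool& success) {
        if (count_ == 0) {
            success = false;
            return T{};
        }
        T value = buf_[head_];
        head_ = (head_ + 1) % Capacity;
        --count_;
        success = true;
        return value;
    }

private:
    std::array<T, Capacity> buf_{};
    std::size_t head_ = 0;   // index of the oldest element
    std::size_t count_ = 0;  // number of elements currently stored
};
```

A caller would check the flag after each call and decide whether to retry or do other work, instead of being stalled until the FIFO has data or space.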
A more detailed explanation can be found in our optimization guide below, under section 4.3.1, Pipes Extension:
Ready-made pipes tutorials are also available for you to try, as below:
Hope that clarifies.
Best Wishes
BB
- Christoph9, 3 years ago
New Contributor
Hey BB,
thanks for your reply, no problems with the late response!
The problem is that I already use non-blocking pipes. The warnings result from reading from these pipes in e.g. a while loop:
    #define RANDVEC3 \
        vec3(sycl::ext::intel::pipe<rnd_out_pipe_id, float, 8>::read(), \
             sycl::ext::intel::pipe<rnd_out_pipe_id, float, 8>::read(), \
             sycl::ext::intel::pipe<rnd_out_pipe_id, float, 8>::read())

    vec3 random_in_unit_sphere() {
        vec3 p;
        do {
            p = 2.0f * RANDVEC3 - vec3(1.0f, 1.0f, 1.0f);
        } while (p.squared_length() >= 1.0f);
        return p;
    }

This results in the warning shown in the first post. The compiler seems to try to maintain the ordering of the reads from the pipes by using barriers, which cause the kernel to get stuck at runtime.
However, I use these pipes in an ND-range kernel for random numbers, so the order in which the reads take place is of no concern. I only used pipes for the random numbers because passing the random state itself through the various functions of the kernel, and therefore through its pipeline, made the area utilization explode during low-level synthesis with Quartus (the HLS estimate was much lower, one that should easily fit on the FPGA).
Therefore, it would be great if I could disable the compiler's enforcement of the read order, or if I could find another way to create random numbers efficiently deep inside my kernel. I did not find any articles on efficient random-number generation with DPC++ on FPGAs, so I have no ideas for finding or creating such an implementation.
Kind regards,
Christoph
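One possible direction for generating random numbers inside the kernel without a pipe is a counter-based generator, sketched below in plain C++. This is only an illustration under assumptions, not a tested FPGA solution: `mix32` uses the MurmurHash3 32-bit finalizer as a bit mixer, while `counter_rand` and its `stream` and `counter` parameters are hypothetical names. The idea is that each value is a pure function of an index pair (for example a work-item id and a per-call counter), so no generator state has to be threaded through the call chain. Whether this actually reduces area after synthesis, and whether the statistical quality is sufficient for the application, would have to be checked.

```cpp
#include <cstdint>

// MurmurHash3 32-bit finalizer, used here purely as a cheap bit mixer.
inline std::uint32_t mix32(std::uint32_t h) {
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

// Hypothetical helper: maps (stream, counter) to a float in [0, 1).
// 'stream' could be a work-item id, 'counter' an index of the draw.
// The same inputs always give the same output, so there is no state.
inline float counter_rand(std::uint32_t stream, std::uint32_t counter) {
    // Mix the stream id first so that nearby streams decorrelate,
    // then fold in the counter and mix again.
    std::uint32_t bits = mix32(mix32(stream) + counter);
    // Keep the top 24 bits: they convert to float exactly, and
    // scaling by 2^-24 yields a value in [0, 1).
    return static_cast<float>(bits >> 8) * (1.0f / 16777216.0f);
}
```

In a rejection loop such as `random_in_unit_sphere`, each iteration would draw three values with consecutive counters, so only a small integer counter is carried along instead of a full generator state.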