Forum Discussion

Christoph9's avatar
Christoph9
Icon for New Contributor rankNew Contributor
3 years ago

Non-Ordered Pipes for Random-Number-Generation

Hello,

I currently am trying to fit a design onto the Arria10 FPGA that requires random-numbers at multiple locations in the code, sometimes in sub-sub-subroutines.

First I tried to hand the state of the random-number generator. I use the engine used by DPCT. Here my autorun kernel showing how I use it:

class random_generator_kernel_id;

struct rnd_generator {
  void operator()() const {
    auto rand_state = dpct::rng::device::rng_generator<
        oneapi::mkl::rng::device::philox4x32x10<4>>(1984, { 0, 0 * 8 });
    while (1) 
    {
        sycl::ext::intel::pipe<rnd_out_pipe_id, float, 8>::write(
            rand_state
                .generate<oneapi::mkl::rng::device::uniform<float>,
                            1>());
    }
  }
};

fpga_tools::Autorun<random_generator_kernel_id> ar_rnd_gen{ds, rnd_generator{}};

However, this resulted in huge area-utilization during the low-level synthesis (the HLS gave me low estimates in the report, which should have fit easily).

I tried many things and came up with an idea: Use a second autorun kernel (as in the autorun-tutorial in the OneAPI samples repository) to generate the random numbers and put them in a pipe (as shown above).

My other kernel (ND-Range), can then just read from the pipe at multiple locations. In the OneAPI programming guide for FPGAs is noted that no order-guarantee can be given when using pipes with ND-range kernels, but this should be fine as I just use it for random numbers.

This compiles, but throws warnings like this for pipe-reads in loops:

raytracing.dp.cpp:170: Compiler Warning: Pipe ordering required barrier insertion in for.body.i.i of pipe_ZTS18RaytracingKernelID, but kernel may hang as a result
camera.h:33: Compiler Warning: Pipe ordering required barrier insertion in do.body.i.i.i.i of pipe_ZTS18RaytracingKernelID, but kernel may hang as a result
hitable_list.h:44: Compiler Warning: Pipe ordering required barrier insertion in for.body.i.i.i.i of pipe_ZTS18RaytracingKernelID, but kernel may hang as a result
raytracing.dp.cpp:122: Compiler Warning: Pipe ordering required barrier insertion in _ZNK12hitable_list3hitERK3rayffR10hit_record.exit.i.i.i of pipe_ZTS18RaytracingKernelID, but kernel may hang as a result
material.h:103: Compiler Warning: Pipe ordering required barrier insertion in do.body.i.i.i.i.i.i of pipe_ZTS18RaytracingKernelID, but kernel may hang as a result
material.h:103: Compiler Warning: Pipe ordering required barrier insertion in do.body.i.i105.i.i.i.i of pipe_ZTS18RaytracingKernelID, but kernel may hang as a result
raytracing.dp.cpp:170: Compiler Warning: Pipe ordering required barrier insertion in for.cond.i.i.i.i.preheader.UnifiedLatchBlock.switch of pipe_ZTS18RaytracingKernelID, but kernel may hang as a result

And as the compiler said, the kernel indeed gets stuck and hangs at execution.

So now my question: How can I prevent the compiler from establishing some kind of pipe ordering in loops, as I do not need this for the random-numbers?
And if this is not possible, is there another way for efficient random-number generation on an FPGA in SYCL without needing to pass the whole random-state to subroutines causing exploding area-utilization?

Thanks in advance for any suggestions,
Christoph

14 Replies