NDRange Kernels Global Memory Write Pattern

Honored Contributor

8 years ago

1) On the FPGA, the compiler by default builds a single deep pipeline, so a new work-item can enter and a finished one can leave on every clock cycle. Optionally, that pipeline can be vectorized so that several work-items enter per cycle, or the entire pipeline can be replicated so that different work-groups are processed simultaneously.
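As a rough sketch of those two options, assuming the Intel FPGA SDK for OpenCL kernel attributes: the kernel name `scale`, its arguments, and the values 64, 4 and 2 are made up for illustration and are not tuned for any device.

```c
// Hypothetical kernel: scales each element of a buffer.
// The attribute values below are illustrative assumptions only.
__attribute__((reqd_work_group_size(64, 1, 1)))  // fixed work-group size; must be divisible by the SIMD width
__attribute__((num_simd_work_items(4)))          // vectorize: 4 work-items enter the pipeline together
__attribute__((num_compute_units(2)))            // replicate: 2 pipeline copies, each handling whole work-groups
__kernel void scale(__global const float * restrict in,
                    __global float * restrict out,
                    const float factor)
{
    size_t gid = get_global_id(0);
    out[gid] = in[gid] * factor;
}
```

With neither `num_simd_work_items` nor `num_compute_units` you get the default single pipeline. Each of them costs extra FPGA resources, so it is worth checking the area report after adding either one.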

2) It depends on the implementation. Without vectorization, one work-item enters the pipeline and one leaves it every clock cycle. With vectorization, all 16 can be processed in parallel. The tradeoff is always performance versus FPGA resource use.

3) You would not want to do this. It is better to use a barrier to synchronize all the work-items and then write all of the work-item data to global memory in one shot.
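A minimal sketch of the barrier approach, assuming a work-group of 16 work-items and a placeholder computation. The kernel name `stage_then_write`, the `staged` buffer and the reversed read are hypothetical and only there to show why the barrier is needed.

```c
#define WG_SIZE 16  // assumed work-group size, matching the 16 work-items discussed above

__attribute__((reqd_work_group_size(WG_SIZE, 1, 1)))
__kernel void stage_then_write(__global const int * restrict in,
                               __global int * restrict out)
{
    __local int staged[WG_SIZE];   // on-chip staging buffer shared by the work-group
    size_t lid = get_local_id(0);
    size_t gid = get_global_id(0);

    // Each work-item produces its own result into local memory (placeholder math).
    staged[lid] = in[gid] * 2;

    // Wait until every work-item in the work-group has stored its result.
    barrier(CLK_LOCAL_MEM_FENCE);

    // Every work-item now writes one element to consecutive global addresses.
    // Reading a neighbour's slot (reversed order here) is safe only after the barrier.
    out[gid] = staged[WG_SIZE - 1 - lid];
}
```

Because each work-item writes `out[gid]`, the global writes of a work-group land on consecutive addresses, which is the pattern the compiler has the best chance of turning into wide accesses.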

4) Yes. The compiler will coalesce memory accesses where it can. If it can't, the optimization report will indicate what implementation was selected and why coalescing could not be performed.
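To illustrate the kind of difference the report tends to flag, here are two hypothetical kernels. Whether a given access is actually coalesced depends on the compiler version and the rest of the kernel, so treat this as an assumption to verify in the optimization report rather than a guarantee.

```c
// Consecutive work-items write consecutive addresses: a good candidate for coalescing,
// for example when the kernel is vectorized with num_simd_work_items.
__kernel void contiguous_write(__global int * restrict out)
{
    size_t gid = get_global_id(0);
    out[gid] = (int)gid;
}

// The stride is only known at run time, so the compiler cannot prove that
// neighbouring work-items touch neighbouring addresses.
__kernel void strided_write(__global int * restrict out, const int stride)
{
    size_t gid = get_global_id(0);
    out[gid * stride] = (int)gid;
}
```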
