NDRange Kernels Global Memory Write Pattern

Honored Contributor

8 years ago

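1) No. Try vectorization (num_simd_work_items) before compute unit replication (num_compute_units). Both consume additional resources, but num_simd_work_items consumes less, because it widens the datapath of a single pipeline rather than duplicating the entire compute unit.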
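As a rough illustration of that ordering, here is a minimal sketch of a kernel that is vectorized first, with replication left commented out as the fallback. The kernel name `scale`, its arguments, and the work-group size of 64 are made up for the example. The `#ifndef __OPENCL_VERSION__` shim (including `fake_gid`) is purely an assumption of this sketch so the same source compiles as plain C for a quick functional check; it is not part of the SDK.

```c
#ifndef __OPENCL_VERSION__
/* Host-side shim (assumption of this sketch, not part of the SDK): lets the
   kernel body compile as plain C. An OpenCL compiler skips this section. */
#include <stddef.h>
#define __kernel
#define __global
static size_t fake_gid; /* hypothetical stand-in for the work-item id */
static size_t get_global_id(unsigned dim) { (void)dim; return fake_gid; }
#endif

/* Step 1: vectorize. Four work-items pass through the pipeline together.
   num_simd_work_items needs a fixed work-group size that is evenly
   divisible by the SIMD width, hence reqd_work_group_size. */
__attribute__((reqd_work_group_size(64, 1, 1)))
__attribute__((num_simd_work_items(4)))
/* Step 2, only if resources remain after vectorizing:
   __attribute__((num_compute_units(2))) */
__kernel void scale(__global const float *in, __global float *out, float k)
{
    size_t gid = get_global_id(0);
    /* Consecutive work-items write consecutive addresses. */
    out[gid] = k * in[gid];
}
```

Whether the compiler actually widens the stores here is something to confirm in the optimization report rather than assume from the source.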

3) The penalty for synchronizing work-items before a memory access is much smaller than the cost of repeatedly accessing global memory. Again, check the optimization report and use the profiler to see the effect on your design.

4) All pipeline hardware is generated during offline compilation, so the load/store units are chosen at that point as well, including whether accesses can be coalesced.
