NDRange Kernels Global Memory Write Pattern

Honored Contributor

8 years ago

SIMD vectorization applies to the data passed into the kernel; it should only improve performance when your input data can actually be vectorized.

In your code, only the constant M is passed in, and it can't be vectorized. I would guess that's why the resource usage is the same.
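To illustrate the kind of kernel I mean, here is a rough sketch, not your actual code. The kernel name `scale` and its `in`/`out` buffers are made up for the example; `num_simd_work_items` and `reqd_work_group_size` are the Intel FPGA SDK for OpenCL kernel attributes, and the values 4 and 64 are arbitrary. The `#ifndef` shim is only there so the same file also builds as plain C and the indexing can be checked on the host.

```c
/* Built as OpenCL C for the device. The shim below is an assumption of this
   sketch: it lets the same file compile as plain C for a host-side check. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __kernel
#define __global
static size_t emu_gid;   /* emulated work-item id, set by the host loop */
static size_t get_global_id(unsigned dim) { (void)dim; return emu_gid; }
#define KERNEL_ATTRS
#else
/* The work-group size must be divisible by the SIMD width. */
#define KERNEL_ATTRS \
    __attribute__((reqd_work_group_size(64, 1, 1))) \
    __attribute__((num_simd_work_items(4)))
#endif

/* Hypothetical kernel: every work-item reads and writes its own element,
   so neighbouring work-items touch consecutive addresses. M is the same
   scalar for every work-item. */
KERNEL_ATTRS
__kernel void scale(__global const float *in, __global float *out, const int M)
{
    size_t gid = get_global_id(0);
    out[gid] = in[gid] * (float)M;
}
```

Here the per-work-item data (`in` and `out`, indexed by the global id) is the part a SIMD setting can work on, while `M` is just a constant shared by all work-items.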

If your goal is GPU-style parallel execution, you should experiment with the compute unit settings, although it's still not quite the same as a GPU in some respects.

Bottom line: you can launch the kernels separately, under different kernel names and on different queues. That way they definitely run in parallel :p
