Pipelining for loops in an NDRange kernel

Altera_Forum

Honored Contributor

8 years ago

When I use an NDRange kernel, that is, when my kernel code calls get_global_id, the report shows that it is an NDRange kernel and so it won't be pipelined.

Is there any way to pipeline an NDRange kernel? Would using compute units help?

Also, how can I measure stall time?

I have an NDRange kernel with global size (8,256,256) and local work-group size (1,64,64), but the computation performance is very low when I try to increase the local size further.

I think this is because, with 64x64 work-items per group, each work-item has to wait until all work-items in the same group have finished before the next 64x64 work-items can be launched.

Is that correct? And how can I measure the time they spend waiting?
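
For reference, the sizes quoted above can be turned into work-group counts with a little arithmetic. The following is only a minimal sketch in Python, not OpenCL code: the function name `work_group_layout` is hypothetical, and it assumes each global size is an exact multiple of the corresponding local size. It describes how the NDRange is split up and says nothing about how the hardware actually schedules the groups.

```python
from math import prod


def work_group_layout(global_size, local_size):
    """Return (groups per dimension, total groups, work-items per group).

    Hypothetical helper for illustration only; assumes every global size
    is an exact multiple of the matching local size.
    """
    if len(global_size) != len(local_size):
        raise ValueError("global and local sizes must have the same rank")
    for g, l in zip(global_size, local_size):
        if l <= 0 or g % l != 0:
            raise ValueError("global size must be a positive multiple of local size")
    # Number of work-groups along each dimension.
    groups = tuple(g // l for g, l in zip(global_size, local_size))
    return groups, prod(groups), prod(local_size)


# Sizes taken from the post: global (8,256,256), local (1,64,64).
groups, total_groups, items_per_group = work_group_layout((8, 256, 256), (1, 64, 64))
print(groups, total_groups, items_per_group)
```

With the sizes from the post this gives 8 x 4 x 4 = 128 work-groups of 1 x 64 x 64 = 4096 work-items each, so 524288 work-items in total.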
