Altera_Forum
Honored Contributor
9 years ago
Controlling NDRange kernel M20K RAM replication
Hello all,
Is there a way to limit the number of work groups executing simultaneously in a compute unit (NDRange kernel)? The compiler is replicating RAM so heavily (30x!) that my RAM resource usage is way over 100%, purely because of its replication scheme. Each work item has its own private memory, so the replication is not for banking purposes, and the compiler reports that the replication is to "efficiently support simultaneous workgroups". Does the compiler really prioritize performance over being able to build the kernel at all? I saw an excellent post earlier about single work item kernels and #pragma max_concurrency, which I didn't know about, and I am hoping there's something similar for NDRange kernels. Thanks in advance.