
Altera_Forum
8 years ago

M20K RAM block usage question

Suppose there are 8 work-groups, each containing 8 work-items.

I declare local memory in the kernel function:

__local float A[1000];

If I copy data into A from global memory, will that increase M20K RAM block usage?

Isn't the total M20K usage "local memory size * number of work-groups"?

A[1000] * 8

for (...)
{
    A[] = data from global memory
}

Thanks

1 Reply

  • Altera_Forum

    Not all work-groups run fully in parallel on the FPGA; the compiler decides how many work-groups can run in parallel. M20K utilization depends on three things: the number of accesses to the buffer per work-group (which depends on the code and can also be affected by the SIMD size), the number of work-groups running in parallel per compute unit (decided by the compiler), and the number of compute units (set by the user). The compiler report states explicitly why and how many times each local buffer is replicated, and what the total size will be.
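To make the factors in the reply concrete, here is a rough back-of-the-envelope sketch in plain C. It is not how the compiler actually sizes memory. The function names and the assumption that the three factors simply multiply are hypothetical simplifications, and the block count per copy is a capacity-only lower bound (20 Kbit per M20K) that ignores width/depth configurations. The real replication factors have to be read from the compiler report.

```c
#include <assert.h>
#include <stdio.h>

/* Capacity of one M20K block in bits (20 Kbit). */
#define M20K_BITS 20480UL

/* Blocks needed for ONE copy of the buffer, by capacity alone.
 * Assumption: ignores M20K width/depth modes, so this is a lower bound. */
unsigned long m20k_blocks_per_copy(unsigned long elems,
                                   unsigned long bytes_per_elem)
{
    unsigned long bits = elems * bytes_per_elem * 8UL;
    return (bits + M20K_BITS - 1UL) / M20K_BITS; /* ceiling division */
}

/* Hypothetical estimate: one copy multiplied by the three factors from the
 * reply. All three factors are inputs you would take from the report:
 *   port_replication       - copies added to serve the buffer's accesses
 *   concurrent_work_groups - work-groups in flight per compute unit
 *   compute_units          - number of compute units chosen by the user */
unsigned long m20k_estimate(unsigned long elems,
                            unsigned long bytes_per_elem,
                            unsigned long port_replication,
                            unsigned long concurrent_work_groups,
                            unsigned long compute_units)
{
    assert(port_replication >= 1UL);
    assert(concurrent_work_groups >= 1UL);
    assert(compute_units >= 1UL);

    return m20k_blocks_per_copy(elems, bytes_per_elem)
           * port_replication
           * concurrent_work_groups
           * compute_units;
}
```

Under these assumptions, `m20k_estimate(1000, 4, 1, 1, 1)` gives 2 blocks for a single copy of `A`. The "A[1000] * 8" case from the question corresponds to `m20k_estimate(1000, 4, 1, 8, 1)`, which gives 16, but that only holds if the compiler really keeps all 8 work-groups in flight at once, which the reply says is not guaranteed.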