Altera_Forum
Honored Contributor
8 years ago
Local memory buffers are replicated according to the number of accesses to them, so that those accesses can proceed in parallel. Unrolling a loop multiplies the number of accesses to the buffer unless they can be coalesced, and so raises the replication factor. To stop the replication, avoid unrolling the loop. Unrolling without replicating the local buffer will not improve performance in any case, since the extra accesses would then contend for the same memory ports. For more information, see "Intel FPGA SDK for OpenCL Best Practices Guide, 1.8.5 Optimize Accesses to Local Memory by Controlling the Memory Replication Factor" and "Intel FPGA SDK for OpenCL Programming Guide, 2.2 Kernel Attributes for Configuring Local Memory System".
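As a rough illustration of the point above, here is a minimal kernel sketch, not taken from your code. The kernel name, TILE, STRIDE and the access pattern are all made up for the example. The numbanks, bankwidth and singlepump attributes are written as I recall them from the Programming Guide section mentioned above, so check the exact spelling and their effect against your SDK version and the area report.

```c
// Hypothetical example: TILE, STRIDE and window_sum are illustrative names only.
#define TILE   64
#define STRIDE 8

__attribute__((reqd_work_group_size(TILE, 1, 1)))
__kernel void window_sum(__global const float *restrict in,
                         __global float *restrict out)
{
    // Request a simple geometry for the local buffer (assumed attribute
    // spelling; the compiler report shows what is actually built).
    __local float __attribute__((numbanks(1), bankwidth(4), singlepump)) tile[TILE];

    const int lid = get_local_id(0);
    const int gid = get_global_id(0);

    tile[lid] = in[gid];               // one write site
    barrier(CLK_LOCAL_MEM_FENCE);

    float acc = 0.0f;

    // Loop kept rolled: the body contains a single read site on tile[].
    // Replacing this with "#pragma unroll" would create eight strided read
    // sites that are unlikely to be coalesced, and the compiler would
    // normally replicate tile[] to serve them in parallel.
    #pragma unroll 1
    for (int k = 0; k < 8; k++) {
        acc += tile[(lid + k * STRIDE) % TILE];
    }

    out[gid] = acc;
}
```

Compiling once with "#pragma unroll 1" and once with full unrolling, then comparing the local memory entries in the report, should show the difference in replication for your own kernel.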