Forum Discussion
Altera_Forum
Honored Contributor
10 years ago

--- Quote Start ---
My guess is that the optimizer fails to do a good job with the size of (80, 80). Can the problem possibly be simplified for powers of two? Have you tried to implement the problem as a single work-item kernel? Those tend to be more efficient, and the compiler is more predictable.
--- Quote End ---

Thanks for the reply. What I actually want to know is whether it is OK to declare a local memory buffer (the same size as the work group) whose size is not a power of two. For example, with SIMD set to 8 for a matrix XOR kernel, a 128 * 128 local memory buffer per work group uses more than 100% of the memory blocks on the FPGA. So I would like to know whether it is possible to use an 80 * 80 local memory buffer while keeping SIMD at 8, so that the kernel uses more of the FPGA's memory blocks while staying below 100%.
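To put rough numbers on the two tile sizes, here is a minimal arithmetic sketch. It assumes 4-byte elements (the post does not state the element type), and the helper names `local_tile_bytes`, `next_pow2`, and `describe` are made up for illustration. The power-of-two figure only shows what the footprint would be if a toolchain padded the buffer up to the next power of two; it is not a statement about what the compiler actually does. The SIMD check assumes the SIMD width has to divide the work-group size evenly.

```python
def local_tile_bytes(rows, cols, elem_bytes=4):
    """Bytes needed for one rows x cols local buffer (elem_bytes is an assumption)."""
    return rows * cols * elem_bytes


def next_pow2(n):
    """Smallest power of two that is >= n."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def describe(side, simd=8, elem_bytes=4):
    """Summarize a square side x side tile that matches the work-group size."""
    elems = side * side
    return {
        "elements": elems,
        "bytes": local_tile_bytes(side, side, elem_bytes),
        # Hypothetical footprint if the buffer were padded to a power of two.
        "padded_bytes": next_pow2(elems) * elem_bytes,
        # Assumed requirement: SIMD width divides the work-group size evenly.
        "simd_divides": elems % simd == 0,
    }


if __name__ == "__main__":
    for side in (80, 128):
        print(side, describe(side))
```

Under these assumptions the 80 * 80 tile holds 6400 elements against 16384 for 128 * 128, so it needs about 39% of the storage, and 6400 is still divisible by 8. Whether the compiler maps a non-power-of-two buffer onto memory blocks efficiently is a separate question that only the compilation report can answer.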