Forum Discussion
HRZ
Frequent Contributor
6 years agoI am sure with 18.1.x, hyperflex optimization was automatically disabled with burst non-aligned LSUs. I never tested 19.0 or 19.1, but in 19.2 and above this does not seem to happen anymore. You can infer aligned coalesced ports if you avoid access coalescing using loop unrolling and instead use OpenCL vector variables or a struct with one array as its member with as many indexes as the unroll factor, so that you are technically just reading (and writing) one wide value each loop iteration. Of course this will only be feasible if all your memory accesses have a minimum alignment size equal to the width of the coalesced port (i.e. your offset in bytes should always be a multiple of the port size in bytes).