Forum Discussion

Altera_Forum
10 years ago

Question about kernel vectorization

Hi all,

I am trying to apply SIMD optimization to a simple vector copy kernel, A = b, where both vectors are in global memory.
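
For concreteness, a kernel of this kind might look like the sketch below. It is an illustration only, not the code from the original post: the kernel name `vec_copy`, the `float` element type, and the work-group size of 64 are hypothetical, and the SIMD(4) setting discussed in this post is assumed to correspond to the `num_simd_work_items(4)` attribute of the Altera SDK for OpenCL. The `#else` branch is a host-side stand-in (also hypothetical) so the same file compiles as plain C for a quick functional check.

```c
/* Sketch only. Written as OpenCL C for the Altera SDK; the #else branch
   adds host-side stand-ins so the same file also compiles as plain C. */
#ifdef __OPENCL_VERSION__
#define KERNEL_ATTRS \
    __attribute__((reqd_work_group_size(64, 1, 1))) \
    __attribute__((num_simd_work_items(4)))   /* assumed meaning of SIMD(4) */
#else
#include <stddef.h>
#define KERNEL_ATTRS
#define __kernel
#define __global
static size_t current_gid;                    /* hypothetical host stand-in */
static size_t get_global_id(unsigned dim) { (void)dim; return current_gid; }
#endif

KERNEL_ATTRS
__kernel void vec_copy(__global const float *restrict b,
                       __global float *restrict A)
{
    size_t i = get_global_id(0);  /* one element per work-item */
    A[i] = b[i];                  /* A = b, both vectors in global memory */
}
```

Changing the `4` to `8` would give the SIMD(8) variant; the work-group size has to stay divisible by the SIMD width.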

What I found is that with SIMD(4)/SIMD(8), the effective global memory throughput increases to 4.3x/8.4x that of the non-optimized code.

But I would expect the improvement in the ideal case to be limited to 4x/8x when using SIMD(4)/SIMD(8).

Why, then, does the improvement I measured exceed the theoretical ideal (by about 7.5% and 5%, respectively)?

Any suggestions are appreciated. Thanks.