Forum Discussion
Have you verified the functional correctness of your code using emulation? It seems the v19.3 compiler is optimizing out most of your code. However, v16.1.2, which I still use for my main development, behaves differently: it does not optimize out your code, and it generates very helpful warnings that could help you find the problem:
test.cl:28: Compiler Warning: Full unrolling of the loop is requested but the loop bounds cannot be determined. The loop is not unrolled in kernel matvec
test.cl:67: Compiler Warning: removing out-of-bounds accesses to presult
Even though v19.3 also generates the first warning, it does not generate the second one, which could in fact be the source of your problem. It is possible that v19.3 is assigning a value of zero to the out-of-bounds index, and since you are shifting the buffer, it assumes the whole buffer is being zeroed out and hence optimizes out the computation in your kernel. Maybe @MeiYanL_Intel can elaborate on why the newer versions of the compiler omit such critical warnings, forcing programmers to run in circles trying to discover issues in their code.
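To illustrate the kind of bug that second warning points at, here is a minimal sketch in plain C rather than OpenCL. It is not your kernel (I am only guessing at its structure); `SR_DEPTH`, `shift_reg` and `shift_accumulate` are hypothetical names, and the array stands in for something like `presult`. It shows a shift-register accumulation where the shift loop bound decides whether the last read stays inside the array.

```c
#include <stddef.h>

#define SR_DEPTH 8  /* hypothetical shift-register depth, chosen for illustration */

/* Sums n floats through a shift register of partial sums, a common way to
   relax the loop-carried dependency of a floating-point accumulation. */
float shift_accumulate(const float *in, size_t n)
{
    /* One extra slot at the tail receives the newest partial sum. */
    float shift_reg[SR_DEPTH + 1] = {0.0f};

    for (size_t i = 0; i < n; i++) {
        /* Add the new input to the oldest partial sum and store it at the tail. */
        shift_reg[SR_DEPTH] = shift_reg[0] + in[i];

        /* Shift everything down by one slot. The bound must be j < SR_DEPTH so
           that shift_reg[j + 1] never goes past index SR_DEPTH. Writing
           j <= SR_DEPTH here would read shift_reg[SR_DEPTH + 1], which is the
           kind of out-of-bounds access the older compiler warns about. */
        for (size_t j = 0; j < SR_DEPTH; j++) {
            shift_reg[j] = shift_reg[j + 1];
        }
    }

    /* Reduce the SR_DEPTH partial sums into the final result. */
    float sum = 0.0f;
    for (size_t j = 0; j < SR_DEPTH; j++) {
        sum += shift_reg[j];
    }
    return sum;
}
```

If your shift loop reads one element past the end of the buffer, fixing the bound (or enlarging the buffer by one element) should be enough to check whether this is what makes v19.3 drop the computation.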
Hi,
Thank you very much for taking the time to test my code.
As for functional correctness, the host code has a routine to verify the results, but apparently that routine also has a problem, since the kernel passed the test.
I see your point, thanks. I will change the inner loop to see if that fixes the problem.
A quick test (removing the shift register) confirms your point: without the shift register, the report shows some DSP usage.
The strange thing is that even the report contains no warning about this out-of-bounds access!
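For the host-side check, something along these lines is what I have in mind as a stricter test. This is only a sketch in plain C, not my actual host code: `matvec_ref` and `count_mismatches` are hypothetical helper names, and the row-major layout and relative tolerance are assumptions. The idea is to compare the device output against a CPU reference computed from nonzero inputs, so that an all-zero result from an optimized-out kernel cannot pass.

```c
#include <stddef.h>

/* CPU reference for y = A * x, with A stored row-major (assumed layout). */
void matvec_ref(const float *A, const float *x, float *y,
                size_t rows, size_t cols)
{
    for (size_t r = 0; r < rows; r++) {
        float acc = 0.0f;
        for (size_t c = 0; c < cols; c++) {
            acc += A[r * cols + c] * x[c];
        }
        y[r] = acc;
    }
}

/* Counts the elements of got[] that differ from want[] by more than a
   relative tolerance. The negated comparison also counts NaN as a mismatch. */
size_t count_mismatches(const float *got, const float *want,
                        size_t n, float tol)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        float diff = got[i] - want[i];
        if (diff < 0.0f) diff = -diff;
        float scale = want[i] < 0.0f ? -want[i] : want[i];
        if (scale < 1.0f) scale = 1.0f;
        if (!(diff <= tol * scale)) bad++;
    }
    return bad;
}
```

With nonzero test data, a device buffer that comes back as all zeros should produce one mismatch per row instead of silently passing.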