Forum Discussion
Altera_Forum
Honored Contributor
8 years ago

--- Quote Start ---
The latency of the pipeline depends on the operations that are carried out inside the loop, and on the loop exit condition. Each iteration of your loop performs one external memory read, one external memory write, multiple integer operations for addressing and calculation, and one final integer comparison for the loop exit condition. Based on the compiler's decisions, these operations require a minimum of 303 clocks to be carried out without stalling the pipeline.
--- Quote End ---

OK, now I get it :)

One last question: does the fact that the loop bound (in other words, the number of iterations) is not known when the kernel is compiled affect the performance or the generated circuit in any way? With this code, I'm assuming that some work-items will run more iterations than others, depending on their ID in the second dimension (row=get_global_id(1)).
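
To make the question concrete, here is a minimal plain-C sketch of the pattern described above. It is not the actual kernel from this thread: the triangular bound (`row + 1` iterations), the function names `work_item` and `run_all_rows`, and the `+ 1` calculation are all assumptions chosen for illustration. The `row` argument stands in for `get_global_id(1)`, and the outer loop in `run_all_rows` emulates the second NDRange dimension on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one work-item. `row` plays the role of
   get_global_id(1). The loop bound (row + 1) is an assumption made for
   illustration; it depends on the work-item ID, so it is only known at
   run time, not when the kernel is compiled. */
static size_t work_item(const int *in, int *out, size_t width, size_t row)
{
    size_t iterations = 0;

    assert(row < width); /* keep the assumed bound inside one row */

    for (size_t i = 0; i <= row; ++i) {   /* integer compare: exit condition */
        size_t idx = row * width + i;     /* integer addressing */
        out[idx] = in[idx] + 1;           /* one read, one write per iteration */
        ++iterations;
    }
    return iterations;
}

/* Emulates the second NDRange dimension by running each row in turn.
   Returns the total number of loop iterations executed by all rows. */
static size_t run_all_rows(const int *in, int *out, size_t width, size_t rows)
{
    size_t total = 0;
    for (size_t row = 0; row < rows; ++row)
        total += work_item(in, out, width, row);
    return total;
}
```

With this assumed bound, row 0 runs a single iteration while the last row runs `rows` iterations, which is the kind of imbalance between work-items the question is about.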