Altera_Forum
Honored Contributor
8 years ago
Loop unrolling deepens and widens the pipeline so that multiple iterations can be handled simultaneously; of course, this is only possible if there are no loop-carried dependencies between iterations. You can only fully unroll loops whose bounds are known at compile time. For such a loop, the compiler simply creates the logic needed to handle all the unrolled iterations simultaneously, and that is it. If the fully unrolled loop sits inside an outer loop, the iterations of the outer loop will still be pipelined. If, however, the fully unrolled loop is the only loop in your kernel, then the pipeline is traversed just once and execution finishes.
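To make the unrolling point concrete, here is a rough sketch in plain C rather than a real kernel. The names `WIDTH`, `scale_rows_rolled` and `scale_rows_unrolled` are made up for illustration and do not come from any Altera/Intel example. The inner loop has a compile-time bound and no dependency between iterations, so it is a candidate for full unrolling; the second function spells out by hand roughly what that unrolling amounts to. How the compiler actually builds the hardware is not shown here.

```c
#include <stddef.h>

/* Inner trip count, fixed at compile time (illustrative value). */
#define WIDTH 4

/* Rolled form: the inner loop processes one element per iteration.
 * Its bound is a compile-time constant and no iteration depends on
 * another, so it could be fully unrolled. */
void scale_rows_rolled(const int *in, int *out, size_t rows, int factor)
{
    for (size_t r = 0; r < rows; r++) {
        /* In an OpenCL kernel, "#pragma unroll" would go on this loop. */
        for (size_t c = 0; c < WIDTH; c++) {
            out[r * WIDTH + c] = in[r * WIDTH + c] * factor;
        }
    }
}

/* Hand-unrolled equivalent: the inner loop is gone and all WIDTH
 * operations appear side by side in the body of the outer loop.
 * The outer loop remains a loop (its bound is only known at run time). */
void scale_rows_unrolled(const int *in, int *out, size_t rows, int factor)
{
    for (size_t r = 0; r < rows; r++) {
        out[r * WIDTH + 0] = in[r * WIDTH + 0] * factor;
        out[r * WIDTH + 1] = in[r * WIDTH + 1] * factor;
        out[r * WIDTH + 2] = in[r * WIDTH + 2] * factor;
        out[r * WIDTH + 3] = in[r * WIDTH + 3] * factor;
    }
}
```

Both functions produce the same result; the difference is only in how much work sits in one pass through the loop body. In the unrolled form the four multiplications are independent, which is what lets them be implemented side by side, while the outer loop over `rows` is still there to be pipelined.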
I do not work for Intel; I have just been working with this compiler for over two years, so I have a basic idea of what is actually happening. Other than Altera's existing documents, there aren't any other solid and accurate documents. If you have access to publishers such as IEEE or ACM, there are a lot of papers to read on this subject, though. After two years, there is still a lot that even I don't fully understand; there isn't much else you can do when you are working with a closed-source commercial compiler.