Forum Discussion
Can you post the snippet of the report that explains why the II has been increased to 6? I am not convinced your issue is caused by the loop-carried dependency. I believe it comes from a load/store dependency caused by the latency of accessing Block RAM-based buffers, and that kind of dependency cannot be ignored with ivdep.
You will probably also be better off recovering "i" and "j" in the fused loop as follows, rather than by using "div":
i++;
if (i == ITEM_LENGTH)
{
i = 0;
j++;
}
Note that there is no need to reset "j" here since the loop will exit after ITEM_LENGTH * GROUP_SIZE iterations anyway.
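To make the comparison concrete, here is a minimal plain-C sketch (not the original kernel) that walks the fused iteration space once and checks that the counter-based recovery yields the same "i" and "j" as the div/mod form. The values of ITEM_LENGTH and GROUP_SIZE, the fused index "k", and the function name check_counter_recovery are assumptions made up for illustration; the thread does not give them.

```c
#include <assert.h>

/* Assumed sizes for illustration only; the thread does not state them. */
#define ITEM_LENGTH 8
#define GROUP_SIZE  4

/* Walks the fused loop once, recovering (i, j) with counters, and checks
 * each pair against the div/mod recovery. Returns the number of fused
 * iterations executed. */
int check_counter_recovery(void)
{
    int i = 0, j = 0, count = 0;

    for (int k = 0; k < ITEM_LENGTH * GROUP_SIZE; k++) {
        /* Reference recovery using modulo and division. */
        assert(i == k % ITEM_LENGTH);
        assert(j == k / ITEM_LENGTH);
        count++;

        /* Counter-based recovery: no divider is needed. */
        i++;
        if (i == ITEM_LENGTH) {
            i = 0;
            j++; /* "j" is never reset; the loop exits first. */
        }
    }

    /* After the last iteration "j" equals GROUP_SIZE, but it is not used again. */
    assert(j == GROUP_SIZE);
    return count;
}
```

Calling check_counter_recovery() should return ITEM_LENGTH * GROUP_SIZE without tripping any assertion. Whether the counter form actually helps scheduling in a given kernel would still need to be confirmed from the compiler report.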
Hi @HRZ
Thank you for your time helping me to find a solution to this problem - it is very much appreciated.
The reason for the II of 6 is not a mystery: it is, as you say, a (false) LD/ST memory dependency. If I compile the last example with just one of the loops, the II is scheduled at 1 and the code gives the correct answer. So the compiler is able to respect the ivdep in the face of the LD/ST dependency in that case. Why can't it do the same on the fused loop?
On the subject of the i, j recovery, you are right that this could be done differently (as it is in the full design), but it is not relevant to the issue being explored here.