MGRAV
New Contributor
4 years ago
Hi @Mickleman,
I am not sure, but I assume the problem is the way you get your i and j out of the division and the modulo.
I imagine you could rewrite it as follows (this does basically the same thing, without branching):
uint16_t* c=(uint16_t*)a;
bool test=(k<GROUP_SIZE) ;
int i=k / GROUP_SIZE;
c[k]= (c[k-1]+i)*(!test)+(test)* (k-i*GROUP_SIZE);
You can see why the compiler does not spot the opportunity to parallelize over GROUP_SIZE: c[k] depends on c[k-1], so consecutive elements cannot be computed independently.
I think that to get it vectorized automatically you should permute your array, a[j][i] ==> a[i][j], so that the dependency inside the loop looks like c[k-GROUP_SIZE].
I am not sure whether I am explaining this clearly.
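To make the idea more concrete, here is a rough sketch of what I mean by a dependency of the form c[k-GROUP_SIZE]. It is only an illustration under my own assumptions: I don't have your exact kernel, so the function name fill_permuted, the rows parameter, the value GROUP_SIZE = 4 and the recurrence itself (previous row plus the row index i) are stand-ins, not your real code. The point is just that each element only reads the element GROUP_SIZE positions earlier, so the GROUP_SIZE iterations of the inner loop do not depend on each other.

```c
#include <stddef.h>
#include <stdint.h>

#define GROUP_SIZE 4 /* assumed value, for illustration only */

/* Hypothetical stand-in kernel: c holds rows * GROUP_SIZE elements, rows >= 1.
 * Row 0 is initialised to 0..GROUP_SIZE-1; every later element is computed
 * from the element GROUP_SIZE positions before it (same lane, previous row). */
void fill_permuted(uint16_t *c, size_t rows)
{
    /* First row: no dependency at all. */
    for (size_t j = 0; j < GROUP_SIZE; ++j)
        c[j] = (uint16_t)j;

    /* Later rows: the chain runs along i (outer loop, sequential). */
    for (size_t i = 1; i < rows; ++i) {
        /* Inner loop: each j reads only c[k - GROUP_SIZE], which belongs to
         * the previous row, so the GROUP_SIZE iterations are independent. */
        for (size_t j = 0; j < GROUP_SIZE; ++j) {
            size_t k = i * GROUP_SIZE + j;
            c[k] = (uint16_t)(c[k - GROUP_SIZE] + i);
        }
    }
}
```

With this layout the sequential part is the outer loop over i, and the inner loop over j is the one a compiler has a chance to vectorize or unroll, because no iteration of j reads a value written by another iteration of j.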