OpenCL compiler warning about barrier although no barrier in the code
Hi,
I am getting the warning "compiler warning: limiting to 2 concurrent work_groups because threads might reach barrier out-of-order" when compiling my kernel. Here is part of my code:
__attribute__((reqd_work_group_size(10,1,1)))
__kernel void kernel1(__global int* restrict flag,
                      __global int* restrict start,
                      __global int* restrict end,
                      __global volatile int* restrict producer_data){
    int gid = get_global_id(0);
    int sum = 0;
    if(flag[gid] == 1){
        for(int i = start[gid]; i < end[gid]; i++){
            sum++;
            producer_data[gid] = sum;
        }
    }
    write_channel_altera(c_id, gid);
}
I know the compiler generates this warning when there is a barrier after a for loop whose bounds depend on the global ID, as the bounds of my loop do. However, there is no barrier in my code.
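For comparison, this is the kind of kernel where I would expect the warning. It is only a hypothetical sketch, not my real code: the names kernel_with_barrier, count, out and scratch are made up for illustration. The loop trip count depends on the global ID, and an explicit barrier follows the loop:

```c
// Hypothetical sketch, not the real kernel: all names here are invented.
__attribute__((reqd_work_group_size(10,1,1)))
__kernel void kernel_with_barrier(__global const int* restrict count,
                                  __global int* restrict out){
    // One slot per work-item; the size matches the required work-group size.
    __local int scratch[10];

    int gid = get_global_id(0);
    int lid = get_local_id(0);
    int sum = 0;

    // Trip count differs per work-item, so work-items leave the loop
    // at different times.
    for(int i = 0; i < count[gid]; i++){
        sum++;
    }
    scratch[lid] = sum;

    // Explicit barrier after the data-dependent loop.
    barrier(CLK_LOCAL_MEM_FENCE);

    // Read a neighbour's value, which is only safe after the barrier.
    out[gid] = scratch[(lid + 1) % 10];
}
```

My kernel has the same kind of loop, but a channel write where this sketch has the barrier.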
Is it because OpenCL enforces a work-item order for the channel writes? I tried using locks around the channel write, but it didn't work. How should I write the code when different work-items write to the channel in a nondeterministic order?