Forum Discussion
Altera_Forum
12 years ago
I am running the above OpenCL code on a Terasic DE5-Net board to test atomic operations on the FPGA. The local and global NDRange sizes are both 128, num_elem is 512, and the design has only one work-group.
global_offset[0] turns out to be 508 (it should be 512). However, the same code works correctly on a GPU.

About the lines "int addr = atomic_add(&global_offset[0], 1); global_offset[addr + 1] = i": I want to preserve the atomic_add sequence of the 128 work-items. I do not think this is unsafe, since the variable "addr" is private. It is normal for code to have race conditions, and the hardware should still preserve correctness. Further, are there any rules that atomic operations on global memory must obey to guarantee correctness?

BTW:
1. If I remove "global_offset[addr + 1] = i", global_offset[0] is correct: 512.
2. With atomic_add on local memory, the result is also correct.

About "OpenCL handles atomics internally so it doesn't need to rely on the interconnect to provide this": if two work-groups perform an atomic operation on the same global address at the same time, what guarantees correctness?
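To make the expected behaviour concrete, here is a minimal host-side sketch in plain C11 of the pattern described above. It is not the kernel from this thread: the function names `run_model` and `slots_are_permutation`, the strided loop over `num_elem`, and the sequential emulation of the work-items are all assumptions made for illustration. It only models what the two quoted lines should produce if every atomic_add returns a unique old value; it cannot reproduce the FPGA result of 508.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical model, not the original kernel.
 * buf[0] plays the role of global_offset[0] (the shared counter);
 * buf[1..num_elem] are the slots written through addr + 1. */
int run_model(atomic_int *buf, int num_items, int num_elem)
{
    atomic_store(&buf[0], 0);

    /* Emulate work-items gid = 0..num_items-1 one after another.
     * The strided loop is an assumption about how 128 work-items
     * cover 512 elements. */
    for (int gid = 0; gid < num_items; ++gid) {
        for (int i = gid; i < num_elem; i += num_items) {
            /* Like OpenCL atomic_add, atomic_fetch_add returns the OLD value,
             * so every caller obtains a distinct addr. */
            int addr = atomic_fetch_add(&buf[0], 1);
            assert(addr >= 0 && addr < num_elem);

            /* The store targets slot addr + 1, never slot 0,
             * so it should not disturb the counter. */
            atomic_store(&buf[addr + 1], i);
        }
    }
    return atomic_load(&buf[0]);
}

/* Returns 1 if slots 1..num_elem hold each index 0..num_elem-1 exactly once. */
int slots_are_permutation(atomic_int *buf, int num_elem)
{
    for (int v = 0; v < num_elem; ++v) {
        int hits = 0;
        for (int s = 1; s <= num_elem; ++s)
            hits += (atomic_load(&buf[s]) == v);
        if (hits != 1)
            return 0;
    }
    return 1;
}
```

Calling `run_model(buf, 128, 512)` on an array of 513 `atomic_int` values should leave the counter at 512 and the remaining slots holding each element index once. Under this model the store to `addr + 1` can never reach slot 0, so a counter of 508 on the board suggests that the store and the atomic to the same buffer are interacting in the generated hardware rather than in the source-level logic.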