Altera_Forum
Honored Contributor
7 years ago

There is no cache consistency in Intel's basic cache implementation on FPGAs; each cache is private to a single global memory access, as the area report also mentions. Furthermore, OpenCL does not guarantee global memory consistency until kernel execution has finished. Hence, if you try to share a READ_WRITE global buffer between two or more kernels running in parallel, you will likely get incorrect output due to race conditions, unless no two kernels ever write to or read from the same location (in which case you can just split the original shared buffer into multiple non-shared buffers). You can probably try atomic operations, but they will be extremely slow.
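
To make the buffer-splitting suggestion concrete, here is a rough sketch, not taken from any real design. The kernel names (producer_a, producer_b), the buffer names and the single work-item loop style are all made up for illustration. The small macro shim at the top is also just an assumption, there only so the same file builds as plain C for a quick sanity check on the host.

```c
/* Shim (assumption): lets this file build as plain C for a host-side check.
   When compiled as OpenCL C, __OPENCL_VERSION__ is defined and this is skipped. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#endif

/* Hypothetical kernel A: writes only to its own buffer, buf_a.
   No other kernel reads or writes buf_a while this one is running. */
__kernel void producer_a(__global int *restrict buf_a, int n) {
    for (int i = 0; i < n; i++) {
        buf_a[i] = 2 * i;      /* even values */
    }
}

/* Hypothetical kernel B: writes only to its own buffer, buf_b.
   Because buf_a and buf_b never overlap, the private per-access
   caches of the two kernels cannot hold stale copies of each other's data. */
__kernel void producer_b(__global int *restrict buf_b, int n) {
    for (int i = 0; i < n; i++) {
        buf_b[i] = 2 * i + 1;  /* odd values */
    }
}
```

On the host you would create two separate cl_mem buffers instead of one READ_WRITE buffer, pass one to each kernel, and only read the results back after both kernels have finished. If the kernels actually need to see each other's writes while running, this does not help, and you are back to the race condition described above.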