Altera_Forum
Honored Contributor
9 years ago
Sharing of local_memory between Work Items on SoC FPGA (Cyclone V)
According to the OpenCL specification:

1. Variables declared in the __local address space inside a __kernel function are allocated for each work-group executing the kernel.
2. Such variables are allocated in local memory and are shared by all work-items of a work-group.

Consider the following kernel for an image of 512 x 512 pixels:

#define W 512
#define H 512
#define global_size_x 512
#define global_size_y 512
#define local_size_x 512
#define local_size_y 1

__attribute__((reqd_work_group_size(local_size_x, local_size_y, 1))) // dimensions
__kernel void sobel_kernel(__global unsigned char * restrict image_in,
                           __global unsigned char * restrict image_out)
{
    __local int n;
    int sum;

    // Index of the pixel
    __private short int row_id = get_global_id(1);
    __private short int col_id = get_local_id(0);

    sum = image_in[(row_id) * W + (col_id)]; // read pixel from global memory
    n = n + 1; // update local value

    barrier(CLK_GLOBAL_MEM_FENCE);
    barrier(CLK_LOCAL_MEM_FENCE);

    if ((row_id < 10) && (col_id < 10))
        printf("\n%d", n); // observe local value

    // global mem write transaction
    image_out[(row_id) * W + (col_id)] = sum;
}

When the above kernel was compiled and run:

1. On the emulator in Linux (default s5_ref board), the output printed for n was 1 2 3 4 5 ...
2. When compiled with the Intel FPGA SDK and the .aocx file deployed on the Cyclone V SoC FPGA, the values printed for n were always 1 1 1 1 1 ...

Can someone explain why the local variable does not appear to be shared by all the work-items in the work-group, and why the emulator and the FPGA may show different results?

Thanks
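To make the two observed outputs concrete, here is a small host-side C model of the statement n = n + 1 under two different execution orders. It is only a sketch under stated assumptions: the function names run_sequential and run_lockstep, the MAX_ITEMS limit, and the starting value of 0 for n are all hypothetical (the kernel above never initializes n), and the model does not claim to describe how the emulator or the FPGA hardware actually schedules work-items. It only shows that an unsynchronized read-modify-write on a shared variable can give either pattern, depending on how the reads and writes interleave.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ITEMS 512  /* hypothetical upper bound, matches local_size_x above */

/* Model A (assumption): work-items run one after another, and each one
   finishes its read-modify-write before the next one starts.
   observed[i] receives the value work-item i sees right after its update.
   Returns the final value of the shared counter. */
int run_sequential(int *observed, size_t work_items)
{
    assert(observed != NULL && work_items <= MAX_ITEMS);
    int n = 0;  /* shared counter, assumed to start at 0 */
    for (size_t i = 0; i < work_items; ++i) {
        n = n + 1;
        observed[i] = n;
    }
    return n;
}

/* Model B (assumption): every work-item reads the counter before any
   work-item writes it back, so all of them start from the same stale value. */
int run_lockstep(int *observed, size_t work_items)
{
    assert(observed != NULL && work_items <= MAX_ITEMS);
    int n = 0;  /* shared counter, assumed to start at 0 */
    int seen[MAX_ITEMS];

    /* Phase 1: all work-items read the shared counter. */
    for (size_t i = 0; i < work_items; ++i)
        seen[i] = n;

    /* Phase 2: all work-items write back their own (stale value + 1). */
    for (size_t i = 0; i < work_items; ++i) {
        n = seen[i] + 1;
        observed[i] = n;
    }
    return n;
}
```

Calling run_sequential on five work-items fills the array with 1 2 3 4 5, while run_lockstep fills it with 1 1 1 1 1, which are the two patterns reported above. In OpenCL itself, getting a well-defined count would presumably require one work-item to initialize n, a barrier, and then an atomic update such as atomic_inc; whether local atomics are available on a given toolchain and board is something to verify.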