Altera_Forum
9 years ago

Sharing of local memory between work-items on SoC FPGA (Cyclone V)

According to the OpenCL specification:

1. Variables declared in the __local address space inside a __kernel function are allocated once for each work-group executing the kernel.

2. Variables allocated in local memory are shared by all work-items of a work-group.

Consider the following kernel for an image of 512 x 512 pixels:

#define W 512
#define H 512
#define global_size_x 512
#define global_size_y 512
#define local_size_x 512
#define local_size_y 1

__attribute__((reqd_work_group_size(local_size_x, local_size_y, 1))) // work-group dimensions
__kernel void sobel_kernel(__global unsigned char * restrict image_in,
                           __global unsigned char * restrict image_out)
{
    __local int n;   // shared by the work-group; never initialized
    int sum;

    // Index of the pixel
    __private short int row_id = get_global_id(1);
    __private short int col_id = get_local_id(0);

    sum = image_in[row_id * W + col_id]; // read global memory into a private variable
    n = n + 1;                           // update local value (unsynchronized read-modify-write)

    barrier(CLK_GLOBAL_MEM_FENCE);
    barrier(CLK_LOCAL_MEM_FENCE);

    if ((row_id < 10) && (col_id < 10))
        printf("\n%d", n); // observe local value

    // global memory write transaction
    image_out[row_id * W + col_id] = sum;
}

When the above kernel is compiled and run:

1. On the emulator in Linux (default s5_ref board), the values printed for n were 1 2 3 4 5 ...

2. With the Intel FPGA SDK, with the .aocx file deployed on the Cyclone V SoC FPGA, the values printed for n were always 1 1 1 1 1 ...
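To make the two outcomes concrete, here is a minimal plain C model of the n = n + 1 update under two orderings of the work-items. This is only a sketch, not a description of what the emulator or the Cyclone V hardware actually does: the names run_serial, run_lockstep and ITEMS are hypothetical, the barriers are ignored, and n is assumed to start at 0 even though the kernel never initializes it. It only shows that an unsynchronized read-modify-write on a shared variable can legitimately produce either sequence.

```c
#define ITEMS 5  /* hypothetical: number of work-items modeled */

/* Serial ordering: each work-item completes its read-modify-write of n
   before the next one starts. seen[i] is the value of n right after
   work-item i has written it. */
void run_serial(int seen[ITEMS]) {
    int n = 0;  /* assumption: n starts at 0 */
    for (int i = 0; i < ITEMS; ++i) {
        int tmp = n;   /* read  */
        n = tmp + 1;   /* write */
        seen[i] = n;
    }
}

/* Lock-step ordering: every work-item reads n before any of them writes,
   so all of them compute the same result and overwrite each other.
   seen[i] is the value of n that work-item i observes after all writes. */
void run_lockstep(int seen[ITEMS]) {
    int n = 0;  /* assumption: n starts at 0 */
    int tmp[ITEMS];
    for (int i = 0; i < ITEMS; ++i) tmp[i] = n;      /* all reads  */
    for (int i = 0; i < ITEMS; ++i) n = tmp[i] + 1;  /* all writes */
    for (int i = 0; i < ITEMS; ++i) seen[i] = n;
}
```

Calling run_serial fills the array with 1 2 3 4 5, while run_lockstep fills it with 1 1 1 1 1, which matches the two sequences listed above.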

Can someone explain why the __local variable does not appear to be shared by all the work-items in the work-group, and why the emulator and the FPGA show different results?

Thanks