[Kernel vectorization] Loads/Stores cannot be vectorized

Question

Hi guys!

The compiler reports the following warning for the kernel below:

Compiler Warning: Vectorized kernel contains loads/stores that cannot be vectorized. This might reduce performance.

Which loads/stores are causing this?

--- Quote Start ---

__attribute__((num_simd_work_items(16)))
__attribute__((reqd_work_group_size(32,1,1)))
__kernel void int_loop(__global const short * restrict a,
                       __constant const bool * restrict b,
                       __global int * restrict group_counters,
                       __local int * restrict local_counter,
                       const int base, const int base2)
{
    __private uint local_id = get_local_id(0);
    __private uint group_size = get_local_size(0);
    __private uint global_id = get_global_id(0);

    __private int g1 = a[base + global_id];
    __private int comparison;
    __private int wi_counter;

    comparison = (g1 == a[base2 + global_id]);
    wi_counter = comparison & b[global_id] & (g1 != 0);

    local_counter[local_id] = wi_counter;

    for(uint stride = group_size >> 1; stride > 0; stride = stride >> 1)
    {
        barrier(CLK_LOCAL_MEM_FENCE);

        if(local_id < stride)
            local_counter[local_id] += local_counter[local_id + stride];
    }

    if(local_id == 0)
        group_counters[get_group_id(0)] = local_counter[0];
}

--- Quote End ---

altera_forum · Answer

The likely cause is the following part:

    if(local_id == 0)
        group_counters[get_group_id(0)] = local_counter[0];

SIMD vectorizes operations at the work-item level and coalesces accesses that are consecutive at that level. Since this particular write is performed by only one work-item per work-group, it cannot be vectorized or coalesced.
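To make the difference between the two kinds of access concrete, here is a rough host-side sketch in plain C. It is only a toy model and not how the offline compiler actually analyzes the kernel: the rule in `lanes_coalesce` (every lane active, each lane touching the element right after the previous lane's) is an assumption used for illustration, and the function names are made up for this example. It also assumes no global work offset, so that the group ID is simply `global_id / 32`.

```c
#include <stdbool.h>

#define SIMD_WIDTH 16  /* mirrors num_simd_work_items(16) */
#define GROUP_SIZE 32  /* mirrors reqd_work_group_size(32,1,1) */

/* Toy rule (an assumption, not the compiler's real analysis): an access can
   become one wide access only if every lane is active and lane i touches
   the element right after the one lane i-1 touches. */
static bool lanes_coalesce(const int index[], const bool active[])
{
    for (int lane = 0; lane < SIMD_WIDTH; ++lane) {
        if (!active[lane])
            return false;
        if (lane > 0 && index[lane] != index[lane - 1] + 1)
            return false;
    }
    return true;
}

/* Models `a[base + global_id]` for the 16 work-items that start at
   first_global_id. */
bool global_load_coalesces(int base, int first_global_id)
{
    int index[SIMD_WIDTH];
    bool active[SIMD_WIDTH];
    for (int lane = 0; lane < SIMD_WIDTH; ++lane) {
        index[lane] = base + first_global_id + lane;
        active[lane] = true;  /* unconditional: every work-item loads */
    }
    return lanes_coalesce(index, active);
}

/* Models `if(local_id == 0) group_counters[get_group_id(0)] = ...`. */
bool group_store_coalesces(int first_global_id)
{
    int index[SIMD_WIDTH];
    bool active[SIMD_WIDTH];
    for (int lane = 0; lane < SIMD_WIDTH; ++lane) {
        int global_id = first_global_id + lane;
        int local_id = global_id % GROUP_SIZE;
        index[lane] = global_id / GROUP_SIZE;  /* stands in for get_group_id(0) */
        active[lane] = (local_id == 0);        /* one storing work-item per group */
    }
    return lanes_coalesce(index, active);
}
```

Under this model, `global_load_coalesces(100, 0)` returns true because all 16 lanes read consecutive elements of `a`, while `group_store_coalesces(0)` returns false because only the lane with `local_id == 0` writes and the other lanes are masked off.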

altera_forum · Answer

--- Quote Start ---

The likely cause is the following part:

    if(local_id == 0)
        group_counters[get_group_id(0)] = local_counter[0];

SIMD vectorizes operations at the work-item level and coalesces accesses that are consecutive at that level. Since this particular write is performed by only one work-item per work-group, it cannot be vectorized or coalesced.

--- Quote End ---

Always right, mate :) Thanks, HRZ.
