Forum Discussion

Altera_Forum
Honored Contributor
10 years ago

The effect of conditional branch on performance

Hi all,

According to the Altera AOCL optimization guide, a forward branch (if-then-else) does not hurt performance. I am trying to measure its actual effect using a vectorAdd kernel.

The two versions of the code are:

version 1 (no branch):

__kernel void
__attribute((reqd_work_group_size(256,1,1)))
__attribute((num_simd_work_items(4)))
vectorAdd(__global const uint *x,
          __global const uint *y,
          __global uint *restrict z)
{
    // get index of the work item
    int index = get_global_id(0);

    // add the vector elements
    z[index] = x[index] + y[index];
}

version 2 (conditional branch):

__kernel void
__attribute((reqd_work_group_size(256,1,1)))
vectorAdd(__global const uint *x,
          __global const uint *y,
          __global uint *restrict z)
{
    // get index of the work item
    int index = get_local_id(0);
    int block_id = get_group_id(0);

    // add the vector elements
    if(index < 64) {
        z[4*index+256*block_id] = x[4*index+256*block_id] + y[4*index+256*block_id];
        z[4*index+1+256*block_id] = x[4*index+1+256*block_id] + y[4*index+1+256*block_id];
        z[4*index+2+256*block_id] = x[4*index+2+256*block_id] + y[4*index+2+256*block_id];
        z[4*index+3+256*block_id] = x[4*index+3+256*block_id] + y[4*index+3+256*block_id];
    }
}

I manually disable some of the work-items in each work-group by adding a condition. If a forward branch does not hurt performance, then these two versions should give similar results, right?
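
As a sanity check on that comparison, here is a minimal host-side sketch in plain C (not OpenCL) that emulates only the index arithmetic of the two kernels. It assumes a hypothetical NDRange of 4 work-groups; the helper names mark_version1, mark_version2 and same_coverage are made up for illustration. It does not model the FPGA pipeline, memory coalescing, or num_simd_work_items, so it cannot explain timing by itself.

```c
#include <assert.h>
#include <string.h>

#define GROUP_SIZE 256                  /* reqd_work_group_size(256,1,1) in both kernels */
#define NUM_GROUPS 4                    /* assumption: small NDRange for illustration */
#define N (GROUP_SIZE * NUM_GROUPS)     /* total number of vector elements */

/* Version 1: every work-item writes exactly one element, z[global_id].
 * Returns the number of work-items that do useful work. */
int mark_version1(unsigned char *written)
{
    int active = 0;
    for (int group = 0; group < NUM_GROUPS; ++group) {
        for (int local = 0; local < GROUP_SIZE; ++local) {
            int idx = group * GROUP_SIZE + local;   /* get_global_id(0) */
            assert(idx < N);
            written[idx]++;
            ++active;
        }
    }
    return active;
}

/* Version 2: only local ids 0..63 pass the branch; each writes 4 elements.
 * Returns the number of work-items that do useful work. */
int mark_version2(unsigned char *written)
{
    int active = 0;
    for (int group = 0; group < NUM_GROUPS; ++group) {
        for (int local = 0; local < GROUP_SIZE; ++local) {
            if (local < 64) {
                for (int k = 0; k < 4; ++k) {
                    int idx = 4 * local + k + 256 * group;
                    assert(idx < N);
                    written[idx]++;
                }
                ++active;
            }
        }
    }
    return active;
}

/* Returns 1 if both versions write every element the same number of times. */
int same_coverage(void)
{
    unsigned char w1[N] = {0};
    unsigned char w2[N] = {0};
    mark_version1(w1);
    mark_version2(w2);
    return memcmp(w1, w2, N) == 0;
}
```

Under these assumptions same_coverage() returns 1, so both versions write the same elements exactly once. The difference is in how the work is distributed: version 1 keeps all 256 work-items of a group busy with one element each, while version 2 keeps 64 busy with four elements each and leaves the other 192 idle.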

But experiments show that version 1 is much faster than version 2 (about 5x). Have I misunderstood what the optimization guide means by "forwarding branch will not impact the performance negatively"? Or is there something else in the code that leads to the different results?

Thanks.

2 Replies

  • Altera_Forum
    Honored Contributor

    What happens if you also specify the num_simd_work_items for version 2?

    --- Quote Start ---

    __attribute((num_simd_work_items(4))).

    --- Quote End ---