Forum Discussion

Altera_Forum
8 years ago

How to understand burst size info from the profiler

Hi,

The profiler is showing me the following measurements for the read on the "K_contributors" global arg:

Bandwidth: 0.1 MB/s, 100 % efficiency

Average Burst Size: 2.0

(Max Burst size: 16 )


// THIS IS A SINGLE WORK-ITEM KERNEL
#define MAX_CONTRIBUTORS 8128
void Krnl_IntraE(...
         __global const char3* restrict K_contributors,
)
{
    __local char3 localcache[MAX_CONTRIBUTORS];
    for (ushort i=0; i<MAX_CONTRIBUTORS; i++) {
         localcache[i] = K_contributors[i];
    }
...
}

Since the loop index "i" is incremented consecutively, I expected burst sizes larger than 2.

Is there any explanation for this?

2 Replies

  • Altera_Forum

    You should unroll the loop so that the compiler infers a wider port to memory, which allows a larger burst size. Little to no runtime coalescing is done for single work-item kernels, so you should not expect a large burst size without unrolling just because the accesses are consecutive.
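
The unrolling suggested above could look roughly like the sketch below, based on the load loop from the question. The kernel name `copy_contributors`, the `out` argument, and the unroll factor of 16 are hypothetical and not from the original post. The `#ifndef __OPENCL_VERSION__` shims exist only so the sketch also builds as plain C for a host-side functional check. Whether the compiler actually infers a wider port, and what burst size results, depends on the tool version and the board, so the profiler report should be checked again after the change.

```c
/* Shims so this OpenCL C sketch also builds as plain C on a host.
   An OpenCL compiler defines __OPENCL_VERSION__ and skips this block. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#define __local
/* In OpenCL C a char3 occupies 4 bytes (same size as char4). */
typedef struct { signed char x, y, z, pad; } char3;
#endif

#define MAX_CONTRIBUTORS 8128

/* Hypothetical single work-item kernel; the name and the 'out' argument
   are illustrative assumptions, not part of the original post. */
__kernel void copy_contributors(__global const char3 *restrict K_contributors,
                                __global char3 *restrict out)
{
    __local char3 localcache[MAX_CONTRIBUTORS];

    /* Assumed unroll factor of 16 (8128 = 16 * 508). Unrolling replicates
       the load 16 times per iteration, which gives the compiler the chance
       to merge the consecutive accesses into one wider memory port.
       A plain C host compiler may simply ignore this pragma. */
    #pragma unroll 16
    for (int i = 0; i < MAX_CONTRIBUTORS; i++) {
        localcache[i] = K_contributors[i];
    }

    /* Write the cache back out so the loads have an observable effect. */
    for (int i = 0; i < MAX_CONTRIBUTORS; i++) {
        out[i] = localcache[i];
    }
}
```

With 4 bytes per `char3`, an unroll factor of 16 requests 64 bytes per loop iteration instead of 4. The factor is a trade-off: larger values widen the access further but also use more FPGA resources.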

  • Altera_Forum

    1. Why is the Bandwidth 0.1 MB/s? Is there something wrong with the profiler? I also encounter this problem in Quartus 17.0.

    2. My kernel code looks like this:

    typedef struct {
        float a[20];
    } A;

    __kernel void foo(__global A *data) {
        A localdata[100];
        // 'index' is not defined in the posted snippet
        for (int i = 0; i < 100; i++) {
            localdata[i] = data[i + index];
        }
    }

    I expect each memory access to be a burst-coalesced read of 20 floats from global memory, so the Average Burst Size should be larger than 1.

    However, the profiler shows an Average Burst Size of only 4~6. How can I increase my memory access efficiency?