Data transfer via channels is taking too long
Hi All,
I am working on an application in which I transfer data from one kernel to another, autorun kernel. In the producer kernel, I copy the data from global memory into a local array before sending it through the channel.
Please refer to the code below.
However, the data transfer alone takes approximately 8-9 ms.
Could you please suggest ways to optimize this? I also have two questions:
- The data we pass through the channel is local. Does it reside in local memory?
- Instead of passing the data itself, can I pass a pointer, so that the autorun kernel receives the address and fetches the values directly?
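// Channels require the Intel channels extension.
#pragma OPENCL EXTENSION cl_intel_channels : enable

// Element type matches the uint data written by the producer.
channel uint chan __attribute__((depth(1024 * 10)));
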
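__kernel void producer(__global const uint *src)
{
    // Copy the input from global memory into a local array.
    __local uint src_local[1024];
    for (unsigned i = 0; i < 1024; i++)
        src_local[i] = src[i];

    // `iterations` is assumed to be a compile-time constant defined elsewhere.
    for (int x = 0; x < iterations; x++) {
        for (int i = 0; i < 1024; i++) {
            // Index i stays within the 1024-element array.
            write_channel_intel(chan, src_local[i]);
        }
    }
}
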
__attribute__((max_global_work_dim(0)))
__attribute__((autorun))
__kernel void consumer()
{
    __local uint dst_local[1024];

    // `iterations` is assumed to be a compile-time constant defined elsewhere.
    for (int x = 0; x < iterations; x++) {
        for (int i = 0; i < 1024; i++) {
            dst_local[i] = read_channel_intel(chan);
        }
    }
}
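
As a sanity check on the 8-9 ms figure, here is a minimal host-side C sketch of the time I would expect if the channel moved one element per clock cycle without stalling. The function name `ideal_transfer_ms` is hypothetical, and the clock frequency and iteration count are values the caller has to supply; none of them are taken from my design or from the compiler report.

```c
#include <stdio.h>

/*
 * Ideal transfer time in milliseconds, ASSUMING the channel accepts one
 * element per clock cycle and never stalls. This is a lower bound for a
 * single-element-wide channel, not a model of the real hardware.
 *
 * elements_per_iter: channel writes in the inner loop (1024 in my kernel)
 * iterations:        outer loop trip count (placeholder, supplied by caller)
 * clock_mhz:         kernel clock in MHz (placeholder, supplied by caller)
 */
static double ideal_transfer_ms(unsigned long elements_per_iter,
                                unsigned long iterations,
                                double clock_mhz)
{
    double cycles = (double)elements_per_iter * (double)iterations;
    double seconds = cycles / (clock_mhz * 1e6);
    return seconds * 1e3;
}
```

For example, with 1024 elements per iteration, a hypothetical 1000 iterations, and a hypothetical 200 MHz kernel clock, `ideal_transfer_ms(1024, 1000, 200.0)` gives about 5.12 ms. If the measured time is close to this estimate for the real values, the channel is already moving roughly one element per cycle; if it is much larger, something is stalling.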
Thanks in advance