Forum Discussion

Wei-Chih
3 years ago

How to obtain the values of get_local_id(0), get_group(0) and get_local_range(0) in single_task

Hi support team,

I am modifying code to run on FPGA hardware on the oneAPI DevCloud. The original sample code targets GPU/CPU and uses a parallel_for lambda function for the kernel, but for FPGA optimization reasons I think it should be changed to a single_task lambda function. However, I have no idea how to pass nd_item<1> to the h.single_task lambda function.

I need to use the nd_item<1> class, but it seems h.single_task cannot take a parameter. How can I modify the code? I need the values of get_local_id(0), get_group(0) and get_local_range(0).

Fragments of my code:

h.parallel_for<class bude_kernel>(nd_range<1>(global, wgSize), [=](nd_item<1> item) {
    const size_t lid = item.get_local_id(0);
    const size_t gid = item.get_group(0);
    const size_t lrange = item.get_local_range(0);

    float etot[NUM_TD_PER_THREAD];
    cl::sycl::float3 lpos[NUM_TD_PER_THREAD];
    cl::sycl::float4 transform[NUM_TD_PER_THREAD][3];

    size_t ix = gid * lrange * NUM_TD_PER_THREAD + lid;
    ix = ix < nposes ? ix : nposes - NUM_TD_PER_THREAD;

.

.

.

.

.
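
One possible approach, sketched under assumptions: single_task runs as a single work-item, so there is no nd_item to query. The three values can instead be recovered as the counters and bound of explicit loops over work-groups and local IDs. The sketch below is plain C++ with no SYCL headers so that it can be checked on the host. The value of NUM_TD_PER_THREAD and the helper names start_index and single_task_body are hypothetical, and it assumes global is a multiple of wgSize, as nd_range requires. The loop nest is the part that would move inside the h.single_task lambda, with the std::vector replaced by the kernel's real per-work-item body.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical value; the original fragment defines this elsewhere.
constexpr std::size_t NUM_TD_PER_THREAD = 4;

// Same index formula as the parallel_for fragment, with gid, lid and
// lrange passed in explicitly instead of being read from nd_item<1>.
inline std::size_t start_index(std::size_t gid, std::size_t lid,
                               std::size_t lrange, std::size_t nposes) {
    std::size_t ix = gid * lrange * NUM_TD_PER_THREAD + lid;
    return ix < nposes ? ix : nposes - NUM_TD_PER_THREAD;
}

// Host-side emulation of what a single_task body could look like.
// Each (gid, lid) pair corresponds to one work-item of the original
// nd_range<1>(global, wgSize) launch.
std::vector<std::size_t> single_task_body(std::size_t global,
                                          std::size_t wgSize,
                                          std::size_t nposes) {
    std::vector<std::size_t> starts;       // stand-in for the real kernel work
    const std::size_t lrange = wgSize;     // replaces item.get_local_range(0)
    const std::size_t num_groups = global / wgSize;  // assumes exact division

    for (std::size_t gid = 0; gid < num_groups; ++gid) {   // replaces item.get_group(0)
        for (std::size_t lid = 0; lid < lrange; ++lid) {   // replaces item.get_local_id(0)
            starts.push_back(start_index(gid, lid, lrange, nposes));
        }
    }
    return starts;
}
```

With global = 8 and wgSize = 4, the two loops visit the same eight (gid, lid) pairs that the nd_range launch would create, so the per-work-item code can stay unchanged apart from where the three values come from.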

14 Replies

  • Wei-Chih

    Thanks, I will try to modify the code.

    Besides unrolling, what else could I do to optimize this code? Any suggestions?

  • Hi @Wei-Chih,


    Since every application is coded to execute differently, it would be best to go through the optimization report or the profiler, which will give more accurate insight into what to change. You can also refer to the optimization guide.


    With no further clarification on this thread, it will be transitioned to community support for further help, and we will no longer monitor it.

    Thank you for the questions. As always, it is a pleasure having you here.


    Best Wishes

    BB