OpenCL error on code compilation

Occasional Contributor

7 years ago

Thank you for the information and the calculation on global memory access. With this I can estimate the communication latency, taking into account the PCIe interface between host and device.

I will try to force the maximum frequency with the -fmax option and see whether performance improves. I asked how to improve performance because my kernel takes 600 us (according to the profiler) and I would like to get the processing time down to hundreds of nanoseconds, if possible.

Regarding the different output, I followed your suggestion to use the ivdep pragma, but the area usage was still too high. So I tried removing the dependencies instead, and I got down to 50% area just by optimizing the loops, without using the ivdep pragma. The global memory accesses are only for reading and writing (please see the code below).

Is there a race condition in the global memory access anyway? If this is a compiler bug, could you please tell me how I can fix it?

Reading:
#pragma unroll
for (ushort i = 0; i < 512; i++)
    data[i] = x[i];

Writing:
#pragma unroll
for (ushort i = 0; i < 512; i++)
    y[i] = data[i];
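
For reference, here is a minimal plain-C sketch of the same read/write pattern, which may help when checking the question about a race condition on the host side. It is not the actual kernel. The function name copy_block, the constant N, the ushort_t typedef, and the element type of data are assumptions made for illustration. The restrict qualifiers state the further assumption that x and y are separate, non-overlapping buffers.

```c
#define N 512  /* block size taken from the loops in the post */

typedef unsigned short ushort_t;  /* assumed stand-in for OpenCL's ushort */

/* Hypothetical stand-in for the kernel body. x and y play the role of the
 * global input and output buffers. `restrict` encodes the assumption that
 * the two buffers do not overlap. */
void copy_block(const ushort_t *restrict x, ushort_t *restrict y)
{
    ushort_t data[N];  /* local copy, like `data` in the post (type assumed) */

    /* reading: every element of x is loaded exactly once */
    for (int i = 0; i < N; i++)
        data[i] = x[i];

    /* any processing on data[] would go here */

    /* writing: every element of y is stored exactly once */
    for (int i = 0; i < N; i++)
        y[i] = data[i];
}
```

Under these assumptions each index is read once and written once, and no iteration depends on another, so the two loops alone carry no dependency between global memory accesses. If x and y could overlap in the real design, that reasoning no longer holds.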

Thank you.
