Altera_Forum
10 years ago

cl-fast-relaxed-math and profiling tools

Hi,

I have two questions.

First:

The OpenCL standard provides the -cl-fast-relaxed-math build option, which can speed up floating-point math at the cost of some accuracy.

I tested my OpenCL code with this flag on Intel, NVIDIA and AMD platforms.

On those platforms it gave a speedup of about 1x.

However, when I add -cl-fast-relaxed-math while compiling the OpenCL kernel code with the AOCL compiler, I do not see any performance gain. Does AOCL not support this flag yet?

Second:

I wrote an OpenCL program that calls the EnqueueNDRange API (clEnqueueNDRangeKernel) many times, enqueuing the kernel repeatedly in a for loop. The host only issues these API calls and reads/writes buffers. However, between the host issuing the EnqueueNDRange and read/write buffer calls and the FPGA receiving the signal to execute the kernel code, about 10~100 ms is lost as overhead. There is no profiling tool that lets me see in detail where this time goes. Could anyone help with this problem?
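
To make the overhead question more concrete, here is a minimal sketch of how the time for one enqueue could be split up. It assumes the command queue was created with CL_QUEUE_PROFILING_ENABLE and that the four timestamps were read with clGetEventProfilingInfo (CL_PROFILING_COMMAND_QUEUED, CL_PROFILING_COMMAND_SUBMIT, CL_PROFILING_COMMAND_START, CL_PROFILING_COMMAND_END). Whether the AOCL 14.1 runtime reports all four values reliably is an assumption I have not verified. The struct and function names are made up for illustration, and the sketch only does the arithmetic on the timestamps.

```c
#include <stdint.h>
#include <stdio.h>

/* Timestamps in nanoseconds for one enqueued command.
 * Assumption: these were filled in from clGetEventProfilingInfo
 * (QUEUED, SUBMIT, START, END); this sketch does not call OpenCL. */
typedef struct {
    uint64_t queued;
    uint64_t submit;
    uint64_t start;
    uint64_t end;
} event_times_ns;

/* Hypothetical result type: the three phases of one launch, in ms. */
typedef struct {
    double queue_to_submit_ms;  /* time spent waiting in the host queue */
    double submit_to_start_ms;  /* time from submission until the kernel starts */
    double execution_ms;        /* time the kernel itself ran */
} launch_breakdown_ms;

/* Convert the difference of two nanosecond timestamps to milliseconds. */
static double elapsed_ms(uint64_t earlier, uint64_t later)
{
    return (double)(later - earlier) / 1.0e6;
}

/* Split one command's lifetime into queue wait, launch latency and run time. */
launch_breakdown_ms split_launch_time(event_times_ns t)
{
    launch_breakdown_ms b;
    b.queue_to_submit_ms = elapsed_ms(t.queued, t.submit);
    b.submit_to_start_ms = elapsed_ms(t.submit, t.start);
    b.execution_ms       = elapsed_ms(t.start, t.end);
    return b;
}

/* Print one line per loop iteration so slow launches stand out. */
void print_breakdown(int iteration, launch_breakdown_ms b)
{
    printf("iter %d: queued->submit %.3f ms, submit->start %.3f ms, run %.3f ms\n",
           iteration, b.queue_to_submit_ms, b.submit_to_start_ms, b.execution_ms);
}
```

If most of the 10~100 ms shows up between submit and start, the delay would be in the launch path rather than in the kernel itself. If it shows up in none of the three phases, it is more likely in the host-side calls or the buffer transfers.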

SDK: 14.1

Platform: DE5

Thanks