Forum Discussion
Altera_Forum
Honored Contributor
12 years ago

--- Quote Start ---
The restriction is that one .aocx file can be paired up with a cl_program. So if you want to have multiple kernel files that get swapped in and out, just make sure you have multiple cl_program objects in your host code. You will still need to compile each .cl file individually using aoc.exe, since there is no way to pass in multiple kernel files (you often want to compile them with different flags anyway).
--- Quote End ---

This is what I had thought was the case; I was just checking for any update. I was hoping to avoid multiple copies of the helper device routines. Say I have a kernel foo.cl that uses a function in bar.cl (where bar.cl is common to many other kernels); I would like to be able to go:

aoc -c foo.cl bar.cl [options]
aoc foo.aoco bar.aoco

This way I could supply the compiled modules with various compile options and let the user decide which cl_programs to create for the specific use case.

--- Quote Start ---
One word of caution: each time the OpenCL runtime needs to swap out hardware, it will copy any live buffers in the FPGA up to the host and restore them after the kernel hardware (.aocx file) has been configured into the FPGA. So if your host leaves a bunch of unused buffers in the FPGA instead of freeing them, you'll be copying data back and forth when switching between cl_program objects. There is also an overhead for configuring the hardware itself, so when you are determining which kernels go into each .cl file, think ahead about this overhead and how your host will be running the kernels, and try to group kernels into the same .cl file to minimize it whenever possible.
--- Quote End ---

Yes, this is duly noted, but it is something I cannot design the "best" option for, as it is user-application specific... Basically I am writing device code and leaving most host code (not all; there are some frequent usages) to the user.
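For reference, a minimal sketch of the host-side pattern the quote describes: one cl_program per .aocx, created with clCreateProgramWithBinary. The file names and the helper below are illustrative, not part of any SDK; only the binary-loading part is shown as runnable code, with the OpenCL calls indicated in comments since they require a device context.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one compiled kernel image (e.g. foo.aocx) into memory.
 * The buffer and its size are what clCreateProgramWithBinary()
 * takes to build a separate cl_program for each .aocx file.
 * Caller frees the returned buffer. */
static unsigned char *load_binary(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;
    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    fseek(f, 0, SEEK_SET);
    if (len < 0) {
        fclose(f);
        return NULL;
    }
    unsigned char *buf = malloc((size_t)len);
    if (buf && fread(buf, 1, (size_t)len, f) != (size_t)len) {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    if (buf)
        *size_out = (size_t)len;
    return buf;
}

/* Host-side usage sketch (context/device setup omitted, names hypothetical):
 *
 *   size_t size;
 *   unsigned char *bin = load_binary("foo.aocx", &size);
 *   cl_program prog_foo = clCreateProgramWithBinary(context, 1, &device,
 *       &size, (const unsigned char **)&bin, NULL, &err);
 *   clBuildProgram(prog_foo, 1, &device, "", NULL, NULL);
 *
 *   ...repeat for bar.aocx to get prog_bar. Launching kernels from
 *   prog_bar after prog_foo triggers the FPGA reconfiguration (and
 *   buffer save/restore) described in the quote above. */
```

The split matters because each clCreateProgramWithBinary call maps to one .aocx, so grouping kernels per .cl file directly controls how often reconfiguration happens.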
I should say more about the use case. I am part of the administration team for an HPC system at QUT (Australia). We naturally like the thought of low-power accelerators, e.g. FPGAs (but we also play with GPUs and Intel Xeon Phis and, of course, lots of CPUs), and we've been looking for ways to make FPGAs accessible to researchers; obviously HDL is not going to cut it with the masses. We've played with Mitrion-C, Impulse-C, SystemC, DIME-C, and more recently Xilinx HLS, but all of these are still far too "hardware" centric for most researchers, who simply want to run simulations faster (with a lower power footprint). By providing a set of commonly used functions I am hoping to help with the wider take-up of reconfigurable HPC at our university... So you see why the best option for a cl_program is difficult to predict in general (fine in some common cases, though).