Optimization for complex algorithms

Honored Contributor

8 years ago

The multi-kernel approach could certainly help, as long as you can logically split your computation into separate sequential sections. I have not tried this myself yet, but you can put the different kernels in separate files and compile them individually. Then, in the host code, you load the first kernel image, run it, and leave its output in device memory; next you load the second kernel image, feed it the first kernel's output as input, and generate the final output. You might need some extra buffer management in this case; e.g. you might need to free the input buffer of the first kernel once it has finished executing, to make room for the output buffer of the second kernel.
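To make the buffer ordering concrete, here is a rough sketch of the sequence described above. It is a toy model in plain Python, not OpenCL host code: `Device`, `stage1`, `stage2` and `run_pipeline` are made-up names, the two "kernels" are trivial placeholders, and the model simply assumes that buffers stay in device memory when a new kernel image is loaded, which I have not verified on real hardware.

```python
class Device:
    """Hypothetical stand-in for the board: tracks buffers and the loaded image."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.buffers = {}   # buffer name -> bytearray in "device memory"
        self.image = None   # currently loaded kernel image (a Python function here)

    def used(self):
        return sum(len(b) for b in self.buffers.values())

    def alloc(self, name, nbytes):
        if self.used() + nbytes > self.capacity:
            raise MemoryError(f"no room for buffer '{name}'")
        self.buffers[name] = bytearray(nbytes)

    def free(self, name):
        del self.buffers[name]

    def write(self, name, data):
        self.buffers[name][:] = data

    def read(self, name):
        return bytes(self.buffers[name])

    def load_image(self, kernel):
        # Assumption: loading a new image leaves existing buffers untouched.
        self.image = kernel

    def run(self, src, dst):
        self.image(self.buffers[src], self.buffers[dst])


def stage1(src, dst):
    """Placeholder for the first kernel: add 1 to every byte."""
    for i, v in enumerate(src):
        dst[i] = (v + 1) % 256


def stage2(src, dst):
    """Placeholder for the second kernel: double every byte."""
    for i, v in enumerate(src):
        dst[i] = (v * 2) % 256


def run_pipeline(dev, data, free_input_early=True):
    n = len(data)
    dev.alloc("in", n)
    dev.write("in", data)
    dev.alloc("mid", n)           # output of kernel 1, input of kernel 2

    dev.load_image(stage1)
    dev.run("in", "mid")          # intermediate result stays on the device

    if free_input_early:
        dev.free("in")            # make room before allocating the final output

    dev.alloc("out", n)           # raises MemoryError if "in" is still resident
    dev.load_image(stage2)
    dev.run("mid", "out")

    result = dev.read("out")
    for name in list(dev.buffers):
        dev.free(name)
    return result
```

With a device that only has room for two buffers, e.g. `run_pipeline(Device(8), bytes([1, 2, 3, 4]))`, the pipeline completes and returns `bytes([4, 6, 8, 10])`. Passing `free_input_early=False` on the same device raises `MemoryError` when the output buffer is allocated, which is the situation the extra buffer management is meant to avoid.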
