What is the way to lead the OpenCL SDK compiler to reduce kernel logic utilization?

Honored Contributor

12 years ago

I suspect num_share_resources did find a candidate for logic sharing, but one so small that the saving makes little difference to the overall design. Any time you share hardware there is a small logic penalty, because the sharing logic itself has to be implemented. If that sharing logic has about the same footprint as the logic being shared, you can run into results like the ones you have seen.

Are you declaring a reqd_work_group_size or max_work_group_size attribute by any chance? If not, I would consider using one of them if possible, since you can typically save resources that way: the kernel hardware will be tailored to what you need. If possible I would use reqd_work_group_size, since that should give the smallest and fastest hardware because it only needs to handle a single work-group size. Some applications only know the amount of work at runtime, but in those cases you can often still use reqd_work_group_size and just pad the NDRange and discard the unneeded results.
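To make the pad/discard idea concrete, here is a minimal sketch rather than anything taken from your design. The kernel name `scale`, its arguments, and the work-group size of 64 are all made up for illustration. The kernel is shown as a source string so the host-side rounding helper can sit next to it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical work-group size; it must match the value in the kernel attribute. */
#define WG_SIZE 64

/* Illustrative OpenCL C kernel source (name, arguments and body are assumptions).
 * reqd_work_group_size declares the one work-group size the hardware must support.
 * The "gid < n" guard discards the work-items that exist only as padding. */
const char *const kernel_src =
    "__kernel __attribute__((reqd_work_group_size(64, 1, 1)))\n"
    "void scale(__global const float *in, __global float *out, uint n)\n"
    "{\n"
    "    size_t gid = get_global_id(0);\n"
    "    if (gid < n)\n"
    "        out[gid] = 2.0f * in[gid];\n"
    "}\n";

/* Round the runtime problem size n up to the next multiple of wg_size,
 * so the global size is always a whole number of work-groups. */
size_t padded_global_size(size_t n, size_t wg_size)
{
    assert(wg_size > 0);
    return ((n + wg_size - 1) / wg_size) * wg_size;
}
```

On the host you would pass `padded_global_size(n, WG_SIZE)` as the global work size and `WG_SIZE` as the local work size to `clEnqueueNDRangeKernel`, and pass the real `n` as a kernel argument. Because the guard stops the padded work-items from touching memory, the buffers themselves do not need to be padded in this sketch.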
