Forum Discussion
Altera_Forum
Honored Contributor
12 years ago
I suspect that num_share_resources found only a small candidate for logic sharing, so the change in resources was too minor to make much of a difference in the overall design. In general, any time you share hardware there is a small logic penalty to implement the sharing logic, so if that sharing logic has roughly the same footprint as the logic being shared, you can run into results like the ones you have seen.
Are you declaring a reqd_work_group_size or max_work_group_size attribute by any chance? If not, I would consider using one of them if possible, since you can typically save resources because the kernel hardware will be tailored to what you need. If possible I would use reqd_work_group_size, since that will result in the smallest and fastest hardware possible because the hardware only needs to handle a single work-group size. Some applications only know the amount of work at runtime, but in those cases you can often still use reqd_work_group_size and just pad the work up to a multiple of the work-group size and discard the unneeded results.
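To make the pad/discard idea concrete, here is a rough sketch of how it might look. Everything named here (WG_SIZE, the scale kernel, padded_global_size, run_padded) is made up for illustration and is not from the original question. The kernel is only shown in a comment, and the work-item loop is a plain-C stand-in for the launch, so treat it as an outline of the technique and not as tested FPGA code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed work-group size. The kernel would declare the same
 * value so the compiler can build hardware for exactly one size:
 *
 *   __attribute__((reqd_work_group_size(WG_SIZE, 1, 1)))
 *   __kernel void scale(__global float *data, uint n)
 *   {
 *       size_t i = get_global_id(0);
 *       if (i < n)              // padded work-items fall through
 *           data[i] *= 2.0f;
 *   }
 */
#define WG_SIZE 64

/* Round the real item count up to the next multiple of the work-group
 * size. This is the value the host would pass as the global work size. */
size_t padded_global_size(size_t n_items, size_t wg_size)
{
    assert(wg_size > 0);
    return ((n_items + wg_size - 1) / wg_size) * wg_size;
}

/* Plain-C stand-in for the guarded kernel body of a single work-item. */
void scale_work_item(float *data, size_t n_items, size_t global_id)
{
    if (global_id < n_items)    /* discard the padded work-items */
        data[global_id] *= 2.0f;
}

/* Emulates the launch: one call per work-item over the padded range. */
void run_padded(float *data, size_t n_items)
{
    size_t global = padded_global_size(n_items, WG_SIZE);
    for (size_t gid = 0; gid < global; ++gid)
        scale_work_item(data, n_items, gid);
}
```

On a real device the host would pass the padded size as the global work size and WG_SIZE as the local work size to clEnqueueNDRangeKernel, and pass the true item count as a kernel argument so the guard can skip the extra work-items.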