Forum Discussion
Hi @Wei-Chih,
Thank you for your patience. After some investigation of the code snippet you mentioned, here are my recommendations:
- Unrolling nested loops has drawbacks rather than only optimizing them: it leads to longer compile times, which I think is a major cause here. I would therefore recommend converting the nested loops to a single loop if possible. (https://www.intel.com/content/www/us/en/develop/documentation/oneapi-fpga-optimization-guide/top/optimize-your-design/throughput-1/single-work-item-kernels/single-work-item-kernel-design-guidelines.html)
- There are also data dependencies (i.e. on local_forcefield) between the for loop and the do loop, which appear to run for a large number of iterations. This might be another cause.
- I would recommend simplifying the loop structure, as it looks complex, with multiple levels of nesting that are tightly coupled. Unrolling the loops in this situation would not help much and would instead bring drawbacks.
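
To illustrate the first recommendation, here is a minimal sketch in plain C++ of turning a two-level nested loop into a single loop. It is not your kernel: the function names (sum_nested, sum_flat, accumulate_step) and the row-major grid layout are assumptions made up for this example, and the loop body is only a stand-in for the real work. The idea is to iterate once over rows * cols and keep the original indices in counters, so the loop body stays the same.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-element work done inside the loops.
inline float accumulate_step(float acc, float value) {
    return acc + value * 0.5f;
}

// Original shape: two nested loops over a rows x cols grid (row-major).
float sum_nested(const std::vector<float>& grid, std::size_t rows, std::size_t cols) {
    assert(grid.size() == rows * cols);
    float acc = 0.0f;
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            acc = accumulate_step(acc, grid[i * cols + j]);
        }
    }
    return acc;
}

// Flattened shape: one loop with rows * cols iterations.
// The indices i and j are kept in counters rather than recomputed
// with division and modulo on every iteration.
float sum_flat(const std::vector<float>& grid, std::size_t rows, std::size_t cols) {
    assert(grid.size() == rows * cols);
    float acc = 0.0f;
    std::size_t i = 0;
    std::size_t j = 0;
    for (std::size_t n = 0; n < rows * cols; ++n) {
        acc = accumulate_step(acc, grid[i * cols + j]);
        // Advance the inner counter and wrap into the outer one.
        if (++j == cols) {
            j = 0;
            ++i;
        }
    }
    return acc;
}
```

Both versions visit the elements in the same order, so they should give the same result. Whether this helps in your design depends on how the real loop bodies and the dependency on local_forcefield are structured, so please treat it as a starting point only.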
Hope that clarifies things.
Best Wishes
BB