Forum Discussion
Hi @RN1,
Thank you for posting in the Intel community forum; I hope all is well.
For the first part of the example, my guess is that the node in use (i.e. s001-n057) does not have the required hardware. You can check the node specs via pbsnodes to get more information on the node.
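As a rough sketch of that check, something like the following could be run from a login node. It assumes a PBS/Torque-style scheduler where pbsnodes prints a "properties" line per node; the exact output format on your cluster may differ, and the NODE variable is only an illustrative name.

```shell
# Node name taken from the post above; replace with the node you were assigned.
NODE="s001-n057"

if command -v pbsnodes >/dev/null 2>&1; then
    # Print the attributes (state, properties, etc.) of the single node.
    pbsnodes "$NODE"
    # Summarize which property sets exist across all nodes (assumes a "properties" line).
    pbsnodes -a | grep "properties" | sort | uniq -c
else
    echo "pbsnodes not found; run this on a PBS/Torque login node"
fi
```

If the properties listed for the node do not mention the FPGA hardware you are targeting, requesting a node with the matching property is probably the next step.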
I would recommend using the nodes with S10 oneapi; however, there are currently some errors on those nodes, and the issue has been escalated to the responsible team.
On the second part, there are optimization guides for programming with oneAPI, as well as an existing math library, listed below. They are a good place to start while we look into the code you mentioned:
Hope that clarifies.
Best Wishes
BB
- RN1 3 years ago
New Contributor
Thanks. Regarding the node, we managed to compile on the S10 oneapi node as you suggested, but the performance is really low compared with OpenCL.
Thanks, but we had already checked those links, and they say nothing about local memory optimizations. You have the code in the previous post, and here are the reports we have extracted. Maybe you know what to do to increase the performance, since it is still considerably slower than the OpenCL version.
I attach the captures and reports as a file.
It is compiled like this:
dpcpp -fintelfpga -Xshardware -fsycl-link=early -Xsfp-relaxed=true -Xsno-interleaving=default -Xsno-interleaving=DDR -Xsno-accessor-aliasing block_matrix_mul_dpcpp.cpp
Thanks for your time.