Re: Use oneMKL with FPGA

Hey, I am currently working on running a conjugate gradient (CG) solver on an FPGA. I used the reference you posted as a template and succeeded in implementing the CG. Unfortunately, I am facing some challenges with the performance. After further investigation, I found that the vector addition runs much slower on the FPGA than on the GPU or CPU. While the matrix multiplication and the dot product were relatively okay, the overall performance is quite disappointing. As I am not very experienced in optimizing code, I considered using oneMKL for the CG implementation to see if I can improve the performance. However, when using oneMKL, I get the 'device not supported' error I already posted.

Re: Use oneMKL with FPGA

Hey, sorry for my late response. For the emulation I get the following message:

```
oneapi::mkl::oneapi::mkl::blas::dgemv: unsupported device: Intel(R) FPGA Emulation Device
```

A simple example for the Arria 10 FPGA does compile, but fails at runtime with a similar error:

```
terminate called after throwing an instance of 'oneapi::mkl::unsupported_device'
  what():  oneapi::mkl::oneapi::mkl::blas::dgemv: unsupported device: pac_a10 : Intel PAC Platform (pac_ee00000)
/var/spool/torque/mom_priv/jobs/2254885.v-qsvr-1.aidevcloud.SC: line 16:  7687 Aborted                 ./main
```

Here is a simple example:

```
#include <cstdlib>
#include <iostream>
#include <vector>

#include <sycl/sycl.hpp>
#include <sycl/ext/intel/fpga_extensions.hpp>
#include "oneapi/mkl.hpp"

int main(void) {
    size_t N = 100;

#if FPGA_EMULATOR
    // Intel extension: FPGA emulator selector on systems without FPGA card.
    ::sycl::ext::intel::fpga_emulator_selector d_selector;
#elif FPGA
    // Intel extension: FPGA selector on systems with FPGA card.
    ::sycl::ext::intel::fpga_selector d_selector;
#else
    // The default device selector will select the most performant device.
    auto d_selector{::sycl::default_selector_v};
#endif

    ::sycl::queue q(d_selector);

    std::vector<double> A(N * N);
    std::vector<double> x(N);
    std::vector<double> y(N);

    // Fill A and x with random values in [-1, 1]; y starts at zero.
    for (auto it = A.begin(); it != A.end(); it++)
        *it = (((double) std::rand() / (double) RAND_MAX) - 0.5) * 2.0;
    for (auto it = x.begin(); it != x.end(); it++)
        *it = (((double) std::rand() / (double) RAND_MAX) - 0.5) * 2.0;
    for (auto it = y.begin(); it != y.end(); it++)
        *it = 0;

    // Print y before the call.
    for (auto it = y.begin(); it != y.end(); it++)
        std::cout << *it;
    std::cout << std::endl;

    {
        ::sycl::buffer<double> b_A(A.data(), N * N);
        ::sycl::buffer<double> b_x(x.data(), N);
        ::sycl::buffer<double> b_y(y.data(), N);

        // Scalars for y = alpha * A^T * x + beta * y
        double alpha = 1.0;
        double beta = 0.0;

        oneapi::mkl::blas::column_major::gemv(q, oneapi::mkl::transpose::trans,
                                              N, N, alpha, b_A, N, b_x, 1, beta, b_y, 1);
        q.wait_and_throw();
    }   // The buffers go out of scope here, so the result is copied back into y.

    // Print y after the call.
    for (auto it = y.begin(); it != y.end(); it++)
        std::cout << *it;
    std::cout << std::endl;
}
```

The CMake file is attached as well.

Use oneMKL with FPGA

Hello everyone,

I'm currently working on optimizing my code using the BLAS Level 1 and Level 2 routines of the Intel Math Kernel Library (MKL) on an FPGA in the DevCloud environment. However, I'm running into an issue: the compilation takes longer than the maximum walltime of 24 hours that I'm able to set. Has anyone encountered a similar issue and found a solution? I would appreciate any suggestions or advice on how to resolve this.

By the way, I also tried the FPGA Emulator. The code compiles, but at runtime I get an error message stating that the MKL library is not implemented for the FPGA Emulator.

Thank you in advance for your help.
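
A side note on the `gemv` call from the example in this thread: since the call throws on the FPGA devices, a plain C++ reference of the same operation can be handy for checking results from another device or from a hand-written kernel. The sketch below is only an illustration; the helper name `reference_gemv_trans` is made up here and is not part of oneMKL. It assumes column-major storage and the transposed case, as in the example, so it computes y = alpha * A^T * x + beta * y.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of oneMKL): plain C++ reference for
//   y = alpha * A^T * x + beta * y
// with A stored column-major, m rows, n columns, leading dimension lda.
// A^T * x has n entries, so x must hold m values and y must hold n values.
void reference_gemv_trans(std::size_t m, std::size_t n, double alpha,
                          const std::vector<double>& A, std::size_t lda,
                          const std::vector<double>& x, double beta,
                          std::vector<double>& y) {
    assert(lda >= m && A.size() >= lda * n);
    assert(x.size() >= m && y.size() >= n);
    for (std::size_t j = 0; j < n; ++j) {
        double sum = 0.0;
        // Column j of A is contiguous in column-major storage.
        for (std::size_t i = 0; i < m; ++i) {
            sum += A[i + j * lda] * x[i];
        }
        // Usual BLAS convention: when beta is zero, y is not read.
        y[j] = (beta == 0.0) ? alpha * sum : alpha * sum + beta * y[j];
    }
}
```

With the variables from the example, this would be called as `reference_gemv_trans(N, N, alpha, A, N, x, beta, y_ref)` on a separate `std::vector<double> y_ref(N)`, and the entries compared against `y` with a small tolerance, since the summation order on a device may differ.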