Issues with the use of the flag "-fintelfpga"

New Contributor

5 years ago

Hi all and thanks in advance for any help that you could provide.

It seems that the problem with which I opened this thread is still unsolved, and we have just found a modification that seems to narrow down the problem.

We've been playing with the following code that is one of the examples available in the oneAPI training material:

using namespace sycl;
int main() 
{
  {
  range<1> r{SIZE};
  #ifdef FPGA_EMULATOR
  INTEL::fpga_emulator_selector device_selector;
  #else
  INTEL::fpga_selector device_selector;
  #endif
  queue q{device_selector};
  queue q_cpu{cpu_selector{}};
  buffer<int, 1> a_buf{r};
  buffer<int, 1> b_buf{r};
  buffer<int, 1> c_buf{r};
  // a ---- c --- d
  // b __/ 
  q.submit([&](handler& h) {
    accessor a(a_buf, h, write_only);
    h.parallel_for(r, [=](auto idx) {
      a[idx] = idx; }); 
  });
  q.submit([&](handler& h) {
    accessor b(b_buf, h, write_only);
    h.parallel_for(r, [=](auto idx) {
      b[idx] = -idx; }); 
  });
  q_cpu.submit([&](handler& h) { //fails with q_cpu, but not with q
    accessor a(a_buf, h, read_only);
    accessor b(b_buf, h, read_only);
    accessor c(c_buf, h, write_only);
    h.parallel_for(r, [=](auto idx) {
      c[idx] = a[idx] + b[idx]; }); 
  });
  q.submit([&](handler& h) {
    accessor c(c_buf, h, read_write);
    h.parallel_for(r, [=](auto idx) {
      c[idx] += 1; }); 
  }).wait();
  }

  std::cout << "DONE.\n";
  return 0;
}

As you can see in the comment of the 3rd kernel submission, submitting in the same code to the FPGA and the CPU at the same time and expecting the runtime to solve the data flow dependencies fails with the following message:

u32284@s001-n081:~/oneTBB/examples/SC20/lab$ dpcpp -fintelfpga vector-add-fpga.cpp -DFPGA_EMULATOR -o vadd.emu
u32284@s001-n081:~/oneTBB/examples/SC20/lab$ ./vadd.emu 
terminate called after throwing an instance of 'cl::sycl::runtime_error'
what(): Native API failed. Native API returns: -42 (CL_INVALID_BINARY) -42 (CL_INVALID_BINARY)
Aborted

If we change the second kernel so that we avoid submitting to the CPU device, the code does not return:

// change q_cpu.submit()... by this:
host_accessor a(a_buf, read_only);
host_accessor b(b_buf, read_only);
host_accessor c(c_buf, write_only);
for(int idx=0; idx<SIZE;idx++){
  c[idx] = a[idx] + b[idx]; 
}

And the only way to get it works, as far as we know, is by destroying the host_accessors:

{
  host_accessor a(a_buf, read_only);
  host_accessor b(b_buf, read_only);
  host_accessor c(c_buf, write_only);
  for(int idx=0; idx<SIZE;idx++){
    c[idx] = a[idx] + b[idx]; 
  }
}

u32284@s001-n081:~/oneTBB/examples/SC20/lab$ dpcpp -fintelfpga vector-add-fpga3.cpp -DFPGA_EMULATOR -o vadd.emu
u32284@s001-n081:~/oneTBB/examples/SC20/lab$ ./vadd.emu 
DONE.
u32284@s001-n081:~/oneTBB/examples/SC20/lab$

Has this been reported before? Are we doing something wrong or is the compiler/runtime that still needs some improvements?

Thanks once again.

Forum Discussion

Issues with the use of the flag "-fintelfpga"

Recent Discussions

Agilex 7 FPGA Starter Kit with oneAPI Toolkit flow not detected over PCIe

MCTP over PCIe VDM routing to PMCI in OFS N6000 FIM configuration and datapath clarification

HLS Compiler 24.1 error - aocl-clang.exe - dll entry point not found

Error faced while executing on Agilex FPGA board....

AI Suite System Throughput Issue