Error during fpga compilation on stratix10

Question

I was trying to compile my design for an FPGA. It worked in emulation mode, but during the FPGA hardware compilation with dpcpp in batch mode it showed this error:
```
/usr/sbin/kill-illegit-procs: line 6: /etc/kill-illegit-procs.cfg: No such file or directory
/usr/sbin/kill-illegit-procs: line 108: /dev/shm/kip-procs-1623879899.tmp: No such file or directory
/usr/sbin/kill-illegit-procs: line 108: /dev/shm/kip-procs-1623879899.tmp: No such file or directory
/usr/sbin/kill-illegit-procs: line 108: /dev/shm/kip-procs-1623879959.tmp: No such file or directory
/usr/sbin/kill-illegit-procs: line 108: /dev/shm/kip-procs-1623880019.tmp: No such file or directory
/usr/sbin/kill-illegit-procs: line 108: /dev/shm/kip-procs-1623880079.tmp: No such file or directory
```
To compile the code I used:
qsub -l nodes=1:stratix10:ppn=2 -d . batch.sh -l walltime=23:10:40

where batch.sh contains:
#!/bin/bash
echo "Running hardware compilation"
dpcpp -fintelfpga -c harris.cpp -o harris.o -DFPGA
dpcpp -fintelfpga harris.o -o harris.fpga -Xshardware
echo "Compilation finished "

boonbengt_altera · Answer

Hi @TanmayKhabia,

Thank you for posting in the Intel community forum. I hope all is well, and apologies for the delayed response.
Can you share which design example you tried to compile?
And for the error mentioned, which environment are you compiling in?
Hope to hear from you soon.

Best Wishes
BB

tanmaykhabia · Answer

Sorry for the late reply. 
The design I was trying to run is attached below.

#include <CL/sycl.hpp>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>

#define STB_IMAGE_WRITE_IMPLEMENTATION
#include "stb_image_write.h"
#define STB_IMAGE_IMPLEMENTATION
#include "stb_image.h"

#include <CL/sycl/INTEL/fpga_extensions.hpp>
#include <chrono>
#include <fstream>
#include <iostream>

// using namespace std;
using namespace sycl;

float luminance(uint8_t r, uint8_t g, uint8_t b)
{
    float r_lin = static_cast<float>(r) / 255;
    float g_lin = static_cast<float>(g) / 255;
    float b_lin = static_cast<float>(b) / 255;

    // Perceptual luminance (CIE 1931)
    return 0.2126f * r_lin + 0.7152 * g_lin + 0.0722 * b_lin;
}

class grayscale;
class dxxgenerate;
class sumgenerate;
class cornerd;
int main()
{
    // auto property_list = sycl::property_list{sycl::property::queue::enable_profiling()};
    int channels;
    int width;
    int height;
    float thresh = 100;
    uint8_t *image = stbi_load("./1.jpg", &width, &height, &channels, 3);
    auto start = std::chrono::high_resolution_clock::now();
    {
#if defined(FPGA_EMULATOR)
        INTEL::fpga_emulator_selector device_selector;
#elif defined(CPU_HOST)
        host_selector device_selector;
#else
        INTEL::fpga_selector device_selector;
#endif
        buffer<uint8_t, 1> image_buffer{image, width * height * channels};
        buffer<float, 1> greyscale_buffer{width * height};
        queue queue(device_selector);
        queue.submit([&greyscale_buffer, &image_buffer, width, height](handler &h)
        {
            // A discard_write is a write access that doesn't need to preserve existing
            // memory contents
            auto data = greyscale_buffer.get_access<access::mode::discard_write>(h);
            auto image_data = image_buffer.get_access<access::mode::read>(h);

            h.parallel_for<class grayscale>(range<1>(width * height), [image_data, data](id<1> idx)
            {
                int offset = 3 * idx[0];
                data[idx[0]] = luminance(image_data[offset], image_data[offset + 1], image_data[offset + 2]);
            });
        });

        buffer<float, 1> dx{width * height};
        buffer<float, 1> dy{width * height};
        buffer<float, 1> sxx{width * height};
        buffer<float, 1> syy{width * height};
        buffer<float, 1> sxy{width * height};
        uint8_t *out = reinterpret_cast<uint8_t *>(malloc_shared(width * height, queue));
        {
            buffer<float, 1> dy_tmp{width * height};

            queue.submit([&greyscale_buffer, &dy_tmp, width, height](handler &h)
            {
                //h.depends_on(gray);
                auto data = greyscale_buffer.get_access<access::mode::read>(h);
                auto out = dy_tmp.get_access<access::mode::discard_write>(h);

                // Create a scratch buffer for the intermediate computation
                h.parallel_for(range<2>(width, height), [data, width, out](id<2> idx)
                {
                    // Convolve horizontally
                    int offset = idx[1] * width + idx[0];
                    float left = idx[0] == 0 ? 0 : data[offset - 1];
                    float right = idx[0] == width - 1 ? 0 : data[offset + 1];
                    float center = data[offset];
                    out[offset] = left + 2 * center + right;
                });
            });

            queue.submit([&dy, &dy_tmp, width, height](handler &h)
            {
                auto data = dy_tmp.get_access<access::mode::read>(h);
                auto out = dy.get_access<access::mode::discard_write>(h);
                h.parallel_for(range<2>(width, height), [data, width, height, out](id<2> idx)
                {
                    // Convolve vertically
                    int offset = idx[1] * width + idx[0];
                    float up = idx[1] == 0 ? 0 : data[offset - width];
                    float down = idx[1] == height - 1 ? 0 : data[offset + width];
                    out[offset] = up - down;
                });
            });
        }
        {
            buffer<float, 1> dx_tmp{width * height};

            // Extract a 3x1 window around (x, y) and compute the dot product
            // between the window and the kernel [1, 0, -1]
            queue.submit([&greyscale_buffer, &dx_tmp, width, height](handler &h)
            {
                //h.depends_on(gray);
                auto data = greyscale_buffer.get_access<access::mode::read>(h);
                auto out = dx_tmp.get_access<access::mode::discard_write>(h);

                h.parallel_for(range<2>(width, height), [data, width, out](id<2> idx)
                {
                    int offset = idx[1] * width + idx[0];
                    float left = idx[0] == 0 ? 0 : data[offset - 1];
                    float right = idx[0] == width - 1 ? 0 : data[offset + 1];
                    out[offset] = left - right;
                });
            });

            // Extract a 1x3 window around (x, y) and compute the dot product
            // between the window and the kernel [1, 2, 1]
            queue.submit([&dx, &dx_tmp, width, height](handler &h)
            {
                auto data = dx_tmp.get_access<access::mode::read>(h);
                auto out = dx.get_access<access::mode::discard_write>(h);
                h.parallel_for(range<2>(width, height), [data, width, height, out](id<2> idx)
                {
                    // Convolve vertically
                    int offset = idx[1] * width + idx[0];
                    float up = idx[1] == 0 ? 0 : data[offset - width];
                    float down = idx[1] == height - 1 ? 0 : data[offset + width];
                    float center = data[offset];
                    out[offset] = up + 2 * center + down;
                });
            });
        }
        {
            buffer<float, 1> dxx{width * height};
            buffer<float, 1> dyy{width * height};
            buffer<float, 1> dxy{width * height};
            queue.submit([&dxx, &dxy, &dyy, &dx, &dy, width, height](handler &h)
            {
                auto ixx = dxx.get_access<access::mode::discard_write>(h);
                auto ixy = dxy.get_access<access::mode::discard_write>(h);
                auto iyy = dyy.get_access<access::mode::discard_write>(h);
                auto ix = dx.get_access<access::mode::read>(h);
                auto iy = dy.get_access<access::mode::read>(h);
                h.parallel_for<class dxxgenerate>(range<2>(width, height), [ixx, ixy, iyy, ix, iy, width](id<2> i)
                {
                    int offset = i[1] * width + i[0];
                    ixx[offset] = ix[offset] * ix[offset];
                    iyy[offset] = iy[offset] * iy[offset];
                    ixy[offset] = ix[offset] * iy[offset];
                });
            });

            queue.submit([&dxx, &dxy, &dyy, &sxx, &sxy, &syy, width, height](handler &h)
            {
                auto ixx = dxx.get_access<access::mode::read>(h);
                auto ixy = dxy.get_access<access::mode::read>(h);
                auto iyy = dyy.get_access<access::mode::read>(h);
                auto sixx = sxx.get_access<access::mode::write>(h);
                auto sixy = sxy.get_access<access::mode::write>(h);
                auto siyy = syy.get_access<access::mode::write>(h);
                // assuming the kernel is of size 3
                h.parallel_for<class sumgenerate>(range<1>(width * height), [ixx, ixy, iyy, sixx, sixy, siyy, width, height](id<1> ind)
                {
                    int i = ind[0];
                    sixx[i] = 0;
                    sixy[i] = 0;
                    siyy[i] = 0;

                    for (int k = -1; k < 2 && i / width + k < height; k++)
                    {
                        for (int j = -1; j < 2 && i % width + j < width; j++)
                        {
                            if (i % width + j < 0 || i / width + k < 0 || i + j + k * width < 0 || i + j + k * width >= width * height)
                            {
                                continue;
                            }
                            sixx[i] += ixx[i + j + k * width];
                            sixy[i] += ixy[i + j + k * width];
                            siyy[i] += iyy[i + j + k * width];
                        }
                    }
                });
            });
        }
        queue.submit([&sxx, &sxy, &syy, width, height, out, thresh](handler &h)
        {
            auto sixx = sxx.get_access<access::mode::read>(h);
            auto sixy = sxy.get_access<access::mode::read>(h);
            auto siyy = syy.get_access<access::mode::read>(h);

            h.parallel_for<class cornerd>(range<2>(width, height), [sixx, sixy, siyy, width, thresh, out](id<2> idx)
            {
                int offset = idx[1] * width + idx[0];
                out[offset] = sixx[offset] * siyy[offset] - sixy[offset] * sixy[offset] - 0.04 * (sixx[offset] + siyy[offset]) * (sixx[offset] + siyy[offset]) > thresh ? 255 : 0;
            });
        });
        queue.wait();
        std::cout << "Time taken " << std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::high_resolution_clock::now() - start).count() << "\n";
        stbi_write_png("./corners.png", width, height, 1, out, width);

        stbi_image_free(image);
        sycl::free(out, queue);
    }
}
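
A side note on the listing, not a diagnosis of the error: luminance() mixes one float literal (0.2126f) with two unsuffixed literals (0.7152 and 0.0722), and in C++ an unsuffixed floating literal has type double, so part of that expression is evaluated in double precision. The sketch below is a host-only, float-only variant. The name luminance_f32 is hypothetical, and whether double arithmetic matters for this design on the Stratix 10 target is an assumption that would need to be checked in the compiler report.

```cpp
#include <cstdint>
#include <type_traits>

// Float-only variant of the luminance() helper from the listing above.
// luminance_f32 is a hypothetical name used only for this sketch.
inline float luminance_f32(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    const float r_lin = static_cast<float>(r) / 255.0f;
    const float g_lin = static_cast<float>(g) / 255.0f;
    const float b_lin = static_cast<float>(b) / 255.0f;

    // Same CIE 1931 weights as the original, all with the f suffix so the
    // whole expression stays in single precision.
    return 0.2126f * r_lin + 0.7152f * g_lin + 0.0722f * b_lin;
}

// Compile-time illustration of the promotion rule the note refers to.
static_assert(std::is_same<decltype(0.7152 * 1.0f), double>::value,
              "an unsuffixed literal promotes the product to double");
static_assert(std::is_same<decltype(0.7152f * 1.0f), float>::value,
              "an f-suffixed literal keeps the product in float");
```

The function has no SYCL dependency, so it can be compiled and checked on the host before it is swapped into the kernel.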

I have tested this code in emulation mode for errors and it works fine, but the executable generated by the FPGA hardware compilation does not run. I compiled it in batch mode using the command mentioned earlier. The executable was run in the Stratix 10 - OneAPI, OpenVINO environment.
The error is given below:
terminate called after throwing an instance of 'cl::sycl::runtime_error'
what(): Native API failed. Native API returns: -50 (CL_INVALID_ARG_VALUE) -50 (CL_INVALID_ARG_VALUE)
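
For reading messages like the one above, here is a minimal host-only sketch that pulls the numeric status out of the what() text and maps it to an OpenCL status name. The helper names cl_status_name and parse_native_code are hypothetical, the table covers only a handful of codes (values as defined in the OpenCL headers), and the message format is assumed to match the one quoted in this post. It does not identify which kernel argument was rejected.

```cpp
#include <cstddef>
#include <exception>
#include <string>

// Map a few OpenCL status codes to their names.
// Deliberately partial: unknown codes fall through to "unknown".
inline const char *cl_status_name(int code)
{
    switch (code) {
        case 0:   return "CL_SUCCESS";
        case -4:  return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
        case -5:  return "CL_OUT_OF_RESOURCES";
        case -30: return "CL_INVALID_VALUE";
        case -49: return "CL_INVALID_ARG_INDEX";
        case -50: return "CL_INVALID_ARG_VALUE";
        case -51: return "CL_INVALID_ARG_SIZE";
        case -52: return "CL_INVALID_KERNEL_ARGS";
        default:  return "unknown";
    }
}

// Extract the integer that follows "Native API returns:" in an exception
// message. Returns `fallback` if the marker or a number is not found.
inline int parse_native_code(const std::string &what, int fallback = 0)
{
    const std::string marker = "Native API returns:";
    const std::size_t pos = what.find(marker);
    if (pos == std::string::npos) {
        return fallback;
    }
    try {
        // std::stoi skips leading whitespace and stops at the first
        // character that is not part of the number.
        return std::stoi(what.substr(pos + marker.size()));
    } catch (const std::exception &) {
        return fallback;
    }
}
```

In a real program these helpers could be called from a catch block around the queue operations, passing the exception's what() string.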

boonbengt_altera · Answer

Hi @TanmayKhabia,

Apologies for the delay. Based on the qsub command, I am assuming that you are using DevCloud for the compilation.
Submitting a batch job from the head node can sometimes cause compilation errors because of missing components or files, or corrupted nodes.
I would therefore suggest using devcloud_login, selecting the appropriate node, and compiling from there.
Hope that helps.

Best Wishes
BB

boonbengt_altera · Answer

Hi @TanmayKhabia,

Good day, just checking in to see if there are any further doubts regarding this matter. I hope we have clarified your doubts.

Best Wishes
BB

boonbengt_altera · Answer

Hi @TanmayKhabia,

Greetings. As we have not received any further clarification on what was provided, we will assume the issue is resolved, and we will no longer monitor this thread. For new queries, please feel free to open a new thread and we will be right with you. It was a pleasure having you here.

Best Wishes
BB
