DevCloud: OpenCL kernels build after update but host runtime fails on fpga nodes with auto-discovery error
After the software update to the FPGA compile nodes described in https://software.intel.com/en-us/forums/intel-oneapi-base-toolkit/topic/843060, I am once again able to compile kernels, thanks!
However, the OpenCL host runtime fails to access the pac_a10 device on nodes with the "fpga" or "arria10" property.
UPDATE: This may be specific to s001-n084, as I am able to run clinfo successfully on s001-n088 and s001-n086. No property in pbsnodes distinguishes the broken nodes from the working ones, so a poor workaround may be to manually pick a free node and queue there only if it functions properly.
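As a rough sketch of that workaround, a job could run clinfo first and bail out early if the node shows the failure. The snippet below assumes Python 3 is available on the node; the function names and the 60-second timeout are my own placeholders, and the marker strings are copied from the failing output in this post. It only checks the node it runs on and does not touch the scheduler.

```python
import subprocess

# Substrings copied from the failing clinfo output on s001-n084.
FAILURE_MARKERS = (
    "FAILED to read auto-discovery string",
    "Failed to initialize kernel interface",
)


def clinfo_looks_healthy(output, returncode):
    """Return True if clinfo exited cleanly and printed no failure marker."""
    if returncode != 0:
        # An abort from the assertion shows up as a non-zero return code.
        return False
    return not any(marker in output for marker in FAILURE_MARKERS)


def check_this_node(timeout_s=60.0):
    """Run clinfo on the current node and classify the result.

    The timeout value is an arbitrary assumption, not a measured figure.
    """
    try:
        proc = subprocess.run(
            ["clinfo"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # the assert message goes to stderr
            universal_newlines=True,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        # clinfo missing or hung: treat the node as unusable.
        return False
    return clinfo_looks_healthy(proc.stdout, proc.returncode)
```

Calling check_this_node() at the start of a job and exiting when it returns False would at least avoid wasting a run on a broken node, though it does not stop the scheduler from placing the job there in the first place.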
I have tried both my own code and a plain clinfo, and the result is the same auto-discovery error. The clinfo output is below:
@s001-n084:~$ clinfo
Number of platforms 3
Platform Name Intel(R) FPGA Emulation Platform for OpenCL(TM)
Platform Vendor Intel(R) Corporation
Platform Version OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 19.2
Platform Profile EMBEDDED_PROFILE
Platform Extensions cl_khr_icd cl_khr_byte_addressable_store cl_intel_fpga_host_pipe cles_khr_int64 cl_khr_il_program
Platform Extensions function suffix IntelFPGA
Platform Name Intel(R) FPGA SDK for OpenCL(TM)
Platform Vendor Intel(R) Corporation
Platform Version OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 19.3api
Platform Profile EMBEDDED_PROFILE
Platform Extensions cl_khr_byte_addressable_store cles_khr_int64 cl_khr_icd
Platform Extensions function suffix IntelFPGA
Platform Name Intel(R) OpenCL
Platform Vendor Intel(R) Corporation
Platform Version OpenCL 2.1 LINUX
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_3d_image_writes cl_khr_il_program cl_intel_unified_shared_memory cl_intel_exec_by_local_thread cl_intel_vec_len_hint cl_intel_device_partition_by_names cl_khr_spir cl_khr_fp64 cl_khr_image2d_from_buffer
Platform Host timer resolution 1ns
Platform Extensions function suffix INTEL
Platform Name Intel(R) FPGA Emulation Platform for OpenCL(TM)
Number of devices 1
Device Name Intel(R) FPGA Emulation Device
Device Vendor Intel(R) Corporation
Device Vendor ID 0x1172
Device Version OpenCL 1.0
Driver Version 2019.8.10.0
Device OpenCL C Version OpenCL C 1.0
Device Type Accelerator
Device Profile EMBEDDED_PROFILE
Device Available Yes
Compiler Available Yes
Max compute units 24
Max clock frequency 3400MHz
Max work item dimensions 3
Max work item sizes 67108864x67108864x67108864
Max work group size 67108864
Preferred work group size multiple 128
Preferred / native vector sizes
char 1 / 32
short 1 / 16
int 1 / 8
long 1 / 4
half 0 / 0 (n/a)
float 1 / 8
double 1 / 4 (n/a)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (n/a)
Address bits 64, Little-Endian
Global memory size 202518421504 (188.6GiB)
Error Correction support No
Max memory allocation 50629605376 (47.15GiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Global Memory cache type Read/Write
Global Memory cache size 262144 (256KiB)
Global Memory cache line size 64 bytes
Image support No
Local memory type Global
Local memory size 262144 (256KiB)
Max number of constant args 480
Max constant buffer size 131072 (128KiB)
Max size of kernel argument 3840 (3.75KiB)
Queue properties
Out-of-order execution Yes
Profiling Yes
Profiling timer resolution 1ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels Yes
IL version SPIR-V_1.0
Device Extensions cl_khr_icd cl_khr_byte_addressable_store cl_intel_fpga_host_pipe cles_khr_int64 cl_khr_il_program
Platform Name Intel(R) FPGA SDK for OpenCL(TM)
Number of devices 1
FAILED to read auto-discovery string at byte 18446744073709551615. Full auto-discovery string value is
acl_hal_mmd.cpp:1426:assert failure: Failed to initialize kernel interface
clinfo: acl_hal_mmd.cpp:1426: int l_try_device(unsigned int, const char*, acl_system_def_t*, acl_mmd_dispatch_t*): Assertion `0' failed.
Aborted
aocl diagnose thinks the BSP is installed correctly (attached as follow-up comment due to post length)
My environment is essentially default, other than a bash function that overrides python with python2 (attached as follow-up comment due to post length)