JButt5
7 years ago
Solved

Getting a lot of "HAL Kern Error: Read failed from addr x, read y expected z" errors – what to do?

Today I compiled an application for a Cyclone V SoC that works flawlessly on a GPU-based system. It contains three OpenCL kernels: two are single-work-item kernels and one is an NDRange kernel. Two command queues are used to launch them, sometimes in parallel. The kernels compile successfully after approximately 1.5 hours; however, my application doesn't produce usable results, and I see a lot of output like this:

Kernel launch requested when kernel not idle on accelerator 0
   kernel physical id = 0
HAL Kern Error: Read failed from addr 1200, read -1233682192 expected 4
Kernel launch requested when kernel not idle on accelerator 0
   kernel physical id = 0
HAL Kern Error: Read failed from addr 1200, read -1233682192 expected 4
CL_INVALID_WORK_GROUP_SIZE
HAL Kern Error: Write failed to addr 1200 with value 0, wrote -1233682192 expected 4
HAL Kern Error: Write failed to addr 1200 with value 0, wrote -1233682192 expected 4
HAL Kern Error: Write failed to addr 1200 with value 0, wrote -1233682192 expected 4
HAL Kern Error: Read failed from addr 1200, read -1233682192 expected 4
HAL Kern Error: Read failed from addr 1200, read -1233682192 expected 4
HAL Kern Error: Read failed from addr 1200, read -1233682192 expected 4
HAL Kern Error: Read failed from addr 1200, read -1233682192 expected 4
HAL Kern Error: Write failed to addr 1200 with value 0, wrote -1233682192 expected 4
HAL Kern Error: Read failed from addr 1200, read -1233682192 expected 4
HAL Kern Error: Read failed from addr 1200, read -1233682192 expected 4
HAL Kern Error: Write failed to addr 1200 with value 0, wrote -1233682192 expected 4
HAL Kern Error: Read failed from addr 1200, read -1233682192 expected 4
HAL Kern Error: Read failed from addr 1200, read -1233682192 expected 4
Kernel launch requested when kernel not idle on accelerator 0
   kernel physical id = 0
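
To make the repeated lines easier to compare, here is a minimal Python sketch that groups the HAL messages and shows the reported value as an unsigned 32-bit hex number. The function name `summarize_hal_errors` and the regular expression are my own and only match the line shapes pasted above. The address is kept as text because the log does not say whether `1200` is decimal or hex.

```python
import re
from collections import Counter

# Pattern written against the lines pasted above; the real runtime may emit
# other variants that this does not cover.
HAL_LINE = re.compile(
    r"HAL Kern Error: (Read|Write) failed (?:from|to) addr (\w+)"
    r".*?(?:read|wrote) (-?\d+) expected (\d+)"
)


def summarize_hal_errors(log_text):
    """Count identical HAL errors, keyed by (operation, addr, value as hex, expected)."""
    counts = Counter()
    for op, addr, got, expected in HAL_LINE.findall(log_text):
        # Reinterpret the signed 32-bit value as unsigned for readability.
        got_hex = hex(int(got) & 0xFFFFFFFF)
        counts[(op, addr, got_hex, int(expected))] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "HAL Kern Error: Read failed from addr 1200, read -1233682192 expected 4\n"
        "HAL Kern Error: Write failed to addr 1200 with value 0, "
        "wrote -1233682192 expected 4\n"
    )
    for key, n in summarize_hal_errors(sample).items():
        print(n, key)
```

For the value in my log, -1233682192 comes out as 0xb67780f0, which sits in the same 0xb6... range as the libalteracl.so addresses in the backtrace below. Whether that is meaningful is not something the log itself confirms.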

Note the CL_INVALID_WORK_GROUP_SIZE error in between. This one is a bit surprising, as I launched a completely unoptimized version with no required work-group size specified for the NDRange kernel. (I am recompiling with reqd_work_group_size right now, but as mentioned above, this may take some time to complete.)
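
For reference, here is a rough Python sketch of the conditions under which, as I understand the OpenCL 1.x specification, clEnqueueNDRangeKernel returns CL_INVALID_WORK_GROUP_SIZE. The helper `ndrange_problems` and its parameters are hypothetical and only model the rules; it does not call the OpenCL API, and `max_wg_size` is assumed to be whatever limit the device or kernel reports (for example via clGetKernelWorkGroupInfo).

```python
from math import prod


def ndrange_problems(global_size, local_size, max_wg_size, reqd_wg_size=None):
    """List reasons an NDRange launch could be rejected with
    CL_INVALID_WORK_GROUP_SIZE under OpenCL 1.x rules (a simplified model)."""
    problems = []

    if local_size is None:
        # Letting the runtime pick the local size is not allowed when the
        # kernel declares reqd_work_group_size.
        if reqd_wg_size is not None:
            problems.append("local size omitted but kernel declares reqd_work_group_size")
        return problems

    if len(local_size) != len(global_size):
        raise ValueError("global and local sizes must have the same dimensionality")
    if any(s <= 0 for s in tuple(global_size) + tuple(local_size)):
        raise ValueError("sizes must be positive")

    # OpenCL 1.x requires each global dimension to be a multiple of the local one.
    for dim, (g, l) in enumerate(zip(global_size, local_size)):
        if g % l != 0:
            problems.append(f"dim {dim}: global size {g} not divisible by local size {l}")

    # The work-group as a whole must fit the reported limit.
    total = prod(local_size)
    if total > max_wg_size:
        problems.append(f"work-group has {total} work-items, limit is {max_wg_size}")

    # A declared reqd_work_group_size must be matched exactly.
    if reqd_wg_size is not None and tuple(local_size) != tuple(reqd_wg_size):
        problems.append("local size does not match reqd_work_group_size")

    return problems


if __name__ == "__main__":
    # Hypothetical numbers, not taken from my application.
    for reason in ndrange_problems((1000,), (512,), max_wg_size=256):
        print(reason)
```

Running the host-side sizes through a check like this would at least tell me whether the launch parameters themselves are at fault, or whether the error is a side effect of the failing HAL reads and writes.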

When quitting the application, I also get a segmentation fault. The remote GDB session reports:

Thread 6 "NameOfMyApplication" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1021.1034]
0xb66d2648 in acl_kernel_if_read () from target:/root/opencl_arm32_rte/host/arm32/lib/libalteracl.so
 
Stacktrace:
acl_kernel_if_read 0x00000000b66d2648
talk_to_hal 0x00000000b66d4b3c
device_handler_thread_main 0x00000000b66d4d2a
start_thread 0x00000000b5d1e3b4
<unknown> 0x00000000b5c39e18

I have no idea what is happening. Because the SDK is closed source, it's difficult to understand what goes on under the hood, and these error messages are not really helpful to me. Can you shed some light on what might be happening here?

I'm on the 18.1 SDK.

14 Replies