Forum Discussion

TSchw11
New Contributor
5 years ago

Getting "HAL Kern Error: Read/Write failed from addr x, read y expected z" when running multithread OpenCL application on CycloneV SoC

To make sure that I hadn't simply written an incorrect program, I used the standard multithreaded example: https://www.intel.com/content/www/us/en/programmable/support/support-resources/design-examples/design-software/opencl/multithreaded-vector-operation.html

As there is no arm32 version of that example, I checked the difference between exm_opencl_hello_world_arm32_linux and exm_opencl_hello_world_x64_linux from https://www.intel.com/content/www/us/en/programmable/support/support-resources/design-examples/design-software/opencl/hello-world.html.

As the only difference was the Makefile, I applied the same changes to the Makefile of exm_opencl_multithread_vector_operation_x64_linux.

diff exm_opencl_hello_world_x64_linux/hello_world/Makefile exm_opencl_hello_world_arm32_linux/hello_world/Makefile 
43,44c43,44
< AOCL_COMPILE_CONFIG := $(shell aocl compile-config )
< AOCL_LINK_CONFIG := $(shell aocl link-config )
---
> AOCL_COMPILE_CONFIG := $(shell aocl compile-config --arm)
> AOCL_LINK_CONFIG := $(shell aocl link-config --arm)
53,54c53,54
< # Compiler
< CXX := g++
---
> # Compiler. ARM cross-compiler.
> CXX := arm-linux-gnueabihf-g++

Using the changed Makefile, I compiled the multithread example:

cd exm_opencl_multithread_vector_operation_x64_linux/multithread_vector_operation/
PATH="/opt/intelFPGA/19.1/embedded/ds-5/sw/gcc/bin/:$INTELFPGAOCLSDKROOT/bin:$PATH" \
	AOCL_BOARD_PACKAGE_ROOT="$INTELFPGAOCLSDKROOT/board/de10_standard" \
	VERBOSE=1 make
PATH="/opt/intelFPGA/19.1/embedded/ds-5/sw/gcc/bin/:$INTELFPGAOCLSDKROOT/bin:$PATH" \
	AOCL_BOARD_PACKAGE_ROOT="$INTELFPGAOCLSDKROOT/board/de10_standard" \
	aoc -board=de10_standard_sharedonly -v device/vector_op.cl -o bin/vector_op.aocx

Running the program sometimes succeeds, but often results in a crash:

root@socfpga:~# ./host
Initializing OpenCL
Platform: Intel(R) FPGA SDK for OpenCL(TM)
Using 1 device(s)
  de10_standard_sharedonly : Cyclone V SoC Development Kit
Using AOCX: vector_op.aocx
Reprogramming device [0] with handle 1
Thread1 created successfully
Thread2 created successfully
Instantiating a new problem with args: N=100000 kernel_name=vector_mult 
Instantiating a new problem with args: N=100000 kernel_name=vector_add 
Launching for device 0 (100000 elements)
Launching for device 0 (100000 elements)
HAL Kern Error: Read failed from addr 1080, read -1234493200 expected 4
HAL Kern Error: Read failed from addr 20, read -1234493200 expected 4
 
Time: 20.714 ms
Kernel time (device 0): 1.098 ms
HAL Kern Error: Write failed to addr 1080 with value 0, wrote -1234493200 expected 4
HAL Kern Error: Read failed from addr 20, read -1234493200 expected 4
Segmentation fault

Is multithreading not supported on arm32?

7 Replies

  • AnilErinch_A_Intel
    Frequent Contributor

    Hi,

    Could you please post the output of the "lsmod" command from the device terminal after the issue has occurred?

  • TSchw11
    New Contributor

    Hi,

    Thank you for your response.

    The output of lsmod seems to always be:

    root@socfpga:~# lsmod
    Module                  Size  Used by
    aclsoc_drv              8916  0 
    root@socfpga:~#

    While reproducing the error, I observed one additional erroneous behavior:

    Sometimes the application doesn't finish and hangs instead. I've attached the output of several program executions. When the program didn't finish, I aborted it with Ctrl-C; the application still reacted to that. In the attached log, these cases can be identified by the "^C" at the end of the program output.

  • AnilErinch_A_Intel
    Frequent Contributor

    Hi, are you using a custom-made board to run OpenCL?

    If so, please call the aocl_mmd_get_info API in your code and let us know the results.

    In particular, we would be interested in the results for the following:

    AOCL_MMD_BOARD_NAME

    AOCL_MMD_CONCURRENT_READS: number of parallel reads (a value of 1 indicates serial reads).

    AOCL_MMD_CONCURRENT_WRITES: number of parallel writes (a value of 1 indicates serial writes).

    AOCL_MMD_CONCURRENT_READS_OR_WRITES: total number of concurrent read and write operations.

    Please refer to the documentation here:

    https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/hb/opencl-sdk/ug_aocl_custom_platform_toolkit.pdf
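The concurrency values listed above matter because the two host threads in the multithreaded example issue device operations at the same time. As a rough illustration only (not taken from the Intel SDK or from this thread), the following C++ sketch shows one way a host program could serialize device access with a mutex if the MMD reports a value of 1. `FakeDevice`, `SerializedDevice`, and `run_two_threads` are hypothetical names; a real host would wrap its own OpenCL enqueue and read calls instead. Whether this avoids the HAL errors reported here is untested.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a device interface that tolerates only one
// in-flight operation at a time. It is NOT part of the Intel SDK.
class FakeDevice {
public:
    // Non-atomic read-modify-write: unsafe if two threads call it at once.
    void transfer(std::size_t bytes) {
        std::size_t seen = total_bytes_;
        std::this_thread::yield();  // widen the window for interleaving
        total_bytes_ = seen + bytes;
    }
    std::size_t total_bytes() const { return total_bytes_; }

private:
    std::size_t total_bytes_ = 0;
};

// Wrapper that lets only one thread touch the device at a time.
class SerializedDevice {
public:
    void transfer(std::size_t bytes) {
        std::lock_guard<std::mutex> lock(mutex_);
        device_.transfer(bytes);
    }
    std::size_t total_bytes() {
        std::lock_guard<std::mutex> lock(mutex_);
        return device_.total_bytes();
    }

private:
    std::mutex mutex_;
    FakeDevice device_;
};

// Two worker threads share one device, mirroring Thread1 and Thread2 in
// the example; returns the total number of bytes the device recorded.
std::size_t run_two_threads(std::size_t ops_per_thread,
                            std::size_t bytes_per_op) {
    SerializedDevice dev;
    auto worker = [&dev, ops_per_thread, bytes_per_op] {
        for (std::size_t i = 0; i < ops_per_thread; ++i) {
            dev.transfer(bytes_per_op);
        }
    };
    std::thread t1(worker);
    std::thread t2(worker);
    t1.join();
    t2.join();
    return dev.total_bytes();
}
```

Because every call goes through the same lock, no update is lost even though both threads run concurrently. The cost is that device operations from the two threads no longer overlap.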

    • jackgreen
      New Contributor

      Hi, I am having a similar problem on the DE10-Nano board. After checking the corresponding source code, I found that the MMD library is version 14.1 and does not define AOCL_MMD_CONCURRENT_READS or AOCL_MMD_CONCURRENT_WRITES. Could that be the problem? If so, how should we update this library? Is there any off-the-shelf code I could use?

      Thanks!

  • TSchw11
    New Contributor

    Hi,

    Sorry, I can't get that information at the moment: the university is closed (because of corona) and I'm at home without the FPGA.

    Using a chroot environment, however, I was able to find out that AOCL_MMD_VERSION is only 14.1, so I think this is likely the problem.

  • AnilErinch_A_Intel
    Frequent Contributor

    Hi,

    Please let us know the results once you get a chance to update the MMD.

    Stay safe,

    Regards

    Anil