Forum Discussion

aejjeh
New Contributor
6 years ago

Arria 10 GX FPGA Dev Kit OpenCL and PCIe drivers/runtime stopped working

We have an Arria 10 GX FPGA Dev Kit that we've been using for more than a year with the Intel FPGA SDK for OpenCL version 18.0 on Ubuntu 16.04. Last week, the FPGA stopped working with the OpenCL runtime. Running aocl diagnose gives me the following error message:

Found no active device installed on the host machine.
 
Please make sure to: 
      1. Set the environment variable AOCL_BOARD_PACKAGE_ROOT to the correct board package.
      2. Install the driver from the selected board package.
      3. Properly install the device in the host machine.
      4. Configure the device with a supported OpenCL design.
      5. Reboot the machine if the PCI Express link failed.
 
DIAGNOSTIC_FAILED

I tried moving from 18.0 to 19.1, but the issue remains. Any thoughts on what might be the problem? Could a kernel update to the OS have caused this? (I am currently checking with our IT department to see whether they issued an OS update to the system.)
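
When aocl diagnose finds no device, it can help to separate "the board is not visible on PCIe" from "the driver is not loaded". The following is a minimal sketch of such a check, not an official procedure. It assumes the board enumerates under PCI vendor ID 1172 and that the driver module has "aclpci" in its name; the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch of host-side checks for when `aocl diagnose` reports no device.
# Assumptions (not from the thread): the board enumerates under PCI vendor
# ID 1172, and the loaded driver module has "aclpci" in its name.

# Return 0 if the lsmod-style text in $1 lists a module whose name contains "aclpci".
has_aclpci_module() {
    printf '%s\n' "$1" | awk '$1 ~ /aclpci/ { found = 1 } END { exit !found }'
}

# Print the running kernel, the matching PCIe devices, and the driver state.
check_board() {
    echo "Running kernel: $(uname -r)"

    devices=$(lspci -d 1172: 2>/dev/null)
    if [ -n "$devices" ]; then
        echo "PCIe devices with vendor ID 1172:"
        echo "$devices"
    else
        echo "No PCIe device with vendor ID 1172 is visible to the host"
    fi

    if has_aclpci_module "$(lsmod)"; then
        echo "An aclpci driver module is loaded"
    else
        echo "No aclpci driver module is loaded"
    fi
}
```

Calling check_board after a reboot gives a rough split: if no device is listed, the PCIe link or the FPGA image is the more likely suspect; if the device is listed but no module is loaded, the driver is. Running dmesg | grep -i aclpci may then show why the module did not load.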

13 Replies

  • MEIYAN_L_Intel
    Frequent Contributor

    Hi,

    Could you provide the following information:

    1. Did aocl install pass?

    2. Did jtagconfig pass?

    3. Did aocl program pass?

    4. Does the PC reboot when you program the FPGA?

    5. What does the LCD on the board display?

    Thanks

    • aejjeh
      New Contributor

      Hi Mylee

      Here are my answers to your questions:

      1) I think so, here is the output:

      root@hpvmfpga:[/home/aejjeh]: aocl install
      Do you want to setup the FCD at directory /opt/Intel/OpenCL/Boards [y/n] y
      aocl install: Adding the board package /opt/intelFPGA_pro/19.1/hld/board/a10_ref to the list of installed packages
      aocl install: Setting up the FPGA Client Driver (FCD) to the system.
      Install the FCD file to /opt/Intel/OpenCL/Boards 
      Installing the board package driver to the system.
      aocl install: Running install from /opt/intelFPGA_pro/19.1/hld/board/a10_ref/linux64/libexec
      Looking for kernel source files in /lib/modules/4.15.0-66-generic/build
      Using kernel source files from  /lib/modules/4.15.0-66-generic/build
      Building driver for BSP with name a10_ref
      make: Entering directory '/usr/src/linux-headers-4.15.0-66-generic'
        CC [M]  /tmp/opencl_driver_6kHrDJ/aclpci_queue.o
      /tmp/opencl_driver_6kHrDJ/aclpci_queue.c: In function ‘queue_push’:
      /tmp/opencl_driver_6kHrDJ/aclpci_queue.c:133:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
         void* dest = queue_addr(q, loc);
         ^
        CC [M]  /tmp/opencl_driver_6kHrDJ/aclpci.o
        CC [M]  /tmp/opencl_driver_6kHrDJ/aclpci_fileio.o
        CC [M]  /tmp/opencl_driver_6kHrDJ/aclpci_dma.o
        CC [M]  /tmp/opencl_driver_6kHrDJ/aclpci_pr.o
        CC [M]  /tmp/opencl_driver_6kHrDJ/aclpci_cmd.o
      /tmp/opencl_driver_6kHrDJ/aclpci_cmd.c: In function ‘aclpci_exec_cmd’:
      /tmp/opencl_driver_6kHrDJ/aclpci_cmd.c:176:5: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
           size_t bytes_copy = strnlen(ACL_BOARD_PKG_NAME, BUF_SIZE) + strnlen(ACL_DRIVER_VERSION, BUF_SIZE) + 2;  // 1 for '.', 1 for '\0'
           ^
        LD [M]  /tmp/opencl_driver_6kHrDJ/aclpci_a10_ref_drv.o
        Building modules, stage 2.
        MODPOST 1 modules
        CC      /tmp/opencl_driver_6kHrDJ/aclpci_a10_ref_drv.mod.o
        LD [M]  /tmp/opencl_driver_6kHrDJ/aclpci_a10_ref_drv.ko
      make: Leaving directory '/usr/src/linux-headers-4.15.0-66-generic'

      2) Yes, jtagconfig passes

      root@hpvmfpga:[/home/aejjeh]: jtagconfig
      1) USB-BlasterII [2-1.7]                      
        02E660DD   10AX115H1(.|E2|ES)/10AX115H2/..
        020A40DD   5M(1270ZF324|2210Z)/EPM2210

      3) I cannot run aocl program because aocl does not detect the device to start with

      4) Not sure what you mean here, I reboot the machine manually when I am trying to initialize the board. I have a script that I use to initialize the board based on the AN 807 Intel document: https://www.intel.com/content/www/us/en/programmable/documentation/tgy1490191698959.html#wmh1490212984610

      Basically, I set the jtag speed to 6M, then I run the following two commands:

      quartus_pgm -c 1 -m JTAG -o "p;max5_150.pof@2"
      quartus_pgm -c 1 -m JTAG -o "p;top.sof"

      After that I do a soft reboot. When the reboot is done, I used to run aocl install and then the board would work.
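
The bring-up sequence described above can be sketched as a small script. This is only an illustration of the steps in this post, assuming cable index 1 and that max5_150.pof and top.sof sit in the current directory. The run helper and the DRY_RUN switch are hypothetical additions for previewing the commands.

```shell
#!/bin/sh
# Sketch of the board bring-up sequence from the post above.
# Assumptions: cable index 1; max5_150.pof and top.sof are in the current directory.
# DRY_RUN=1 (hypothetical switch) prints each command instead of running it.

run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

init_board() {
    cable=${1:-1}
    # Set the JTAG clock to 6M before programming.
    run jtagconfig --setparam "$cable" JtagClock 6M || return 1
    # Program the MAX V system controller, then the FPGA image.
    run quartus_pgm -c "$cable" -m JTAG -o "p;max5_150.pof@2" || return 1
    run quartus_pgm -c "$cable" -m JTAG -o "p;top.sof" || return 1
    echo "Programming done: soft-reboot the host, then run 'aocl install' as root"
}
```

Running DRY_RUN=1 with init_board 1 prints the three commands without touching the hardware, which makes it easy to compare against the AN 807 steps before a real run.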

      5) The board is inside the PC chassis. I cannot see the LCD display while the board is connected to PCIe.

    • aejjeh
      New Contributor

      Hi Mylee

      The USB cable is working; jtagconfig detects the board. I can program the board with no problem. Also, as I mentioned, the board was working previously, and I had already configured all switches and jumpers according to AN 807. I confirmed that there was a kernel update before the board stopped working.

  • MEIYAN_L_Intel
    Frequent Contributor

    Hi,

    I understand the problem you described.

    Do you have another Arria 10 GX FPGA Dev board?

    If so, does the other board show the same problem?

    Thanks

  • MEIYAN_L_Intel
    Frequent Contributor

    Hi,

    As mentioned earlier, the Linux kernel was updated, and the new kernel might not be compatible with the FPGA driver. Did you reinstall the PCIe driver after the kernel update?

    Also, you may need to repeat these steps:

    1. Install the driver from the selected board package.
    2. Properly install the device in the host machine.

    Thanks

    • aejjeh
      New Contributor

      Hi Mylee

      To answer your first question, no we do not have another Dev Board to try out.

      As for the linux kernel update, yes I performed "aocl install" AFTER the linux kernel got updated. I have posted the output of "aocl install" in one of my previous messages. The device is definitely properly installed, I have not removed it.

  • MEIYAN_L_Intel
    Frequent Contributor

    Hi,

    From the initial assessment, this might be a driver compatibility issue with the new kernel/OS.

    To confirm that, can you try to compile an OpenCL example such as hello_world (provided) for the emulator? This is to rule out a problem with the tools.

    Also, which Ubuntu version was the kernel/OS updated from, and which Ubuntu version are you using now?

    Thanks

    • aejjeh
      New Contributor

      Hi Mylee

      Our system has been, and still is, running Ubuntu 16.04. The kernel version that is running is 4.15.0-66. Please note that I do not know which version of the kernel was running when the board was working; the machine is managed by our University’s IT department and gets updated by them.

      -Adel
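
When the previous kernel version is unknown, the package manager's log may still record it. Below is a rough sketch that pulls kernel image installs out of dpkg log text. It assumes the usual log line layout shown in the comment, and the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: list kernel image packages recorded as installed in dpkg log text.
# Assumption: log lines look like
#   <date> <time> install <package>:<arch> <old-version> <new-version>
# Reads the files given as arguments, or standard input if there are none.
kernel_install_history() {
    awk '$3 == "install" && $4 ~ /^linux-image-[0-9]/ { print $1, $4 }' "$@"
}
```

A possible use is kernel_install_history /var/log/dpkg.log, or zcat on the rotated .gz logs piped into the function. The entry just before the 4.15.0-66 install would be a candidate for the kernel the board last worked with. dpkg --list 'linux-image-*' shows which of those images are still installed.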
  • MEIYAN_L_Intel
    Frequent Contributor

    Hi,

    I would like to confirm the following information:

    1. Have you compiled the OpenCL example for the emulator successfully?
    2. What was the Ubuntu version before the update?
    3. Is the Ubuntu version after the update 16.04?

    Thanks

    • aejjeh
      New Contributor

      Hi Mylee

      The Ubuntu version did not change. It's 16.04. Only the kernel was updated. And yes, I can compile with the emulator without problems.

      -Adel
  • MEIYAN_L_Intel
    Frequent Contributor

    Hi,

    From the description, the problem is very likely caused by the kernel version.

    Can you revert the kernel to the previous version?

    And can you let us know the older kernel version number that you used?

    Thanks

  • GRN
    Occasional Contributor

    Hi aejjeh,

    Try downgrading the kernel to version 4.4.

    Install the kernel source and headers (via apt-get install).

    Then try aocl install again.
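
The last two steps could look roughly like the sketch below, run from a root shell after booting the older kernel. It is a sketch under assumptions, not a tested procedure: it assumes an Ubuntu host with apt-get, that aocl is on the PATH, and that headers for the running kernel are packaged as linux-headers-<release>. The function names are hypothetical. Installing and selecting the 4.4 kernel image itself is left out, since the exact package version depends on the system.

```shell
#!/bin/sh
# Sketch of the driver rebuild step after booting an older kernel.
# Assumptions: Ubuntu host with apt-get, run from a root shell with aocl on the PATH.

# Print the major.minor series of a kernel release string (default: the running kernel).
kernel_series() {
    printf '%s\n' "${1:-$(uname -r)}" | cut -d. -f1,2
}

# Rebuild the board driver, but only when the running kernel is in the wanted series.
reinstall_driver() {
    want=${1:-4.4}
    have=$(kernel_series)
    if [ "$have" != "$want" ]; then
        echo "Running kernel series is $have, expected $want; reboot into the older kernel first" >&2
        return 1
    fi
    # The driver build looks for kernel files under /lib/modules/<release>/build,
    # so the headers for the running kernel need to be present.
    apt-get install "linux-headers-$(uname -r)" || return 1
    aocl install
}
```

The series check is there because aocl install builds against whichever kernel is currently running, so rebuilding while still booted into 4.15 would not test the downgrade.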