Forum Discussion

VDemc2 (New Contributor)
6 years ago

Stuck during execution on Mustang-F100-A10, Intel® Vision Accelerator Design with Intel® Arria® 10 FPGA

Hello,

We have a Mustang-F100-A10 and a TANK-870 AIoT Dev. Kit.

We followed the official guide for the latest OpenVINO to set up OpenVINO 2019 R1 and our FPGA card.

We were able to run "aocl diagnose", and it detects the card, as you can see below.

ieisw@ieisw-SER0:~$ aocl diagnose
--------------------------------------------------------------------
Device Name:
acl0

BSP Install Location:
/opt/altera/aocl-pro-rte/aclrte-linux64/board/a10_1150_sg1

Vendor: Intel(R) Corporation

Phys Dev Name      Status   Information

acla10_1150_sg10   Passed   Intel Vision Accelerator Design with Intel Arria 10 FPGA (acla10_1150_sg10)
                            PCIe dev_id = 2494, bus:slot.func = 01:00.00, Gen3 x8
                            FPGA temperature = 61.6406 degrees C.

DIAGNOSTIC_PASSED
--------------------------------------------------------------------
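For anyone who wants to check this output from a script, here is a minimal sketch that pulls the pass marker and the reported temperature out of the text. It assumes the output looks like the listing above; the function name `parse_aocl_diagnose` is made up for illustration, and other BSP versions may print a different layout.

```python
import re

def parse_aocl_diagnose(text):
    """Extract the pass marker and FPGA temperature from `aocl diagnose` text."""
    # Assumption: a healthy run prints the literal marker DIAGNOSTIC_PASSED.
    passed = "DIAGNOSTIC_PASSED" in text
    # Assumption: the temperature line has the form shown in the post above.
    match = re.search(r"FPGA temperature\s*=\s*([0-9.]+)\s*degrees C", text)
    temperature_c = float(match.group(1)) if match else None
    return {"passed": passed, "temperature_c": temperature_c}

# Sample text copied from the diagnose output in this post.
sample = """\
acla10_1150_sg10   Passed   Intel Vision Accelerator Design with Intel Arria 10 FPGA (acla10_1150_sg10)
                            PCIe dev_id = 2494, bus:slot.func = 01:00.00, Gen3 x8
                            FPGA temperature = 61.6406 degrees C.
DIAGNOSTIC_PASSED
"""

result = parse_aocl_diagnose(sample)
print(result)
```

In practice the text would come from capturing the command's standard output rather than from a hard-coded string.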
 

The demo

./demo_squeezenet_download_convert_run.sh -d HETERO:FPGA,CPU

succeeded, as did demo_security_barrier_camera.sh with device HETERO:FPGA,CPU.

However, when we run the classification_sample demo as shown below, it sometimes hangs forever. This does not happen after a fixed number of inference calls but at random: sometimes it finishes with -ni set to 3000, and sometimes it hangs with -ni set to 10. This is the command we used:

/<path>/inference_engine_samples_build/intel64/Release/classification_sample -i /opt/intel/openvino/deployment_tools/demo/car.png -m /<path>/squeezenet1.1_FP16/squeezenet1.1.xml -d HETERO:FPGA,CPU -ni 10
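Because the hang is intermittent, one way to quantify it is to run the command repeatedly under a watchdog and count how many runs exceed a time limit. The following is only a sketch: the helper names and the timeout values are made up, and the stand-in commands use the Python interpreter so the example runs anywhere. To reproduce the problem, the classification_sample command above would be passed as the argument list instead.

```python
import subprocess
import sys

def run_with_timeout(cmd, timeout_s):
    """Run cmd; return ("ok", returncode), or ("hung", None) if it exceeds timeout_s."""
    try:
        completed = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return ("ok", completed.returncode)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return ("hung", None)

def count_hangs(cmd, attempts, timeout_s):
    """Run cmd several times and count how many runs hit the timeout."""
    results = [run_with_timeout(cmd, timeout_s) for _ in range(attempts)]
    return sum(1 for status, _ in results if status == "hung")

# Stand-in commands (assumptions for illustration, not the real sample binary).
quick = [sys.executable, "-c", "pass"]
slow = [sys.executable, "-c", "import time; time.sleep(5)"]

print(run_with_timeout(quick, timeout_s=10))
print(run_with_timeout(slow, timeout_s=0.5))
```

One caveat: if a process is blocked inside a kernel driver, the kill sent on timeout may not take effect, so the watchdog itself could stall in that case.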

The same thing happens with our custom code, which runs without issue on CPU.

While this execution is running we are not able to call "aocl diagnose" at all.

We are really out of ideas and we are starting to suspect that the FPGA card is broken. So any help or suggestion is highly appreciated.

My second question is whether it is possible to call "aocl diagnose" from two different terminals simultaneously.

PS: We skipped the part of the documentation about the USB Blaster (we don't have one).

37 Replies

  • JohnT_Altera (Regular Contributor)

    Hi,

    I understand, but unfortunately OpenVINO for FPGA currently does not support Docker, as it has not been validated in this setup.

    • VDemc2 (New Contributor)

      "OpenVINO for FPGA does not support docker": I am a bit confused.

      We run the FPGA application on the host (not in Docker), and there are 2 Docker containers running on the same host, but the FPGA application does not communicate with those containers.

      They run as separate processes.

      I assume some simultaneous access by the FPGA and the Docker engine causes the freezing issue.

  • JohnT_Altera (Regular Contributor)

    Hi,

    What do you mean by running 2 Docker containers on the same host? May I know what the Docker containers are running?

    The reason I ask is that, from your information, the issue is not observed when the Docker engine is not running. This looks like an issue with your PC rather than with the FPGA.

    • VDemc2 (New Contributor)

      The issue has been observed on two different computers with an FPGA, so I don't think it is a PC or card issue.

      As I stated at the beginning, we have an IEI TANK with a Mustang F100 inside, running Ubuntu.

      We have Docker installed on Ubuntu.

      We have OpenVINO R1 installed on Ubuntu.

      When we run a simple OpenVINO application on the HETERO:FPGA,CPU device, it works and processes a whole video file of around 600 frames.

      When we start docker-compose with completely separate logic (just Kafka and ZooKeeper as a message broker) and run the simple OpenVINO application again, the OpenVINO application freezes.

      This freezing issue has nothing to do with "running FPGA in Docker" since, as I mentioned, we run the OpenVINO application locally on the host (OpenVINO is installed directly on the host; aocl is installed and the card is programmed directly on the host).

      In short, the freezing issue has been observed when we had the following setup at once:

      1. A message broker system (not related to OpenVINO/FPGA in any way): 2 Docker containers running (the official zookeeper image + the official Confluent kafka image)
      2. The OpenVINO application running locally with device HETERO:FPGA,CPU

      When we stop both Docker containers (the Docker engine is still running as a service, but no containers are running) and run the OpenVINO application, it doesn't freeze.
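Since the freeze depends on whether containers are running, it can help to record the container state next to each test run. Below is a small sketch that lists running containers through the Docker CLI (`docker ps --format "{{.Names}}"`). The helper names are made up, and the container names in the usage line are placeholders rather than the actual names from this setup.

```python
import subprocess

def parse_container_names(docker_ps_output):
    """Turn the line-per-container output of `docker ps` into a list of names."""
    return [line.strip() for line in docker_ps_output.splitlines() if line.strip()]

def running_containers():
    """Return names of running containers, or None if the docker CLI cannot be used."""
    try:
        completed = subprocess.run(
            ["docker", "ps", "--format", "{{.Names}}"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # docker is not installed, not permitted, or the daemon is unreachable.
        return None
    return parse_container_names(completed.stdout)

# Placeholder output for illustration; real names depend on the compose file.
print(parse_container_names("zookeeper\nkafka\n"))
```

Logging the result of `running_containers()` before each OpenVINO run makes it easier to correlate freezes with the container state afterwards.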

  • JohnT_Altera (Regular Contributor)

    Hi,

    May I know whether you still observe the issue when everything runs on the local machine without Docker? I would like to see whether the message broker running on the local machine also causes the freeze.

    • VDemc2 (New Contributor)

      Yes, when everything runs on the local machine without any Docker container, we don't see the issue.

      We tried running the message broker (ZooKeeper and Kafka) locally on the host and didn't see the freezing issue.

  • JohnT_Altera (Regular Contributor)

    Hi,

    It looks like the Docker container is causing the issue. You will need to debug why the Docker container causes OpenVINO to hang.

    • VDemc2 (New Contributor)

      It didn't happen on R5, and the Docker version is the same as when we were using OpenVINO R5. So IMHO there is nothing to debug on our side (the Docker containers we use are not ours).

      However, could you please:

      1. List all log files where the FPGA plugin, the HETERO:FPGA,CPU plugin, and the FPGA itself store logs.
      2. Tell us whether there is any way to get information about FPGA utilization while something is running on the board.

  • JohnT_Altera (Regular Contributor)

    Hi,

    We were able to reproduce this issue and have reported it to engineering to be fixed.

    • VDemc2 (New Contributor)

      Thanks for letting us know.

    • VDemc2 (New Contributor)

      Please keep us informed of the progress.

  • JohnT_Altera (Regular Contributor)

    Hi,

    We have tested the upcoming release (2019R3), and the issue is no longer observed. OpenVINO 2019R3 is scheduled to be available next week.

    • VDemc2 (New Contributor)

      That's cool. Looking forward to testing it on 2019R3 next week!

  • JohnT_Altera (Regular Contributor)

    Hi,

    2019R3 has been released. Please test it and let me know if you still observe the issue.

    • VDemc2 (New Contributor)

      Hi JohnT,

      it looks like the freezing issue has been resolved, and it works for both approaches:

      1. The openvinotest application with Kafka + ZooKeeper in Docker. https://github.com/VladoDemcak/ovdebug/blob/master/openvinotest.py
      2. Human pose and Chrome. But I don’t see better performance on HETERO:FPGA,CPU compared to the CPU (i7). Is that OK? What are your observations, or what do you think?
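To put numbers on a comparison like this, the same timing loop can be wrapped around one inference call per device. The harness below is a generic sketch: `measure_latency` and `fake_inference` are made-up names, and the sleep only stands in for a real inference call, which would be substituted for each device under test.

```python
import time

def measure_latency(infer_once, iterations, warmup=1):
    """Time repeated calls to infer_once; return mean latency in ms and throughput."""
    # Warm-up calls are excluded so one-time setup cost does not skew the mean.
    for _ in range(warmup):
        infer_once()
    start = time.perf_counter()
    for _ in range(iterations):
        infer_once()
    elapsed = time.perf_counter() - start
    return {
        "mean_ms": 1000.0 * elapsed / iterations,
        "fps": iterations / elapsed,
    }

def fake_inference():
    # Placeholder workload (assumption): replace with one real inference call.
    time.sleep(0.01)

stats = measure_latency(fake_inference, iterations=5)
print(stats)
```

Running the same harness once per device, with the same input and iteration count, gives figures that can be compared directly.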

      BTW, I had another issue with the pose_estimation demo: it got stuck at “Parsing input parameters” while running with HETERO:FPGA,CPU. In a few cases the whole machine froze and I needed to do a hard reset.

      Maybe I was using an invalid bitstream. I will try to test it next week, also with our application.

      Which bitstreams are suitable for the human_pose_estimation model?

  • JohnT_Altera (Regular Contributor)

    Hi,

    I am glad that you were able to make it work.

    May I know which bitstream you are using to run Human Pose?

    We performed the benchmark using the 2019R3_PV_PL1_FP16_ELU or 2019R3_PV_PL1_FP11_YoloV3_ELU bitstream.

    • VDemc2 (New Contributor)

      We use 2019R3_PV_PL1_FP16_MobileNet_Clamp.aocx

  • JohnT_Altera (Regular Contributor)

    Hi,

    Could you check the performance using 2019R3_PV_PL1_FP16_ELU or 2019R3_PV_PL1_FP11_YoloV3_ELU bitstream?