
FabianL
Occasional Contributor
14 hours ago

Arria 10: Remote Update Factory Fallback won't work & Watchdog does not trigger

Hello,


I have to reopen another topic from last year:

Arria 10: Remote Update may brick FPGA and Factory Fallback won't work | Altera Community - 315011

Contrary to my comments in the original thread, enabling the watchdog does not trigger a factory fallback if the application image is misaligned in flash.

This brings me back to this scenario of the original post:

  1. Invalid application load image location, i.e. the start of the application load is shifted by 1-10 bytes (manually induced error scenario) --> The reprogramming sequence starts but never completes, and no fallback to the factory load is performed. => The FPGA is completely unresponsive until it is programmed via JTAG.

This is admittedly an exotic error scenario. However, we require a robust setup and have to make sure that the FPGA remains accessible under all circumstances, so we need the factory fallback mechanism to work reliably.

 

We have this boot procedure:

    1. Boot into factory image (0x20 as boot address in flash boot sector 0x00 to 0x1F). We have certain hardware that is sensitive to boot-up timing, so we need this to guarantee an identical and reliable boot-up procedure.
    2. Boot from factory load into application image
      1. Check for power-up boot: read the RU_RECONFIG_TRIGGER_CONDITIONS register and check for the power-up state (0)
        • do not reconfigure if any of bits 4, 2, 1, or 0 is set
      2. Set AnF bit: write "1" to RU_CONFIGURATION_MODE
      3. Set application image address RU_PAGE_SELECT
      4. Enable watchdog: set RU_WATCHDOG_TIMEOUT and RU_WATCHDOG_ENABLE
      5. Reconfigure: write "1" to RU_RECONFIG
    3. In Application mode we only read the RU_RECONFIG_TRIGGER_CONDITIONS as status info
      • We do not write the RU_WATCHDOG_ENABLE nor RU_RESET_TIMER registers
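
The factory-side sequence above could be sketched roughly as follows. This is an illustration only: `read_reg` and `write_reg` are hypothetical callables standing in for whatever register access the design has to the Remote Update IP, register names are used as keys instead of real offsets, and the address and timeout values are placeholders.

```python
# Bits 4, 2, 1 and 0 of RU_RECONFIG_TRIGGER_CONDITIONS: if any of them is set,
# the factory image stays active (step 2.1 of the procedure above).
BLOCKING_TRIGGER_BITS = (1 << 4) | (1 << 2) | (1 << 1) | (1 << 0)


def may_reconfigure(trigger_conditions):
    """Return True only if none of bits 4, 2, 1, 0 is set (plain power-up)."""
    return (trigger_conditions & BLOCKING_TRIGGER_BITS) == 0


def boot_application(read_reg, write_reg, app_address, watchdog_timeout):
    """Run steps 2.1 to 2.5. read_reg/write_reg are hypothetical accessors."""
    if not may_reconfigure(read_reg("RU_RECONFIG_TRIGGER_CONDITIONS")):
        return False  # stay in the factory image
    write_reg("RU_CONFIGURATION_MODE", 1)               # set AnF bit
    write_reg("RU_PAGE_SELECT", app_address)            # application image address
    write_reg("RU_WATCHDOG_TIMEOUT", watchdog_timeout)  # placeholder value
    write_reg("RU_WATCHDOG_ENABLE", 1)
    write_reg("RU_RECONFIG", 1)                         # trigger reconfiguration
    return True
```

Keeping the trigger-condition check in its own function makes it easy to test the mask separately from the write sequence.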

I have run tests with an application image stored at an offset of -2 bytes, i.e. the first 2 bytes of the application image are not stored in flash memory and the rest of the image is shifted accordingly. In this case, the FPGA gets stuck in an unresponsive state when trying to load the application image.

No fallback to the factory load happens: no CRC error is reported and the watchdog does not trigger.

My best guess is that this is related to the note in section 1.3.1, Remote System Configuration Mode, which states that the factory fallback mechanism does not work on Arria 10 FPGAs if the last 576 bytes of the bitstream are corrupted.

Note: The fallback to the factory image does not work under the following conditions: If the last 576 bytes of an unencrypted application image bitstream are corrupted. Intel recommends that you examine the last 576 bytes of the unencrypted application image before triggering the application image configuration.

However, I have noticed that the binary images of the FPGA bitstream vary in size, so these 576 bytes do not sit at a fixed memory location that I could check. Is there any way to identify this section?

My Questions:

  1. Why is the factory configuration fallback mechanism not working in the scenario described above? The factory load image is valid!
  2. How can I examine/validate an FPGA bitstream in flash memory before executing it?
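
One generic, application-level way to approach question 2 is to record the image size and a CRC32 when the image is built and to check both against the flash contents before triggering reconfiguration. The sketch below only illustrates that idea. It is not a documented Arria 10 mechanism and nothing in this thread confirms it is sufficient. The function name and the place where the expected size and CRC are stored are assumptions.

```python
import zlib


def image_is_intact(flash_bytes, expected_size, expected_crc32):
    """Check bytes read back from flash against build-time metadata.

    flash_bytes    : data read from the application image region (assumed input)
    expected_size  : image size in bytes, recorded at build time (assumption)
    expected_crc32 : CRC32 of the image, recorded at build time (assumption)
    """
    if len(flash_bytes) != expected_size:
        return False
    return (zlib.crc32(flash_bytes) & 0xFFFFFFFF) == expected_crc32
```

With a check like this, a factory image would skip the write to RU_RECONFIG whenever the function returns False, so a shifted or truncated image would not be loaded in the first place.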

 

best regards

Fabian

 

3 Replies

  • Farabi
    Regular Contributor

    Hi Fabian, 

     

    1- Please do not use the 2-byte offset to trigger the CRC error. Instead, delete a chunk of the bitstream data and re-run the test to trigger the CRC error.

    2- Can you compare the last 576 bytes of the RPD file with the last 576 bytes of the image in your flash? The contents MUST match. If they do not, this area may be corrupted, which is a possible root cause of your fallback failure.
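
The comparison in point 2 might look like the sketch below. It assumes the RPD contents and a read-back of the same address range from flash are both available as byte buffers of equal length. `tail_mismatches` is a hypothetical helper name, and any byte or bit ordering differences between the RPD file and the flash read-back are not handled here.

```python
TAIL_LEN = 576  # size quoted in the documentation note


def tail_mismatches(rpd_image, flash_image, tail_len=TAIL_LEN):
    """Return offsets (relative to the start of the tail) where the tails differ.

    rpd_image   : reference image bytes, e.g. the contents of the RPD file
    flash_image : bytes read back from flash over the same address range
    """
    if len(rpd_image) < tail_len or len(flash_image) != len(rpd_image):
        raise ValueError("buffers must have equal length of at least tail_len")
    reference_tail = rpd_image[-tail_len:]
    flash_tail = flash_image[-tail_len:]
    return [i for i, (a, b) in enumerate(zip(reference_tail, flash_tail)) if a != b]
```

An empty list means the tails match. A non-empty list gives the positions inside the last 576 bytes that differ.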

     

    regards,
    Farabi

  • Farabi
    Regular Contributor

    Hello Fabian, 

     

    I checked with the internal team: the size of the bitstream varies, and it does not have a fixed size.

    Note: The configuration bitstream is always the last block interpreted by the FPGA, regardless of the total image size.

    So it is important to understand that the last 576 bytes are relative to the end of the image, not an absolute flash address.

    This block is processed before the FPGA can even attempt a configuration.
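
Since the block is located relative to the end of the image, its flash address range follows from the image start address and the image size. A small sketch, where `image_start` and `image_size` are assumed to be known from the build (for example, from the size of the RPD data) and the function name is hypothetical:

```python
def tail_flash_range(image_start, image_size, tail_len=576):
    """Return (first_address, end_address) of the last tail_len bytes.

    end_address is exclusive, i.e. one past the last byte of the image.
    """
    if image_size < tail_len:
        raise ValueError("image is smaller than the tail length")
    end_address = image_start + image_size
    return end_address - tail_len, end_address
```

For an image of 1000 bytes stored at address 0x1000, this gives the range 0x11A8 to 0x13E8 (exclusive).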

    It consists of:

    1- Configuration end markers: signal the end of the bitstream

    2- CRC/checksum data: used to verify data integrity

    3- Device configuration info: used to confirm compatibility

    4- RSU-related metadata: required before a fallback can occur

     

    If this block is corrupted:

    1- The FPGA does not know that loading this image has failed

    2- The FPGA only knows that this image is invalid

    3- Impact: no fallback path is taken

     

    I am checking how to validate the bitstream before proceeding with RSU. I will get back to you once I have a confirmed answer.

     

    regards,
    Farabi

  • Farabi
    Regular Contributor

     

    Status: consulting engineering on the factory fallback mechanism failure and on how to confirm whether the memory location holding these 576 bytes is corrupted.

     

    regards,
    Farabi