Arria 10 HPS Bridge Lockup after Reset
I'm having an incredibly frustrating issue with an Arria 10 HPS which is bordering on the bizarre.
Short story:
- The processor boots and runs fine on first run, all peripherals in the FPGA can be accessed fine.
- After issuing a cold reset, any attempt to access peripherals in the FPGA locks up the processor.
- After the watchdog resets the processor, access to peripherals is fully functional again.
This is a highly repeatable issue - after any attempt to reset the processor from the FPGA, I can no longer access peripherals in the FPGA from the processor. After the watchdog resets it, all is fine again.
---
The HPS device is instantiated in Platform Designer as follows (there are other components, but that makes no difference at this point).
Currently, one level up in the hierarchy, the signals are assigned as follows:
- `hps_warm_reset` is tied to 0 (not used)
- `hps_cold_reset` is the signal I am asserting to trigger a reset of the processor
- `hps_h2f_cold_reset` is fed out of the block and, along with other sources, is ultimately used to generate the `lwint_reset` and `proc_reset` signals, each of which is synchronised to its corresponding clock domain (async assert, sync deassert).
- `hps_axi_slave` is unused (terminated)
- `hps_axi_master` is connected to various sources and runs on one clock domain (200 MHz)
- `h2f_lw_axi_master` is connected to some simple peripherals - e.g. a system ID as shown in the screenshot above. This is on a second clock domain (100 MHz).
- `hps_io` is connected to the `lwint_reset` and `proc_reset` signals so I can monitor whether the domains are active or held in reset.
There is actually a lot of other stuff in the design not shown (e.g. PCIe core) but that shouldn't have any impact.
---
On power up, the FPGA portion of the Arria 10 SX660 device is configured from an ASx4 source. The HPS is also reset using the external physical pins, and configured to boot from an eMMC device.
The physical HPS reset pins are released after the FPGA is configured. However, the HPS remains held in reset by the `f2h_cold_reset_req` signal in the FPGA until I am ready for the processor to boot.
Once released, a customised U-Boot SPL preloader is launched. The only modifications to the standard preloader, beyond device tree settings, are removing the SDRAM configuration (there is no DDR attached and the HPS EMIF is disabled) and disabling the data/instruction caches and the MMU (*).
The preloader boots through fine. It then loads a bare-metal image from the eMMC into on-chip RAM and launches it. There is no OS - I'm basically trying to use it as a replacement for a Nios processor from an older design.
The bare-metal image checks and waits until the FPGA AXI masters are released from reset, and then enables the HPS bridge. The same behaviour happens if I let the preloader enable the HPS bridge before the bare-metal application is run, so there shouldn't be anything specifically wrong with the bridge configuration.
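For reference, the wait-then-enable step described above can be sketched roughly as below. This is only an illustrative model, not my actual code: the register pointers are passed in so it can be run on a host, the bit masks `BRG_H2F_MASK` and `BRG_LWH2F_MASK` are hypothetical placeholders (the real positions come from the Arria 10 HPS register map), and the status word is assumed to be something like the `hps_io` inputs carrying `lwint_reset` and `proc_reset`, with a set bit meaning that domain is still in reset. Any NoC idle handshaking the real sequence needs is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define BRG_H2F_MASK   (1u << 0)
#define BRG_LWH2F_MASK (1u << 1)

/* Poll until every bit in in_reset_mask reads 0 (domain out of reset).
 * Returns false if that has not happened after max_polls reads. */
static bool wait_fabric_out_of_reset(const volatile uint32_t *reset_status,
                                     uint32_t in_reset_mask,
                                     uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        if ((*reset_status & in_reset_mask) == 0u) {
            return true;
        }
    }
    return false;
}

/* Release the selected bridges from reset, but only once the FPGA-side
 * clock domains are confirmed to be out of reset. */
static bool enable_bridges(volatile uint32_t *brgmodrst,
                           const volatile uint32_t *reset_status,
                           uint32_t in_reset_mask,
                           uint32_t bridge_mask,
                           uint32_t max_polls)
{
    if (!wait_fabric_out_of_reset(reset_status, in_reset_mask, max_polls)) {
        return false; /* fabric never became ready; leave bridges in reset */
    }
    *brgmodrst &= ~bridge_mask;              /* clear reset bits = enable */
    return (*brgmodrst & bridge_mask) == 0u; /* read back to confirm */
}
```

The timeout is there so that a fabric domain stuck in reset shows up as a failed return value rather than a silent hang.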
Once the bridges are enabled, I try to access a peripheral in the FPGA, such as the system ID.
During the first run after power-on, everything works fine. The bridges are enabled and I can read the system ID OK.
---
If I then issue a cold reset request from the FPGA (not the physical reset pins), the processor will reboot.
I see the preloader run through fine, and my bare-metal application is successfully launched.
The HPS bridges are enabled fine (both the `ALT_RSTMGR_BRGMODRST_ADDR` register and the `ALT_SYSMGR_NOC_IDLESTAT_ADDR` register show them as not in reset and not idle).
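To make the check concrete, this is roughly the decision I am making from the two register values. It is a sketch under assumptions: the masks are hypothetical parameters rather than the real bit definitions, and the polarity (a set bit means "in reset" or "idle") is assumed here for illustration.

```c
#include <stdint.h>

typedef enum {
    BRIDGE_IN_RESET, /* reset bit still set in the reset manager value */
    BRIDGE_IDLE,     /* out of reset, but the NoC reports it as idle */
    BRIDGE_ACTIVE    /* out of reset and not idle */
} bridge_state_t;

/* Classify one bridge from raw register snapshots.
 * rst_mask and idle_mask select that bridge's bits (hypothetical values). */
static bridge_state_t classify_bridge(uint32_t brgmodrst, uint32_t idlestat,
                                      uint32_t rst_mask, uint32_t idle_mask)
{
    if ((brgmodrst & rst_mask) != 0u) {
        return BRIDGE_IN_RESET;
    }
    if ((idlestat & idle_mask) != 0u) {
        return BRIDGE_IDLE;
    }
    return BRIDGE_ACTIVE;
}
```

In my case both bridges come out as `BRIDGE_ACTIVE` by this test after the cold reset, so passing it is clearly not sufficient to guarantee that an access will complete.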
However, as soon as the processor tries to read any register address on the FPGA side of the bridge (on either the lightweight or the regular interface), the processor locks up and the debugger can no longer interact with it. (**)
Eventually, after about 10 seconds, the watchdog resets the processor; it reboots, runs through the preloader, and launches my bare-metal application. Everything works fine again - I can properly access peripherals in the FPGA.
Any thoughts on what is wrong here? Or even any ideas where I could start looking?
There seems to be very little information about using this for anything other than booting Linux, so if there are any resources for bare-metal applications out there, it would be handy to point me to them also.
---
(*) If the caches aren't disabled, everything is incredibly unstable and there are lots of spurious data aborts, so I've made sure they are fully disabled by the preloader.
(**) Curiously, the debugger is able to access peripherals in the FPGA after a reset, but as soon as the processor tries to access anything, it locks up.