Forum Discussion

Altera_Forum
13 years ago

FPGA instability after reset

Hi all,

I am working with custom Cyclone IV E based hardware that suffers from a strange startup instability. The hardware runs fine under various stress tests and across temperature ranges when the FPGA is programmed through JTAG, but if the FPGA configures from EPCS flash, the main CPU (Nios II) is likely to hang inside the bootloader (which runs from internal RAM). If the bootloader manages to complete, the system runs fine. The hang traces to SDRAM access, specifically copying the firmware from EPCS to SDRAM. Interestingly, if an SDRAM test program is downloaded through JTAG after the hang (with or without FPGA reconfiguration), it runs without errors. Similarly, the bootloader runs fine when programmed through JTAG.

The hang is more likely to occur when using the reset switch than at power-on (the reset switch is wired to nCONFIG). The internal reset (CPU, Ethernet PHY, PLLs, ...) is similar to the 'global reset generator' used in many Altera examples, but instead of one reset signal I use two: they assert simultaneously and deassert in sequence. The first reset to deassert goes to the PLLs and the second to the CPU, ... What I've found is that the instability is somehow related to the PLL reset. If I leave areset tied to 0, the system is less likely to hang, and if I increase the second reset time to ~100 ms (the first reset, to the PLLs, deasserts after 1 ms), the system doesn't hang. I am not comfortable with this workaround, since according to the datasheet the maximum PLL resync time is 1 ms. Besides, the PLL locked signals are among my reset sources, and I didn't see any self-reset during boot.
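
The two-stage release described above can be modelled cycle by cycle. The Python sketch below is only a behavioural illustration, not the HDL used on this board: the names `cycles_for` and `TwoStageReset`, the 50 MHz clock in the usage note, and the single combined `reset_request` input are assumptions made for the example. The 1 ms and ~100 ms delays are the values quoted in the post.

```python
from typing import Tuple


def cycles_for(delay_us: int, clk_hz: int) -> int:
    """Whole clock cycles needed to cover at least delay_us microseconds."""
    return -(-delay_us * clk_hz // 1_000_000)  # integer ceiling division


class TwoStageReset:
    """Both resets assert together; the PLL reset releases first, the system reset later."""

    def __init__(self, pll_release_cycles: int, sys_release_cycles: int):
        if sys_release_cycles < pll_release_cycles:
            raise ValueError("system reset must not release before the PLL reset")
        self.pll_release_cycles = pll_release_cycles
        self.sys_release_cycles = sys_release_cycles
        self.count = 0  # clock cycles elapsed since the last reset request

    def tick(self, reset_request: bool) -> Tuple[bool, bool]:
        """Advance one clock; return (pll_reset, sys_reset), True meaning asserted."""
        if reset_request:
            self.count = 0  # any reset source restarts the whole sequence
        elif self.count < self.sys_release_cycles:
            self.count += 1  # stop counting once both resets are released
        pll_reset = self.count < self.pll_release_cycles
        sys_reset = self.count < self.sys_release_cycles
        return pll_reset, sys_reset
```

With an assumed 50 MHz reset clock, `cycles_for(1000, 50_000_000)` gives 50000 cycles for the 1 ms PLL stage and `cycles_for(100_000, 50_000_000)` gives 5000000 cycles for the 100 ms stage, so the second counter needs at least 23 bits.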

I could use some pointers. I didn't design the custom board I am using and it's not really my field, so unfortunately I cannot give more details about the HW.

Thanks.

11 Replies

  • Altera_Forum

    --- Quote Start ---

    I've traced the system hangs to DDR calibration failure.

    --- Quote End ---

    Congrats! It's a pain to track subtle bugs down ...

    --- Quote Start ---

    The local_init_done signal from DDR stays 0 and ctl_cal_fail, ctl_init_fail are 1 when the boot hangs.

    The DDR calibration fails only when deasserting reset too soon (< ~100 ms) after FPGA configuration.

    --- Quote End ---

    So why not take the easy way out then: add a reset component that is enabled at power-on after the external reset is deasserted and that holds an internal reset asserted for at least 100 ms (or resets the DDR for that long, anyway).

    --- Quote Start ---

    The really strange behaviour I've noticed is that when the boot hangs, it will continue to hang if the FPGA is reset by toggling nCONFIG. If the power is turned off and on quickly, it will still hang; turning the power off and waiting a minute gives the system a chance to boot normally.

    --- Quote End ---

    An FPGA starts out 'from scratch' if nCONFIG is pulsed (although it won't go through its power-on-reset sequence). It's unlikely to be the FPGA that is the problem (unless you are violating an I/O voltage or power supply rise time requirement).

    I'd guess that you have something external that changes, e.g., your DDR or some other device gets into a weird state that is stalling your boot.

    Keep in mind that when you configure an FPGA there is a large inrush current (on some of the supplies). If your power supplies are marginal (with respect to the current sourcing ability), then they might work sometimes, and not others.

    --- Quote Start ---

    How is local_init_done meant to be used? In the example designs I've seen, it's left unconnected.

    --- Quote End ---

    Example designs are not always the best references. Read the DDR controller User Guide, trace the hardware with SignalTap II, try a few things, break things, fix them, ... that is the better way to understand how the IP cores need to be used. If you find something that does not work according to the IP User Guide, then file a Service Request with Altera. You'll get a much better response if you can show you've looked at the problem in detail and identified something that is not documented or is incorrectly documented.

    Cheers,

    Dave
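
One way to combine the ideas in this reply, holding the DDR in reset for a fixed minimum time and then checking the calibration status before touching SDRAM, is a small bring-up supervisor. The Python sketch below is a behavioural model under stated assumptions, not usage taken from the DDR controller User Guide: `DdrStatus`, `bring_up_ddr`, the `pulse_reset` and `read_status` callbacks, and the poll and retry limits are all hypothetical. Only the signal names `local_init_done`, `ctl_cal_fail` and `ctl_init_fail` and the 100 ms figure come from the thread.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DdrStatus:
    """Snapshot of the controller status signals named in the thread."""
    local_init_done: bool
    ctl_cal_fail: bool
    ctl_init_fail: bool


def bring_up_ddr(
    pulse_reset: Callable[[int], None],    # hypothetical: assert DDR reset for N ms
    read_status: Callable[[], DdrStatus],  # hypothetical: sample the status signals
    hold_ms: int = 100,                    # minimum hold time suggested in the reply
    max_polls: int = 1000,                 # assumed per-attempt polling budget
    max_attempts: int = 3,                 # assumed retry limit
) -> bool:
    """Return True once local_init_done is seen, False if every attempt fails."""
    for _attempt in range(max_attempts):
        pulse_reset(hold_ms)  # keep the DDR reset asserted for hold_ms
        for _poll in range(max_polls):
            status = read_status()
            if status.local_init_done:
                return True  # calibration finished; SDRAM can be used
            if status.ctl_cal_fail or status.ctl_init_fail:
                break  # calibration gave up; try another reset pulse
    return False
```

In a bootloader, the firmware copy from EPCS to SDRAM would start only after such a check returns success; in hardware, the same condition could instead gate the release of the CPU reset. Whether a second reset pulse actually recovers a failed calibration on this board is not established in the thread and would need to be confirmed with SignalTap II.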