
mfortuna
New Contributor
1 month ago

DDR4 Problem Migrating from Arria 10 016 to 048

We have a board design that supports an Arria 10 in the F29 package. It has two DDR4 interfaces running at a data rate of 1866 MT/s.

In the past an 016 chip with DDR4-2400 ICs worked fine.

We needed more logic, so we migrated to an 048. Because of parts availability we also had to migrate to DDR4-3200 ICs.

The board layout has not changed and we believe the various power supplies and clocks are working fine.

The 048 design fails calibration and won't even allow running "EMIF toolkit - Create Memory Interface"; it hangs and times out. If I turn off address/command leveling, the design still fails calibration, but now I can create a memory interface and see the calibration results, which show a complete failure of both write and read margins. The design is badly broken, but we don't know whether the cause is the 048, the DDR4, or a combination of both. The DDR4 IP was regenerated and adjusted for the DDR4-3200 ICs.

We do see PLL lock with Signal Tap and can actually run the Efficiency Monitor, so the core is supplying a good clock to the fabric.

Any suggestions are welcome!

Thanks,

Mike

5 Replies

  • The only cause I can think of is a power delivery issue across the migration devices. In particular, static power increases going from the 016 to the 048 density. Have you tried the single-interface DDR4 example design to see the calibration result?

  • mfortuna
    New Contributor

    I built the example design and it fails the same way. If I enable address/command leveling, the EMIF toolkit hangs. If I disable address/command leveling, the toolkit reports that write-per-bit deskew failed.

    The 016 based board works fine.

    Another data point is that the 048 works fine except for DDR4. The design has PCIe Gen3 x8 and four 12.5G transceivers. The boards were designed with the upgrade from 016 to 048 in mind.

    I did see that another user had intermittent errors with an Arria 10 and DDR4-3200, but they were able to calibrate.

    Address/command leveling is pretty basic. Write-per-bit deskew may be a red herring.

     

  • mfortuna
    New Contributor

    I decided to try very basic testing of the 016 and 048 example designs.

    The 016 design drives mem_cs_n to 1.2 V and pulses it low at a regular rate, presumably for refresh. If I rerun calibration using the EMIF toolkit, I see lots of activity on mem_cs_n.

    The 048 design has mem_cs_n at 0.6 V (due to termination). There is no activity. It appears the pin is not driven by the FPGA.

    The 048 and 016 have correct pinouts and bank voltages. I used a non-DDR4 048 design to drive the mem_cs_n pin with a square wave and it works fine.

    It appears the build is broken. I tried both Quartus 22.4 and 23.2 and both produce the same results. 

    Not being able to build and run the example design appears to be a tool issue. 

     

    • yoichiK_altera
      Contributor

      Did you monitor the mem_cs_n pin on the DDR4 048 design right after rerunning calibration from the toolkit? Once calibration has failed, no refresh commands should be issued, so there will be no activity on the mem_cs_n pins.

    • mfortuna
      New Contributor

      Despite our best efforts at doing a BOM scrub and checking the board, the HW designer and I did not notice that two important resistors were missing, namely the RZQin resistors for each bank of DDR. With those in place the design is up and running.

      I know this is difficult for a SW tool to detect, but hanging in address/command leveling or reporting a write deskew failure is definitely not helpful for debugging an issue like this.

      The smoking gun, for anyone who has this issue, is that the mem_cs_n pin is not driven to 1.2 V and instead parks at the termination voltage.
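
For anyone who wants to turn the mem_cs_n check above into a quick script, here is a minimal sketch in Python. It assumes you have exported scope samples of mem_cs_n as a list of voltages; the function name `classify_cs_n`, the 0.1 V tolerance, and the 90% / 50% thresholds are illustrative assumptions, not anything taken from Intel tooling. It only encodes the observation from this thread: a pin sitting at 1.2 V with low pulses is being driven, while a pin parked at about 0.6 V is sitting at the termination voltage.

```python
VDD = 1.2      # DDR4 I/O level seen on the working 016 board (volts)
VTT = VDD / 2  # termination voltage, 0.6 V, seen on the failing 048 board


def classify_cs_n(samples, tol=0.1):
    """Classify scope samples (in volts) of mem_cs_n.

    Hypothetical helper: thresholds are assumptions chosen for illustration.
    """
    if not samples:
        raise ValueError("no samples")
    n = len(samples)
    high = sum(1 for v in samples if abs(v - VDD) <= tol)  # near 1.2 V
    mid = sum(1 for v in samples if abs(v - VTT) <= tol)   # near 0.6 V
    low = sum(1 for v in samples if v <= tol)              # near 0 V

    if mid / n > 0.9:
        # Parked at the termination voltage: the FPGA is not driving the pin.
        return "undriven"
    if high / n > 0.5 and low > 0:
        # Mostly high with low pulses: commands (e.g. refresh) are being issued.
        return "driven_active"
    if high / n > 0.9:
        # Driven high but idle, e.g. sampled after calibration already failed.
        return "driven_idle"
    return "inconclusive"


if __name__ == "__main__":
    print(classify_cs_n([1.2] * 95 + [0.0] * 5))
    print(classify_cs_n([0.6] * 100))
```

An "undriven" result is the case described in this thread, where the fix was fitting the missing RZQin resistors. A "driven_idle" result is not conclusive on its own, since no refresh is expected once calibration has failed, so capture the samples right after rerunning calibration.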