Forum Discussion

VBalamurugan29
New Contributor
6 months ago
Solved

Agilex 7 FPGA Short-Circuited Without Physical Damage – Need Guidance

Hello everyone,

I’m working on a custom board with an Intel Agilex 7 series FPGA. Recently, the FPGA appears to have short-circuited internally, despite no physical damage to the chip or board.

Key Observations:

  • The board was functioning normally before the incident.

  • No ESD event, mechanical impact, or thermal overrun was observed.

  • All voltage regulators are outputting correct voltages (verified via multimeter/oscilloscope).

  • After isolation, we found that the short is only on the FPGA side, not on the power rails or regulators.

My Questions:

  1. What could cause an internal short circuit in Agilex 7 without visible damage?

  2. Are there known failure modes or silicon issues that could explain this behavior?

  3. What are the recommended steps for RMA or root cause analysis from Intel?

If anyone has faced a similar issue or can point me toward relevant documentation or contacts at Intel for failure analysis, I’d really appreciate it.

Thanks in advance!

9 Replies

    lixy
    Contributor

    Hi,

    For your questions 2 and 3, any suggestion we give needs to be based on the exact failure symptoms. Could you please confirm the following information?

    1- Did you measure specific test points on the board to confirm the "short"? Which FPGA pins are connected to the shorted point?

    2- What is the exact failure you observed? Did you observe a specific functional failure? Are there abnormal output signals? Or is nothing working at all?

    3- Did you check the configuration status of the FPGA? If you try to detect and program the FPGA with Quartus Programmer, does the Programmer report any error message?

    For your question 1:

    Based on the functional failure cases reported by other customers, it is actually quite rare for us to see physical damage. Even units with many pins shorted to ground usually show no visible damage.

    The examples below may cause visible damage, but it still depends on the severity:

    -- Mechanical/thermal induced: The unit was physically damaged, or it went through a high-temperature reflow/rework process without first being baked to remove moisture. (I think in such cases, customers generally do not contact us.)

    -- Electrically induced: The unit was subjected to high current for an excessively long time.

    Best Regards,

    Xiaoyan


      VBalamurugan29
      New Contributor

      Hi Xiaoyan,

      For your questions 1 and 2:

      After power sequencing completed and all Agilex power rails were confirmed to be stable, we started programming the FPGA.

      I then started testing with DDR4.

      Initial testing up to 3 Gigabits showed no errors. When I extended the address range to cover the full DDR4 capacity, a sudden spike in load current was detected on the RPS. The board was immediately powered down to prevent damage.

      A detailed inspection of all power rails was conducted. The following Agilex 7 FPGA power rails were found shorted:

      Power Rail    Voltage Level
      VCCRCORE      1.2 V
      VCCL_SDM      0.8 V
      VCCPLL_HPS    1.8 V
      VCCPLL_SDM    1.8 V
      VCC_ADC       1.8 V
      VCCPT         1.8 V

      The FPGA side and the voltage regulator side were isolated and tested independently. Short circuits were found only on the FPGA side. The regulators were confirmed to be functioning correctly.

    lixy
    Contributor

    Hi,

    1- Regarding the behavior that "there's a high current after DDR4 switched to full address range": we checked with our DDR experts, and we have not seen such an issue before. When the address range is expanded, only 1 or 2 address bits change from 0 to 1, which normally won't lead to a significant change in the total current.

    Did you also check whether these DDR address pins are shorted?

    When you were testing the limited address range, was the DDR functioning normally?
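To illustrate the point about address bits, here is a minimal sketch of the arithmetic. The 64-bit word size and the 8 Gbit full capacity are assumptions chosen only for illustration; the thread states only the 3 Gigabit limited range. `address_bits` is a hypothetical helper, not part of any Intel tool.

```python
def address_bits(num_locations: int) -> int:
    """Number of address bits needed to index num_locations distinct locations."""
    if num_locations < 1:
        raise ValueError("num_locations must be at least 1")
    return (num_locations - 1).bit_length()


WORD_BITS = 64                    # assumed word width, not from the thread
limited_range_bits = 3 * 2**30    # the 3 Gbit range that tested clean
full_range_bits = 8 * 2**30       # hypothetical full capacity

limited_words = limited_range_bits // WORD_BITS
full_words = full_range_bits // WORD_BITS

# How many additional address bits start toggling in the larger range
extra_bits = address_bits(full_words) - address_bits(limited_words)
print(address_bits(limited_words), address_bits(full_words), extra_bits)
```

Under these assumed numbers only one additional address bit becomes active, which is why the address-range change by itself would not be expected to shift the total current much.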

    2- Regarding RMA or Failure analysis, please check this page: FPGA Functional/Failure Analysis, Quality, and Reliability Support |... .

    For Agilex 7 devices, we can provide Failure Analysis support. However, in your specific case, as the device has already been massively damaged internally, the FA can only perform short/open/leakage tests. Functional tests are definitely not applicable. Further tests such as a physical check will not be supported, as there would be massive burn damage inside the unit. Even if we did a physical check on the unit, it would be almost impossible to determine the cause of the short from the device itself.

    Therefore, please consider what your expectation is. If an open/short test to confirm the failing pins would meet your needs, then we can apply for failure analysis.

    3- In cases of massive short failures, it is generally unlikely that the root cause is an internal silicon defect within the FPGA.

    While theoretically possible, silicon defects typically have limited impact under normal conditions and aren't expected to cause extensive external pin-level shorts based on our known cases.

    External factors such as excessive electrostatic discharge (ESD), electrical overstress (EOS), or improper handling and testing are more likely to result in large-scale pin-level short circuits. As multiple voltage supply pins are shorted, I would recommend paying attention to the power supplies on your other good boards: monitor the power-up sequence and check whether any supply overshoots during FPGA operation.

    If this is a prototype board, you may also need to review your PCB design and check whether the manufacturing and testing environment provided adequate protection for the board.
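As a rough sketch of the overshoot check suggested above, the following scans voltage samples exported from an oscilloscope capture and flags any that exceed the nominal rail voltage by more than a tolerance. The 3% tolerance and the sample values are hypothetical placeholders; the real limits must come from the device datasheet, and `find_overshoot` is an illustrative helper name.

```python
def find_overshoot(samples, nominal, tolerance=0.03):
    """Return (index, voltage) pairs where a sample exceeds nominal * (1 + tolerance)."""
    limit = nominal * (1 + tolerance)
    return [(i, v) for i, v in enumerate(samples) if v > limit]


# Hypothetical capture of a 1.8 V rail (for example VCCPT); not measured data
vccpt_samples = [1.79, 1.80, 1.81, 1.92, 1.80]
events = find_overshoot(vccpt_samples, nominal=1.8)
for index, voltage in events:
    print(f"sample {index}: {voltage:.2f} V exceeds the assumed limit")
```

Running the same check on each rail during power-up and during the DDR4 test on a known-good board would show whether any supply leaves its allowed window.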


    Best Regards,

    Xiaoyan


  • Hi,

    Thanks for your reply.

    1. Yes, the DDR functioned normally when the DDR address was limited.

    2. No, I haven't checked the DDR address pins after the short circuit happened.

      (Note: This test was done before the short circuit occurred in the FPGA.)

    3. I understand that Failure Analysis (open/short/leakage tests) can be done.
      Yes, I would like to proceed with those tests to identify any failing pins.
      Please let me know the next steps.

    Farabi
    Regular Contributor

    Hello,


    Please hold on; we are processing your FA request.


    regards,

    Farabi


  • Hi,


    Kindly fill in the DPR form and answer the questions that I have sent through forum email.


    Regards,

    Aiman


  • Hi,


    Which distributor or local sales office did you purchase the device from?


    Regards,

    Aiman


    • As we have not received any response from you to the previous question/reply/answer that we provided, please log in to ‘https://supporttickets.intel.com’, view the details of the desired request, and post a response within the next 15 days to allow me to continue to support you. After 15 days, this thread will be transitioned to community support. The community users will be able to help you with your follow-up questions.