BER Degradation Observed When Enabling Multiple DFE-Adapted Channels on Arria 10 GX
Hello Altera Support Team / Forum Members,
We are conducting a comprehensive transceiver channel test on our Arria 10 GX FPGAs (part numbers: 10AX066K4F35M3SG and 10AX066K1F35I1SG). Our test setup and configuration parameters are as follows:
Transceiver Configuration Rule: Basic (Enhanced PCS)
PMA Configuration Rule: Basic
Transceiver Mode: TX/RX Duplex
Data Rate: 10 Gbps
CDR Reference Clock Frequency: 200 MHz
Number of CDR Reference Clocks: 1
Selected CDR Reference Clock: 0
Test Pattern: External PRBS31
Measurement Tool: Transceiver Toolkit (Quartus Prime 20.2)
Test Setup Description:
Our system consists of a carrier board, an FPGA board, and a passive loopback connector. The carrier board contains no active components. The FPGA board is populated with various discrete interface components—including SDRAM, SRAM, FLASH, power sequencers, oscillators, clock buffers, and clock generators—to provide all necessary interfaces for the FPGA.
The TX channels leaving the FPGA board are routed down to the carrier board, looped back through the loopback connector, and returned to separate RX channels on the FPGA. Note that the TX and RX channels of each looped pair are mostly located in different banks. The trace length of each channel along the loopback path ranges from 8 to 11 inches.
Reference Clock Architecture:
We generate a 50 MHz signal with an on-board oscillator, pass it through a low-jitter clock buffer, and feed the buffered output into a clock generator that produces the transceiver reference clocks. For the user clock, we apply the output of a separate 100 MHz oscillator directly to the CLKUSR pin.
Observed Behavior:
During our test campaigns, we have achieved BER results on the order of 1e-18 across many of the 36 looped-back transceivers. To minimize errors further, we selected pre-emphasis, CTLE, and DFE settings in the Transceiver Toolkit that yield zero observed errors.
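For reference when interpreting the BER figures above, here is a minimal sketch of the standard zero-error confidence bound that relates the number of error-free bits to a BER upper limit. It is only an illustration, not Transceiver Toolkit output: it assumes independent, random bit errors, and the function names are hypothetical.

```python
import math


def bits_for_ber(target_ber: float, confidence: float = 0.95) -> float:
    """Error-free bits needed to claim BER < target_ber at the given confidence.

    Assumes independent bit errors, so P(0 errors in N bits) = (1 - BER)^N,
    which is approximately exp(-N * BER) for small BER.
    """
    return -math.log(1.0 - confidence) / target_ber


def ber_upper_bound(bits_observed: float, confidence: float = 0.95) -> float:
    """BER upper limit implied by bits_observed error-free bits."""
    return -math.log(1.0 - confidence) / bits_observed


def seconds_for_ber(target_ber: float, line_rate_bps: float,
                    confidence: float = 0.95) -> float:
    """Error-free run time on one channel needed to support the BER bound."""
    return bits_for_ber(target_ber, confidence) / line_rate_bps


if __name__ == "__main__":
    line_rate = 10e9  # 10 Gbps, as in the configuration listed above
    for ber in (1e-12, 1e-15, 1e-18):
        t = seconds_for_ber(ber, line_rate)
        print(f"BER < {ber:.0e} at 95% confidence: {t:.3g} s error-free")
```

At 10 Gbps, a BER bound of 1e-12 at 95% confidence corresponds to roughly 300 s of error-free traffic per channel, and the required time scales inversely with the target BER.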
Critical Issue:
We are facing a significant inconsistency. When we test the 36 transceiver channels in four separate runs (9 channels per run), with the VGA, EQ Control, and DFE parameters already set, we observe no errors on any channel while cycling the die temperature from 60 °C to 90 °C and back down to 60 °C.
However, when we increase the number of channels with DFE adaptation enabled to between 12 and 15, we begin to observe errors on channels that previously showed none.
Questions:
What could be causing this degradation when the number of DFE‑adapted channels is increased? Are we exceeding some power, thermal, or resource limitation? Could there be an interaction between DFE‑enabled channels in adjacent banks or through the shared clocking/power distribution networks?
Could this issue be attributed to silicon-level crosstalk between the DFE-adapted channels? More specifically, could enabling a larger number of DFE-adapted transceivers introduce additional noise coupling or interference within the FPGA silicon that degrades the signal integrity of adjacent or nearby channels? If so, what approach would you recommend to isolate or mitigate such effects in our current design and test environment?
Is there also a known limit on the number of transceivers that can reliably run at 10 Gbps simultaneously across a wide temperature range? Any guidance on debugging or resolving this issue would be greatly appreciated.
Thank you in advance for your support.
Regards,
Onur