R-Tile Avalon Streaming PIPE Direct x16: all lanes lock to COM (K28.5), but some lanes receive corrupted symbols after COM/PAD
Hello, I am implementing a custom soft PCIe/CXL link layer and LTSSM using the R-Tile Avalon Streaming IP in PIPE Direct mode, configured as x16.
At the moment, link training does not reliably move forward: on some lanes the COM/K-code alignment looks valid, but the ordered-set symbols that follow are corrupted.
Environment
- Device / board: AGIB027R29A
- IP: R-Tile Avalon Streaming FPGA IP for PCI Express
- Mode: PIPE Direct
- Link width: x16
- Current focus: Gen1 training / Polling / Configuration
- Custom implementation:
  - custom LTSSM
  - custom symbol lock using COM (K28.5)
  - custom TS1/TS2 decode logic
Symptom
In Polling.Active and Polling.Configuration, I can see that some lanes capture and decode TS1/TS2 correctly, but others do not.
For example, in the attached SignalTap screenshot:
- Lane 9 appears to decode the TS2 sequence correctly.
- Lane 8 shows COM (K28.5) and PAD (K23.7) correctly, but the symbols after that are unstable / corrupted.
From the screenshot:
- Lane 9 example:
  - K28.5, K23.7, K23.7, D24.0, D30.0, D00.0, repeated
- Lane 8 example:
  - K28.5, K23.7 are visible,
  - but the following TS2 fields fluctuate and do not remain valid/stable.
So it looks like:
- COM-based symbol lock is working at least partially
- but after COM/PAD, the ordered-set contents on some lanes (which lanes are affected appears random) are corrupted before my soft IP can decode them
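For reference, the per-lane check I am effectively applying can be sketched offline in Python on symbols that have already been 8b/10b decoded. This is only a minimal sketch: the `(is_k, byte)` tuple format and the function name are my own illustration, and polarity-inverted identifiers are not handled. It assumes the standard Gen1 layout of a 16-symbol training set with COM in symbol 0 and the TS1/TS2 identifier (D10.2 / D5.2) repeated in symbols 6 to 15.

```python
# Decoded symbols are modelled as (is_k, byte) tuples; this format is an
# assumption made for the sketch, not something produced by the IP.
COM = (True, 0xBC)      # K28.5
PAD = (True, 0xF7)      # K23.7
TS1_ID = (False, 0x4A)  # D10.2
TS2_ID = (False, 0x45)  # D5.2


def classify_ordered_set(symbols):
    """Return 'TS1' or 'TS2', or a short description of the first problem found."""
    if len(symbols) != 16:
        return "need exactly 16 symbols"
    if symbols[0] != COM:
        return "symbol 0 is not COM"
    ident = symbols[6]
    if ident == TS1_ID:
        kind = "TS1"
    elif ident == TS2_ID:
        kind = "TS2"
    else:
        return "symbol 6 is not a TS1/TS2 identifier"
    # Symbols 7..15 must repeat the identifier seen in symbol 6.
    for pos in range(7, 16):
        if symbols[pos] != ident:
            return f"{kind} identifier breaks at symbol {pos}"
    return kind


# Header as seen on Lane 9: COM, PAD, PAD, D24.0, D30.0, D00.0
lane9 = [COM, PAD, PAD, (False, 0x18), (False, 0x1E), (False, 0x00)] + [TS2_ID] * 10
print(classify_ordered_set(lane9))
```

Running the same check on Lane 8 captures would show at which symbol position the set first stops being valid.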
To verify whether this was caused by my own logic, I captured the affected lanes in SignalTap using the raw 10-bit RX data taken directly from the PIPE Direct IP (`ln*_pipe_direct_pipe_rxdata_o`), before any symbol lock/decoding stage in my soft IP. I searched for the COM symbol directly in this raw 10-bit stream and confirmed that the corruption is already present at the PIPE Direct IP output. So this does not appear to be caused by my combinational decode logic; the raw RX data delivered by the IP is already corrupted on those lanes.
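The raw-stream COM search described above can be reproduced offline on an exported capture with something like the following Python sketch. It assumes that bit 0 of each captured 10-bit word is the first bit received (8b/10b bit "a"); if the capture uses the opposite bit order, the words need to be reversed first. The function names are illustrative only.

```python
# K28.5 in 8b/10b, written in transmission order "abcdei fghj".
K28_5_RD_NEG = "0011111010"
K28_5_RD_POS = "1100000101"


def words_to_bits(words, width=10):
    """Flatten captured words into one bit string.

    Assumption: bit 0 of each word was received first, so each word is
    emitted LSB-first.
    """
    return "".join(format(w, f"0{width}b")[::-1] for w in words)


def find_com_offsets(words, width=10):
    """Return (bit_index, disparity) for every K28.5 found at any bit offset."""
    bits = words_to_bits(words, width)
    hits = []
    for i in range(len(bits) - 9):
        window = bits[i:i + 10]
        if window == K28_5_RD_NEG:
            hits.append((i, "RD-"))
        elif window == K28_5_RD_POS:
            hits.append((i, "RD+"))
    return hits


# With the bit order assumed above, 0x17C is K28.5 RD- and 0x283 is K28.5 RD+.
print(find_com_offsets([0x17C, 0x000, 0x283]))
```

Because the search runs at every bit offset, it does not depend on any symbol lock logic, which is what makes it useful for ruling out the downstream decode stage.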
What I already checked
I already checked the following items carefully:
- Gen1 `rxdata` interpretation
  - I only decode valid 10-bit portions for Gen1
  - I do not interpret the don't-care bits in `rxdata[31:10]` and `rxdata[63:42]`
- `rxdatavalid` qualification
  - TS decode / symbol shift only happens when `rxdatavalid0/1` are valid
- Sampling clock
  - SignalTap capture is done in the corresponding lane RX clock domain
  - not with a shared TX/fabric clock
- Reset sequence
  - `pld_pcs_rst_n_i` release is gated after per-lane `tx_transfer_en_o`
  - I also reviewed `cdrlock2data`, `reset_status_n`, `phystatus`, `powerdown` sequencing
- Deskew-related status
  - active channels are detected
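To make the first two checks concrete, this is roughly how I post-process exported capture rows in Python. It is a sketch under stated assumptions: the valid Gen1 portions are taken to be rxdata[9:0] and rxdata[41:32] (the complement of the don't-care ranges listed above), rxdatavalid0 is assumed to qualify the lower portion and rxdatavalid1 the upper one, and the lower portion is assumed to come first in time. The row format and function name are illustrative.

```python
def gen1_symbols(samples):
    """Yield 10-bit symbols from (rxdata, valid0, valid1) capture rows.

    Assumptions for this sketch: valid0 qualifies rxdata[9:0], valid1
    qualifies rxdata[41:32], and the lower portion is earlier in time.
    All other bits of rxdata are ignored.
    """
    for rxdata, valid0, valid1 in samples:
        if valid0:
            yield rxdata & 0x3FF            # rxdata[9:0]
        if valid1:
            yield (rxdata >> 32) & 0x3FF    # rxdata[41:32]


rows = [
    # Don't-care bits [31:10] are deliberately set to 1 to show they are masked.
    (0x17C | (0x3FFFFF << 10) | (0x283 << 32), 1, 1),
    (0x0AA, 1, 0),           # only the lower portion is valid
    (0x155 << 32, 0, 1),     # only the upper portion is valid
    (0x3FF, 0, 0),           # nothing valid, row is skipped
]
print([hex(s) for s in gen1_symbols(rows)])
```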
Current question
At this point, I suspect one of the following:
- lane-specific analog/RX quality issue inside or before PIPE Direct output
- lane-specific reset/power-up timing issue
- internal alignment / deskew behavior that I am misunderstanding
- some required PIPE Direct control/sideband setting that I am missing
What I would like to ask
- In PIPE Direct x16 Gen1, if one lane shows valid K28.5 / K23.7 but the following TS2 symbols are corrupted, what should I check first on the R-Tile side?
- Are there any lane-specific PMA / RX / PIPE Direct controls that should be reviewed for this symptom?
- Is there any recommended way to determine whether this is:
  - a true lane analog/RX problem,
  - a deskew/alignment issue,
  - or a reset/bring-up sequence issue?
- Are there any known recommendations for validating lane integrity directly at the PIPE Direct output during Polling.Configuration?
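One check I can run on my side to help separate an alignment problem from bit errors is to look at the bit phase of every COM found in the raw stream of a lane. The Python sketch below takes the bit positions at which K28.5 was found and reports their position modulo 10. The interpretation is my own heuristic, not something taken from the R-Tile documentation: a single phase would suggest the 10-bit boundary is stable and the payload is being hit by bit errors, while several phases would suggest the boundary is moving during the capture.

```python
from collections import Counter


def com_phase_histogram(com_bit_indices, symbol_bits=10):
    """Count how often COM starts at each bit phase within a symbol."""
    return Counter(i % symbol_bits for i in com_bit_indices)


def alignment_is_stable(com_bit_indices, symbol_bits=10):
    """True when at least one COM was found and all COMs share one bit phase."""
    return len(com_phase_histogram(com_bit_indices, symbol_bits)) == 1


# Hypothetical bit positions: back-to-back 16-symbol sets are 160 bits apart.
steady = [3, 163, 323]
slipped = [3, 163, 326]   # last COM arrives 3 bits late
print(alignment_is_stable(steady), com_phase_histogram(steady))
print(alignment_is_stable(slipped), com_phase_histogram(slipped))
```

Using the phase rather than the COM-to-COM distance keeps the check valid when ordered sets of a different length appear between training sets.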