
Jayden
1 day ago

R-Tile Avalon Streaming PIPE Direct x16: COM (K28.5) symbols lock correctly, but some lanes do not decode the symbols that follow

Hello, I am implementing a custom soft PCIe/CXL link layer and LTSSM using R-Tile Avalon Streaming IP in PIPE Direct mode, configured as x16.

At the moment, link training does not reliably move forward because some lanes receive valid COM/K-code alignment, but the following ordered-set symbols are corrupted.

Environment

  • Device / board: AGIB027R29A
  • IP: R-Tile Avalon Streaming FPGA IP for PCI Express
  • Mode: PIPE Direct
  • Link width: x16
  • Current focus: Gen1 training / Polling / Configuration
  • Custom implementation:
    • custom LTSSM
    • custom symbol lock using COM (K28.5)
    • custom TS1/TS2 decode logic

Symptom

In Polling.Active and Polling.Configuration, I can see that some lanes capture/decode TS1/TS2 correctly, but others do not.

For example, in the attached SignalTap screenshot:

  • Lane 9 appears to decode the TS2 sequence correctly.
  • Lane 8 shows COM (K28.5) and PAD (K23.7) correctly, but the symbols after that are unstable / corrupted.

From the screenshot:

  • Lane 9 example:
    • K28.5, K23.7, K23.7, D24.0, D30.0, D00.0, repeated
  • Lane 8 example:
    • K28.5, K23.7 are visible,
    • but the following TS2 fields fluctuate and do not remain valid/stable.

So it looks like:

  • COM-based symbol lock is working at least partially
  • but after COM/PAD, the ordered-set contents on a random subset of lanes are corrupted before my soft IP can decode them correctly
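
For reference, a Dx.y or Kx.y symbol name maps to the byte value (y << 5) | x, so D24.0 is 0x18, D30.0 is 0x1E and D00.0 is 0x00. The sketch below is a hypothetical offline checker (the names `symbol` and `check_ts2` are illustrative and not part of any R-Tile tooling). It assumes a standard Gen1 TS2 with PAD in the Link and Lane number fields and the TS2 identifier D5.2 (0x45) in symbols 6 to 15. It is meant to be run on symbol lists transcribed from a SignalTap export, to show which field of an ordered set went bad.

```python
# Minimal sketch: check one 16-symbol Gen1 TS2 as seen in Polling.Configuration.
# Assumption: symbols are already 8b/10b-decoded into (kind, byte) pairs.
COM = ("K", 0xBC)     # K28.5
PAD = ("K", 0xF7)     # K23.7
TS2_ID = ("D", 0x45)  # D5.2


def symbol(name):
    """Parse a name such as 'K28.5' or 'D24.0' into (kind, byte)."""
    kind = name[0].upper()
    x, y = name[1:].split(".")
    return kind, (int(y) << 5) | int(x)


def check_ts2(symbols):
    """Return a list of human-readable problems; an empty list means OK."""
    if len(symbols) != 16:
        return [f"expected 16 symbols, got {len(symbols)}"]
    problems = []
    if symbols[0] != COM:
        problems.append("symbol 0 is not COM (K28.5)")
    for i, field in ((1, "Link number"), (2, "Lane number")):
        if symbols[i] != PAD:
            problems.append(f"symbol {i} ({field}) is not PAD (K23.7)")
    for i, field in ((3, "N_FTS"), (4, "Data Rate ID"), (5, "Training Control")):
        if symbols[i][0] != "D":
            problems.append(f"symbol {i} ({field}) is not a data character")
    for i in range(6, 16):
        if symbols[i] != TS2_ID:
            problems.append(f"symbol {i} is not the TS2 identifier (D5.2)")
    return problems


# The Lane 9 pattern from the capture, completed with the TS2 identifier.
lane9 = [symbol(s) for s in
         ["K28.5", "K23.7", "K23.7", "D24.0", "D30.0", "D0.0"] + ["D5.2"] * 10]
print(check_ts2(lane9))
```

Running the same check on the Lane 8 capture should list the first symbol position at which the ordered set stops matching.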

To verify whether this was caused by my own logic, I captured the affected lanes directly in SignalTap using the first raw 10-bit RX data from the PIPE Direct IP (`ln*_pipe_direct_pipe_rxdata_o`), before any symbol lock/decoding stage in my soft IP. I searched for the COM symbol directly in this raw 10-bit stream and confirmed that the corruption is already present at the PIPE Direct IP output. So this does not appear to be caused by my combinational decode logic; the raw RX data delivered by the IP is already corrupted on those lanes.
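
The raw-stream search described above can be sketched offline as follows. This is only an illustration under stated assumptions: the 10-bit words are assumed to arrive in time order, and bit 0 of each word is assumed to be the first bit on the wire (pass `lsb_first=False` to try the opposite convention). The helper names are hypothetical. The code slides a 10-bit window over the concatenated bit stream and reports every position where either running-disparity form of K28.5 appears, then summarises the hits by bit offset modulo 10.

```python
from collections import Counter

# K28.5 in transmission order (abcdei fghj), both running disparities.
K28_5_RD_NEG = "0011111010"
K28_5_RD_POS = "1100000101"


def to_bitstream(words, width=10, lsb_first=True):
    """Concatenate raw words into a string of '0'/'1' in assumed wire order."""
    chunks = []
    for w in words:
        msb_first = format(w & ((1 << width) - 1), f"0{width}b")
        chunks.append(msb_first[::-1] if lsb_first else msb_first)
    return "".join(chunks)


def find_com_offsets(words, lsb_first=True):
    """Return every bit position where a full K28.5 code group starts."""
    stream = to_bitstream(words, lsb_first=lsb_first)
    return [i for i in range(len(stream) - 9)
            if stream[i:i + 10] in (K28_5_RD_NEG, K28_5_RD_POS)]


def offset_histogram(hits):
    """Count hits per bit offset within a 10-bit symbol."""
    return Counter(h % 10 for h in hits)


# Two aligned COMs, one per disparity (0x17C and 0x283 with bit 0 first).
hits = find_com_offsets([0x17C, 0x283])
print(hits, offset_histogram(hits))
```

On a healthy lane the histogram should collapse onto a single offset. Hits spread over several offsets, or an offset that changes during the capture, would suggest bit slips or corruption rather than a decode problem.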

What I already checked

I already checked the following items carefully:

  1. Gen1 rxdata interpretation
    • I only decode valid 10-bit portions for Gen1
    • I do not interpret the don't-care bits in rxdata[31:10] and rxdata[63:42]
  2. rxdatavalid qualification
    • TS decode / symbol shift only happens when rxdatavalid0/1 are valid
  3. Sampling clock
    • SignalTap capture is done in the corresponding lane RX clock domain
    • not with a shared TX/fabric clock
  4. Reset sequence
    • pld_pcs_rst_n_i release is gated after per-lane tx_transfer_en_o
    • I also reviewed cdrlock2data, reset_status_n, phystatus, powerdown sequencing
  5. Deskew-related status
    • active channels are detected
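
Items 1 and 2 of the checklist can be expressed as a small reference model for comparing against the RTL. This is a sketch, not the IP's documented behaviour: it assumes that `rxdatavalid0` qualifies `rxdata[9:0]`, that `rxdatavalid1` qualifies `rxdata[41:32]`, and that the lower slot is earlier in time. The function name `gen1_symbols` is hypothetical.

```python
MASK10 = 0x3FF  # ten valid bits per Gen1 symbol slot


def gen1_symbols(rxdata, valid0, valid1):
    """Extract the qualified 10-bit symbols from one 64-bit rxdata word.

    Bits [31:10] and [63:42] are treated as don't-care and masked away.
    """
    symbols = []
    if valid0:
        symbols.append(rxdata & MASK10)          # rxdata[9:0]
    if valid1:
        symbols.append((rxdata >> 32) & MASK10)  # rxdata[41:32]
    return symbols


# A word with garbage in the don't-care bits still yields clean symbols.
garbage = (0x3FFFFF << 42) | (0x3FFFFF << 10)
word = garbage | (0x283 << 32) | 0x17C
print([hex(s) for s in gen1_symbols(word, True, True)])
```

Feeding the same SignalTap samples through this model and through the soft symbol-lock input should give identical symbol streams; any difference points at the extraction or qualification logic rather than at the lane.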

Current question

At this point, I suspect one of the following:

  • lane-specific analog/RX quality issue inside or before PIPE Direct output
  • lane-specific reset/power-up timing issue
  • internal alignment / deskew behavior that I am misunderstanding
  • some required PIPE Direct control/sideband setting that I am missing

What I would like to ask

  1. In PIPE Direct x16 Gen1, if one lane shows valid K28.5 / K23.7 but the following TS2 symbols are corrupted, what should I check first on the R-Tile side?
  2. Are there any lane-specific PMA / RX / PIPE Direct controls that should be reviewed for this symptom?
  3. Is there any recommended way to determine whether this is:
    • a true lane analog/RX problem,
    • a deskew/alignment issue,
    • or a reset/bring-up sequence issue?
  4. Are there any known recommendations for validating lane integrity directly at the PIPE Direct output during Polling.Configuration?

1 Reply

    Wincent_Altera

    Hi Jayden,

    Lane 9 appears to decode the TS2 sequence correctly. Lane 8 shows COM (K28.5) and PAD (K23.7) correctly, but the symbols after that are unstable / corrupted.
    >> I see you are using x16. Are all 16 lanes in use?
    >> Is the problem limited to lanes 8 and 9, with the other lanes behaving as you expect?
    >> Have you tested at Gen2, or is this issue seen only at Gen1?
     

    Which PIPE Direct mode are you trying to exercise: reset sequence or speed change?
    If I understand your description correctly, you are using the reset sequence.
    If so, please check the signal sequence example at
    https://docs.altera.com/r/docs/683501/25.1.1/r-tile-avalon-streaming-ip-for-pci-express-user-guide/pipe-direct-reset-sequence
    Please make sure that this sequence is followed strictly.

    Based on my experience and understanding, once the system enters the Polling.Configuration substate, a transmitter stops sending TS1s and starts sending TS2s, still with PAD set for the Link and Lane numbers. The purpose of switching from TS1s to TS2s is to advertise to the link partner that this device is ready to proceed to the next state in the state machine. It is a handshake mechanism that ensures both devices on the link proceed through the LTSSM together. Neither device can proceed to the next state until both devices are ready, and they advertise readiness by sending TS2 ordered sets. So once a device is both sending AND receiving TS2s, it knows it can proceed to the next state because it is ready and its link partner is ready too.
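
To make the handshake concrete, here is a simplified Python model of the exit check. The thresholds are taken from my reading of the PCIe Base Specification (eight consecutive TS2s received with Link and Lane numbers set to PAD, and 16 TS2s transmitted after receiving one TS2), so please verify them against the spec revision you are targeting. The class and method names are hypothetical, and the model tracks a single lane only.

```python
class PollingConfigTracker:
    """Single-lane model of the Polling.Configuration exit condition."""

    RX_NEEDED = 8   # consecutive valid TS2s received (assumed from spec)
    TX_NEEDED = 16  # TS2s sent after the first TS2 was received (assumed)

    def __init__(self):
        self.rx_consecutive = 0
        self.seen_rx_ts2 = False
        self.tx_after_first_rx = 0

    def on_rx_ordered_set(self, is_valid_ts2):
        """Call once per received ordered set on this lane."""
        if is_valid_ts2:
            self.rx_consecutive += 1
            self.seen_rx_ts2 = True
        else:
            # A corrupted or non-TS2 set breaks the consecutive run.
            self.rx_consecutive = 0

    def on_tx_ts2(self):
        """Call once per transmitted TS2; counts only after the first RX TS2."""
        if self.seen_rx_ts2:
            self.tx_after_first_rx += 1

    def can_exit(self):
        return (self.rx_consecutive >= self.RX_NEEDED
                and self.tx_after_first_rx >= self.TX_NEEDED)


tracker = PollingConfigTracker()
for _ in range(8):
    tracker.on_rx_ordered_set(True)
for _ in range(16):
    tracker.on_tx_ts2()
print(tracker.can_exit())
```

With this model, a lane like Lane 8 whose TS2 fields fluctuate would keep resetting the consecutive-receive counter and would never satisfy the receive side of the handshake.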

    However, I am not sure what is happening in your system, so rechecking the reset sequence would be a good place for us to start.

    Regards,

    Wincent_Altera