R-Tile Avalon Streaming PIPE Direct x16: Locks COM(K28.5) Symbols correctly but some lanes do not.

Question

Hello, I am implementing a custom soft PCIe/CXL link layer and LTSSM using R-Tile Avalon Streaming IP in PIPE Direct mode, configured as x16.

At the moment, link training does not reliably move forward because some lanes receive valid COM/K-code alignment, but the following ordered-set symbols are corrupted.

Environment

Device / board: AGIB027R29A
IP: R-Tile Avalon Streaming FPGA IP for PCI Express
Mode: PIPE Direct
Link width: x16
Current focus: Gen1 training / Polling / Configuration
Custom implementation:
- custom LTSSM
- custom symbol lock using COM (K28.5)
- custom TS1/TS2 decode logic

Symptom

In Polling.Active and Polling.Configuration, I can see that some lanes capture/decode TS1/TS2 correctly, but other lanes do not.

For example, in the attached SignalTap screenshot:

Lane 9 appears to decode the TS2 sequence correctly.
Lane 8 shows COM (K28.5) and PAD (K23.7) correctly, but the symbols after that are unstable / corrupted.

From the screenshot:

Lane 9 example:
- K28.5, K23.7, K23.7, D24.0, D30.0, D00.0, repeated
Lane 8 example:
- K28.5, K23.7 are visible,
- but the following TS2 fields fluctuate and do not remain valid/stable.
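
As a cross-check on the two lane examples above, the following is a minimal sketch of classifying one decoded 16-symbol Gen1 ordered set. It assumes the symbols are already 8b/10b-decoded into `(byte, is_k)` pairs and uses the PCIe Base Specification identifiers for symbols 6 to 15 (D10.2 for TS1, D5.2 for TS2, and their inverted forms D21.5 / D26.5, which indicate lane polarity inversion). The function name and return strings are hypothetical and are not part of the R-Tile IP.

```python
# Hypothetical helper: classify a decoded Gen1 training ordered set.
COM = (0xBC, True)   # K28.5
PAD = (0xF7, True)   # K23.7

# Identifier bytes expected in symbols 6..15 (PCIe Base Specification).
IDENTIFIERS = {
    0x4A: "TS1",            # D10.2
    0x45: "TS2",            # D5.2
    0xB5: "TS1 (inverted)",  # D21.5, polarity inversion on this lane
    0xBA: "TS2 (inverted)",  # D26.5, polarity inversion on this lane
}


def classify_ordered_set(symbols):
    """symbols: list of 16 (byte, is_k) tuples, starting at COM.

    Returns a label from IDENTIFIERS, or None if the set is not a
    well-formed TS1/TS2.
    """
    if len(symbols) != 16 or symbols[0] != COM:
        return None
    identifier = symbols[6:16]
    # Identifier symbols must all be data characters with the same value.
    if any(is_k for _, is_k in identifier):
        return None
    values = {byte for byte, _ in identifier}
    if len(values) != 1:
        return None
    return IDENTIFIERS.get(values.pop())


# Lane 9 pattern from the capture: COM, PAD, PAD, D24.0, D30.0, D00.0,
# followed by ten TS2 identifier symbols.
lane9 = [COM, PAD, PAD, (0x18, False), (0x1E, False), (0x00, False)]
lane9 += [(0x45, False)] * 10
print(classify_ordered_set(lane9))
```

A lane whose identifier symbols keep changing from one ordered set to the next returns `None` here, which separates "unstable data" from a stable but inverted pattern.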

So it looks like:

COM-based symbol lock is working at least partially
but after COM/PAD, the ordered-set contents on some lanes (which ones varies randomly) are corrupted before my soft IP can decode them correctly

To verify whether this was caused by my own logic, I captured the affected lanes directly in SignalTap using the first raw 10-bit RX data from the PIPE Direct IP (`ln*_pipe_direct_pipe_rxdata_o`), before any symbol lock/decoding stage in my soft IP. I searched for the COM symbol directly in this raw 10-bit stream and confirmed that the corruption is already present at the PIPE Direct IP output. So this does not appear to be caused by my combinational decode logic; the raw RX data delivered by the IP is already corrupted on those lanes.
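
The raw-stream COM search described above can be sketched offline on exported SignalTap samples. This is only an illustration: it assumes each captured word is one 10-bit value with the first-received bit (bit `a` of `abcdeifghj`) in bit 0, which gives 0x17C and 0x283 for the two running-disparity forms of K28.5. The function name is hypothetical, and the bit ordering should be checked against the PIPE Direct RX datapath figure before relying on it.

```python
# K28.5 in 10-bit form, assuming bit 'a' (first on the wire) is bit 0.
K28_5_RD_NEG = 0x17C  # abcdeifghj = 0011111010
K28_5_RD_POS = 0x283  # abcdeifghj = 1100000101


def find_com_offsets(words):
    """Return every bit offset at which a K28.5 pattern starts.

    words: raw 10-bit integers in capture order (hypothetical export
    of ln*_pipe_direct_pipe_rxdata_o[9:0]).
    """
    bits = []
    for word in words:
        bits.extend((word >> i) & 1 for i in range(10))
    hits = []
    for offset in range(len(bits) - 9):
        value = 0
        for i in range(10):
            value |= bits[offset + i] << i
        if value in (K28_5_RD_NEG, K28_5_RD_POS):
            hits.append(offset)
    return hits


# Aligned stream: COM at offsets 0 and 10 (both disparities).
print(find_com_offsets([0x17C, 0x283]))
# Same COM shifted by 3 bits across a word boundary.
print(find_com_offsets([0x3E0, 0x002]))
```

If every hit satisfies `offset % 10 == 0` on a good lane but not on a bad lane, the symbol boundary itself is moving on the bad lane rather than the data being corrupted after a correct lock.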

What I already checked

I already checked the following items carefully:

Gen1 rxdata interpretation
- I only decode valid 10-bit portions for Gen1
- I do not interpret the don't-care bits in rxdata[31:10] and rxdata[63:42]
rxdatavalid qualification
- TS decode / symbol shift only happens when rxdatavalid0/1 are valid
Sampling clock
- SignalTap capture is done in the corresponding lane RX clock domain
- not with a shared TX/fabric clock
Reset sequence
- pld_pcs_rst_n_i release is gated after per-lane tx_transfer_en_o
- I also reviewed cdrlock2data, reset_status_n, phystatus, powerdown sequencing
Deskew-related status
- active channels are detected
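
The first two checks in the list above can be modeled in a few lines for comparison against exported captures. This is a sketch under assumptions: the valid Gen1 fields are taken to be `rxdata[9:0]` and `rxdata[41:32]` (inferred from the don't-care ranges listed above), `rxdatavalid0` is assumed to qualify the lower field and `rxdatavalid1` the upper one, and the lower field is assumed to be earlier in time. The function name is hypothetical.

```python
def extract_gen1_symbols(rxdata, rxdatavalid0, rxdatavalid1):
    """Return the valid 10-bit symbols from one 64-bit rxdata sample.

    Bits [31:10] and [63:42] are ignored as don't-care for Gen1.
    Ordering (lower field first) is an assumption of this sketch.
    """
    symbols = []
    if rxdatavalid0:
        symbols.append(rxdata & 0x3FF)          # rxdata[9:0]
    if rxdatavalid1:
        symbols.append((rxdata >> 32) & 0x3FF)  # rxdata[41:32]
    return symbols


# Don't-care bits are deliberately filled with ones to show they are masked.
sample = (0x3FFFFF << 42) | (0x283 << 32) | (0x3FFFFF << 10) | 0x17C
print([hex(s) for s in extract_gen1_symbols(sample, 1, 1)])
```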

Current question

At this point, I suspect one of the following:

lane-specific analog/RX quality issue inside or before PIPE Direct output
lane-specific reset/power-up timing issue
internal alignment / deskew behavior that I am misunderstanding
some required PIPE Direct control/sideband setting that I am missing

What I would like to ask

In PIPE Direct x16 Gen1, if one lane shows valid K28.5 / K23.7 but the following TS2 symbols are corrupted, what should I check first on the R-Tile side?
Are there any lane-specific PMA / RX / PIPE Direct controls that should be reviewed for this symptom?
Is there any recommended way to determine whether this is:
- a true lane analog/RX problem,
- a deskew/alignment issue,
- or a reset/bring-up sequence issue?
Are there any known recommendations for validating lane integrity directly at the PIPE Direct output during Polling.Configuration?

wincent_altera · Answer

Hi Jayden,

Lane 9 appears to decode the TS2 sequence correctly. Lane 8 shows COM (K28.5) and PAD (K23.7) correctly, but the symbols after that are unstable / corrupted.

>> I see you are using x16. Are all 16 lanes occupied?
>> Is the problem limited to lane 8 and lane 9, with the other lanes behaving as you expect?
>> Have you tested in Gen2, or is this issue seen only in Gen1?

Which PIPE Direct mode are you trying to exercise: reset sequence or speed change? If I understand your case description correctly, I assume you are using the reset sequence. If so, please check the signal sequence example under https://docs.altera.com/r/docs/683501/25.1.1/r-tile-avalon-streaming-ip-for-pci-express-user-guide/pipe-direct-reset-sequence

Please ensure that this sequence is strictly followed.

Based on my experience and understanding, once the system enters the Polling.Configuration substate, a transmitter stops sending TS1s and starts sending TS2s, still with PAD set for the Link and Lane numbers. The purpose of switching from TS1s to TS2s is to advertise to the link partner that this device is ready to proceed to the next state in the state machine. It is a handshake mechanism that ensures both devices on the link proceed through the LTSSM together. Neither device can proceed to the next state until both devices are ready, and they advertise readiness by sending TS2 ordered sets. So once a device is both sending AND receiving TS2s, it knows it can proceed to the next state because both it and its link partner are ready.

However, I am not sure what is happening in your system; rechecking the reset sequence would be a good starting point for us.

Regards,
Wincent_Altera
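
The TS2 handshake described in the reply above can be sketched as a small per-lane tracker. The thresholds are taken from the PCIe Base Specification exit condition for Polling.Configuration (eight consecutive TS2s received with Link and Lane numbers set to PAD, and 16 TS2s transmitted after receiving one TS2). The class and method names are hypothetical, and the model is simplified to a single lane without timeouts.

```python
class PollingConfigTracker:
    """Simplified single-lane model of the Polling.Configuration exit check."""

    RX_CONSECUTIVE_REQUIRED = 8
    TX_AFTER_FIRST_RX_REQUIRED = 16

    def __init__(self):
        self.consecutive_rx_ts2 = 0
        self.seen_rx_ts2 = False
        self.tx_ts2_since_first_rx = 0

    def on_rx_ordered_set(self, is_ts2_with_pad):
        # A non-TS2 or corrupted ordered set breaks the consecutive run.
        if is_ts2_with_pad:
            self.consecutive_rx_ts2 += 1
            self.seen_rx_ts2 = True
        else:
            self.consecutive_rx_ts2 = 0

    def on_tx_ts2(self):
        # Transmitted TS2s only count once at least one TS2 was received.
        if self.seen_rx_ts2:
            self.tx_ts2_since_first_rx += 1

    def ready_for_configuration(self):
        return (self.consecutive_rx_ts2 >= self.RX_CONSECUTIVE_REQUIRED
                and self.tx_ts2_since_first_rx >= self.TX_AFTER_FIRST_RX_REQUIRED)


tracker = PollingConfigTracker()
for _ in range(8):
    tracker.on_rx_ordered_set(True)
for _ in range(16):
    tracker.on_tx_ts2()
print(tracker.ready_for_configuration())
```

In this model, a lane whose TS2 contents are intermittently corrupted keeps resetting `consecutive_rx_ts2`, so it never reports ready.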

jayden · Answer

Hello Wincent,

Thank you for the reply.

Yes, I am using the x16 configuration, and all 16 lanes are occupied. After the reset release sequence, the link proceeds to Polling.Active and Polling.Configuration on all 16 lanes. However, the problem is that the lanes that receive consecutive TS1/TS2 correctly are not stable. The set of "good" lanes changes every time I reboot the server with the FPGA card installed. In other words, the behavior looks very random.

So to answer your questions:

- Yes, I am using all 16 lanes.
- It is not only lane 8 and lane 9. The lanes that work and the lanes that fail change after each reboot.
- At the moment, I am testing only in Gen1. This issue is currently being observed during Gen1 link training. After this is solved, my first goal is to bring the link to L0 and then speed up to Gen4.

For the PIPE Direct configuration, I am using:

- PIPE Direct 16-channel
- 1x16, Octet 0 - 8 lanes, Octet 1 - 8 lanes
- Gen4 configuration, currently down-training and debugging in Gen1

Regarding your question about reset sequence or speed change: at the moment, I am focusing on the reset sequence path.

I already built an RTL simulation environment and connected my soft IP in the same way as in hardware. In RTL simulation, the reset sequence itself appears to complete normally without any issue.

However, in SignalTap, I sometimes see unrealistic behavior, for example pin_perst_n_o continuing to toggle, possibly due to setup timing violations around the TX clock domain. Because of that, it is difficult to trust SignalTap captures for validating the reset sequence directly.

Do you have any recommendation for how to debug or validate the reset sequence in this situation, when SignalTap itself may be showing unreliable behavior due to timing issues?

I also have one additional question. The waveform at the following link, PIPE Direct Reset Sequence, shows a Gen1 reset sequence, but even in Gen1, both ln0_pipe_direct_txdatavalid0_i and ln0_pipe_direct_txdatavalid1_i appear to go high. However, in Figure 48, the PIPE Direct TX Data Path for Gen1 seems to show a format where only txdatavalid0 toggles.

Because of this, I am not fully sure which behavior should be considered correct during the reset sequence for Gen1. Could you please clarify which one should be followed during the Gen1 reset sequence? Should I treat the reset-sequence waveform as the expected behavior, or should I follow the interpretation shown in Figure 48 for Gen1 TX datapath formatting?

Thank you again for your help.

Best regards,
Jayden

wincent_altera · Answer

Hi Jayden,

Referring to Figure 49, pin_perst should be asserted high at the beginning of the reset sequence.
Can you please share your timing report? I just want to see whether the violation is valid or not (a screenshot will do).

Regarding your question on whether txdatavalid should toggle or stay continuously high, let me double-check this on my side and get back to you shortly.

Regards,
Wincent_Altera

wincent_altera · Answer

Hi Jayden,

Regarding your question on Figure 48 vs. Figure 49: a user who needs to perform the reset sequence should follow Figure 49, while Figure 48 shows common practice and is for reference purposes only.

The reset release sequence diagram is primarily intended to show the reset release relationship between the various signals. For the actual data handling, the user should refer to the PIPE Direct TX Datapath and PIPE Direct RX Datapath figures.

On the TX side, tx_clkout is always at 500 MHz. Depending on which speed you are in, the valid data on the TX path varies between Gen1 and Gen5 and needs to toggle accordingly. On the RX path, rx_clkout varies per generation, so every clock cycle there is valid data on the _0 and _1 signals.

Hope this clarifies.

Regards,
Wincent
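
Because tx_clkout is fixed at 500 MHz, the average amount of new TX data per clock follows directly from the line rate. The arithmetic below is a rough sketch for the 8b/10b generations only (Gen1 and Gen2); it does not say how R-Tile distributes that data across txdatavalid0 and txdatavalid1, which should be taken from the PIPE Direct TX Datapath figure. The function name and parameters are hypothetical.

```python
def symbols_per_clock(line_rate_gtps, clk_mhz=500.0, bits_per_symbol=10):
    """Average number of 10-bit symbols consumed per tx_clkout cycle.

    Valid for 8b/10b generations only (Gen1 = 2.5 GT/s, Gen2 = 5.0 GT/s).
    """
    symbols_per_second = line_rate_gtps * 1e9 / bits_per_symbol
    return symbols_per_second / (clk_mhz * 1e6)


for name, rate in (("Gen1", 2.5), ("Gen2", 5.0)):
    print(name, symbols_per_clock(rate))
```

At Gen1 the lane consumes 250 million symbols per second, which is half a symbol per 500 MHz clock on average, so the valid indication cannot stay high on every cycle while carrying one new symbol each time.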

jayden · Answer

Hello Wincent,

Thank you for the clarification.

I have a few follow-up questions to make sure we correctly implement the reset sequence based on Figure 49.

1. TX data and txdatavalid behavior during the Figure 49 reset sequence

My understanding is that the reset sequence is performed at Gen1 speed. If we follow Figure 49 exactly, should we keep both txdatavalid_0 and txdatavalid_1 asserted to 1, and drive TS1 Ordered Sets on txdata[9:0] during the reset release sequence? Or, even during the reset sequence, should txdatavalid_0 / txdatavalid_1 follow the Gen1 valid-data pattern described in the PIPE Direct TX Datapath Figure 48? I am still confused.

As far as I know, after the reset sequence is completed successfully, link training should start from Detect and then proceed toward L0 according to the LTSSM (described in the figure above), where TS1/TS2 Ordered Sets are transmitted. I am not clear at which exact point or condition we should transition from the Figure 49 reset-sequence behavior to the normal PIPE Direct TX Datapath behavior described in Figure 48 / the PIPE datapath figures. Could you clarify the expected transition timing or condition?

2. Relationship between TX data transmission and cdrlockstatus assertion

In Figure 49, it looks like there is a timing relationship between points a through h/i/j/k, where tx_clkout becomes available, powerdown changes to P0 (2'b00), and TX data can be driven, and points l/m/n, where cdrlockstatus is asserted and then reset_status_n_o becomes 1. From the timing diagram, it appears that cdrlockstatus becomes 1 after TX data transmission has already started.

However, when I observe the current behavior using SignalTap, cdrlockstatus is already asserted to 1 on all lanes before TX data starts being transmitted. (I'm afraid the screenshot I have added might not be very clear, but you can probably see that cdrlockstatus is 1 for lanes 0~8 while txelecidle is 0xF for all lanes, and no TX valid or TX data is sent, since the phystatus_o pulse has not been seen for all active lanes (lanes 0~8, x8 config).)

Is this an abnormal condition? Or is it expected behavior, and the Figure 49 timing diagram is only showing a conceptual relationship rather than a strict timing dependency?

3. Validation request to check whether this is board-specific

Lastly, I would like to understand whether this behavior could be related to our board environment. If I provide the .sof file, the .stp file, and a simple guide mentioning a set of sticky/debug signals that make the behavior easy to observe, would it be possible for you to run the same validation on your board and check whether the same issue is reproduced?

The main purpose is simply to confirm whether pin_perst_n_o also briefly goes low and then returns high during the reset sequence on your side, or whether this behavior is only observed on our board. This would help us determine whether the issue is related to the IP/reset sequence itself or something specific to our board/system environment.

Best regards,
Jayden
