Forum Discussion

Heyang
New Contributor
2 months ago

Delay in CDR recovery

Hi friends,

I hope this message finds you well.

I am currently encountering some issues with the burst-mode CDR implementation using ALTGX.

Test setup:

In my setup, I am using an optical modulator to generate burst-mode optical signals. The burst-mode light (on/off) is sent directly to an SFP+ module mounted on a Terasic HSMC SFP daughter board, connected to a DE4 (Stratix IV) development board.

Each burst frame consists of approximately 109 µs of light on and 109 µs off. To properly decode the received signals, the transceiver is supplied with a reference clock (`rx_cruclk`) that is phase-recovered from the transmitter. The ALTGX CDR is switched between `rx_locktoref` and `rx_locktodata` in synchronization with the optical signal transitions:

* When the signal is **off**: lock to `rx_cruclk`
* When the signal is **on**: lock to data
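
The switching rule above can be written out as a small truth table. This is only a Python model of the control logic, not HDL. It assumes the manual-mode convention in which `rx_locktodata = 1` selects lock-to-data and `rx_locktoref = 1` with `rx_locktodata = 0` selects lock-to-reference; the names `cdr_controls` and `control_schedule` are hypothetical.

```python
def cdr_controls(light_on):
    """Return (rx_locktoref, rx_locktodata) for the current burst state.

    Assumption: rx_locktodata = 1 means lock-to-data (LTD), and
    rx_locktoref = 1 with rx_locktodata = 0 means lock-to-reference (LTR).
    """
    if light_on:
        return (0, 1)  # light present: lock to data
    return (1, 0)      # light absent: lock to rx_cruclk


def control_schedule(n_frames, on_us=109.0, off_us=109.0):
    """List (time_us, rx_locktoref, rx_locktodata) at every light transition."""
    schedule = []
    t = 0.0
    for _ in range(n_frames):
        schedule.append((t, *cdr_controls(True)))   # burst starts
        t += on_us
        schedule.append((t, *cdr_controls(False)))  # burst ends
        t += off_us
    return schedule


if __name__ == "__main__":
    for entry in control_schedule(2):
        print(entry)
```

In hardware the same decision would be driven by whatever signal marks the optical transitions; the model only makes the intended mapping explicit.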

**Observed issue:**

Switching between `rx_locktoref` and `rx_locktodata` does help the CDR lock within the 109 µs burst duration, but the locking time is longer than expected. It takes around **1.3 µs** for the CDR and word aligner to fully synchronize with the received data stream. This is much slower than anticipated, considering that the reference clock provided to the ALTGX is quite stable, and the phase difference between `rx_cruclk` and the transmitter clock fluctuates within only about 3 degrees.
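
To put the 1.3 µs figure in proportion, the following sketch computes what fraction of each 109 µs burst is spent locking, and roughly how many bit periods elapse in that time. The burst length and lock time are the values quoted above; the line rate is not stated in this post, so `ASSUMED_LINE_RATE_BPS` is a hypothetical placeholder to be replaced with the real value.

```python
BURST_ON_S = 109e-6    # light-on duration per frame (from the post)
LOCK_TIME_S = 1.3e-6   # observed CDR + word-aligner lock time (from the post)
ASSUMED_LINE_RATE_BPS = 6.25e9  # ASSUMPTION: placeholder, not from the post


def lock_overhead_fraction(lock_time_s, burst_s):
    """Fraction of the burst consumed before data is recovered."""
    return lock_time_s / burst_s


def bits_elapsed(lock_time_s, line_rate_bps):
    """Number of bit periods that pass while the receiver is still locking."""
    return round(lock_time_s * line_rate_bps)


if __name__ == "__main__":
    frac = lock_overhead_fraction(LOCK_TIME_S, BURST_ON_S)
    print(f"lock overhead: {frac * 100:.2f} % of each burst")
    print(f"bit periods during lock: {bits_elapsed(LOCK_TIME_S, ASSUMED_LINE_RATE_BPS)}")
```

With the quoted numbers the overhead is a little over 1 % of each burst, so whether 1.3 µs is acceptable depends mainly on the preamble length the protocol can afford.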

We also experimented with manually delaying the start of the CDR locking process to better align it with the beginning of each burst frame, but the locking time consistently remained around 1.3 µs. Additionally, we tested by launching a continuous optical signal into the SFP while still switching between `rx_locktoref` and `rx_locktodata`. In this configuration, the CDR synchronized to the incoming data within a few nanoseconds, as expected.

We suspect that when the optical signal is absent, the CDR inside the SFP module drifts, causing the output serial stream to deviate significantly. As a result, the ALTGX receiver cannot lock onto the drifted signal. We tested several SFP modules and selected the one that delivered the best performance, achieving approximately 1.3 µs locking time.

Would a 1.3 µs CDR locking time be expected under these conditions? Since our implementation uses a reference-assisted CDR with a phase-locked reference clock derived from the transmitter, we anticipated a faster locking response. The project is still ongoing, so I’m not yet certain whether I can share the source code, but please let me know if reviewing it would help with further debugging.

Thank you very much for your assistance—it’s greatly appreciated.

Best regards,

Hank

9 Replies

  • FvM
    Super Contributor

    Hi,

    I fear that neither the SFP module nor the RX PHY is designed to support fast locking to burst signals; it's just not a regular use case.

    • Heyang
      New Contributor

      Hi Frank,

      Thank you for your support again.

      For context, we are working on implementing a passive optical network (PON)-like optical switching scheme. This approach aligns with the description in an Altera note, which reports a CDR lock time of 267.5 ns in Table 1. The same note also states that “the Stratix IV and Stratix V families of FPGAs have integrated burst-mode CDR SERDES that can support a variety of optical transceivers available in the market.” This suggests that some of the Altera-provided resources could potentially be reused or adapted to enhance our current implementation.

      Given that our setup employs a highly stable reference clock, we believe it should be possible to achieve a lock time even shorter than 267.5 ns. However, we have not been able to locate a suitable example project or detailed documentation that demonstrates how the burst-mode CDR is implemented on Stratix IV devices with the integrated transceiver.

      Would you be able to advise where we might find the relevant reference designs or documentation? Your assistance would be greatly appreciated.

      Best regards,

      Hank

      • FvM
        Super Contributor

        Hi Hank,
        Thank you for explaining the PON burst-mode requirements. My personal optical network experience is limited to continuous fibre channels, e.g. 10G Ethernet.

        If the quoted Stratix IV sync time of about 300 ns is sufficient for your application and 1.3 µs is not, it sounds like you need a burst-mode-capable SFP+ 10G module. There are apparently some on the market, but I must confess they didn't occur to me until I explicitly searched for them.

        By the way, I see a maximum receiver settling specification of 800 ns in IEEE 802.3-2018, 75.5.2 "Receiver optical specifications", for 10G/10G EPON.

        Regards
        Frank

  • CheepinC_altera
    Regular Contributor

    Hi Hank,

    Thank you for reaching out. To make sure we’re aligned, I’d like to clarify a couple of points regarding your inquiry on the SIV CDR lock time:

    1. Are you currently using CDR auto-lock mode or manual mode?
    2. Have you had a chance to compare the observed CDR lock time with the specifications in the SIV device datasheet? Specifically:
       • tLTR, tLTR_LTD_Manual, tLTD_Manual, tLTD_Auto in Table 1–23: Transceiver Specifications for Stratix IV GX Devices
       • Figure 1–2: Lock Time Parameters for Manual Mode
       • Figure 1–3: Lock Time Parameters for Automatic Mode

    Please note that these timings refer to the duration required for the CDR to recover valid data and do not include the time for word alignment.

    Feel free to share any additional details or questions you may have. We’re here to help.

    • Heyang
      New Contributor

      Hi Cheepin,

      Thanks for your reply.

      The CDR is currently running in manual mode, meaning we switch between LTD and LTR depending on the optical signal availability.

      I’ve checked the parameters for CDR switching time in the specification, but they seem not directly applicable to our case because:

      • We are not toggling rx_analog_reset due to its long recovery delay.
      • Our frame size is 109 µs, so LTR remains high for about 109 µs in each frame gap, which is much longer than tLTR_LTD_manual.
      • The spec only lists the maximum tLTD_manual (4 µs) without a typical value.

      Here are our test observations:

      1. With continuous light and manual switching between LTR and LTD, data recovery occurs within ~30 ns (with a very stable reference clock).
      2. When turning the light on/off and switching LTR/LTD synchronously, data recovery takes about 1.3 µs.
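
As a rough cross-check, the two observations can be compared against the figures quoted in this thread: 267.5 ns from the Altera note, 800 ns from IEEE 802.3-2018 75.5.2, and the 4 µs maximum tLTD_manual. The sketch below is only bookkeeping. The comparison is not strictly like for like, since the 1.3 µs measurement includes word alignment while the datasheet timings do not. The dictionary and function names are hypothetical.

```python
# Reference figures as quoted in this thread, in nanoseconds.
REFERENCE_NS = {
    "Altera note, Table 1 (CDR lock time)": 267.5,
    "IEEE 802.3-2018 75.5.2 (max receiver settling)": 800.0,
    "Datasheet tLTD_manual (max)": 4000.0,
}

# Observed lock times from the two tests, in nanoseconds.
OBSERVED_NS = {
    "continuous light": 30.0,
    "burst light": 1300.0,
}


def within_limits(observed_ns, limits_ns):
    """Map each reference name to True if the observation does not exceed it."""
    return {name: observed_ns <= limit for name, limit in limits_ns.items()}


if __name__ == "__main__":
    for case, value in OBSERVED_NS.items():
        print(case, value, "ns")
        for name, ok in within_limits(value, REFERENCE_NS).items():
            print("   ", "OK  " if ok else "OVER", name)
```

Under these numbers the continuous-light case is inside every figure, while the burst case is inside only the 4 µs datasheet maximum.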

      In our tests, manually switching between LTR and LTD appears to break the CDR inside the XCVR, so it no longer locks to incoming data during LTR. When returning to LTD, it relocks within tens of nanoseconds — much faster than our previous results.

      Our hypothesis is that the SFP module’s internal CDR loses lock when the light drops, causing incorrect retimed data when the signal returns. This in turn may mislead the XCVR’s CDR. We recently purchased a customized SFP module with the onboard CDR bypassed at the factory. The new SFP module without CDR should help eliminate this issue. Does this make sense to you?

      By the way, could you share how long it typically takes for the word aligner to lock to correctly timed data? I assume it should be in the nanosecond range rather than microseconds. We are currently toggling the aligner enable (high–low–high) to reset it after setting the CDR back to LTD. Do you think this approach might extend the alignment time, or would you recommend a faster method?

      Thank you very much.

      Best regards,

      Hank

      • CheepinC_altera
        Regular Contributor

        Hi Hank,

        Thank you for sharing the details. Please see my comments below:

        We are not toggling rx_analog_reset due to its long recovery delay.
        [CP] The diagram shows signals after reset release. You likely do not need to toggle rx_analog_reset.

        Our frame size is 109 µs, so LTR remains high for about 109 µs in each frame gap, which is much longer than tLTR_LTD_manual.
        [CP] tLTR_LTD_manual specifies only a minimum value. As long as your duration exceeds that minimum, it should be fine.

        The spec only lists the maximum tLTD_manual (4 µs) without a typical value.
        [CP] Lock-to-data time can vary based on factors such as data rate, transition frequency, and signal integrity. This variability explains why a typical value is not provided: it depends on the specific use case.

        Our hypothesis is that the SFP module’s internal CDR loses lock when the light drops, causing incorrect retimed data when the signal returns. This in turn may mislead the XCVR’s CDR. We recently purchased a customized SFP module with the onboard CDR bypassed at the factory. The new SFP module without CDR should help eliminate this issue. Does this make sense to you?
        [CP] That seems plausible. The XCVR CDR will attempt to track incoming data when switching to LTD mode. If an incorrect data rate is presented, the CDR may lock incorrectly. My understanding is that there is typically a threshold: if the deviation is too large, the CDR will lose lock even in LTD mode.

        By the way, could you share how long it typically takes for the word aligner to lock to correctly timed data?
        [CP] This depends on your XCVR configuration. A functional simulation with your specific setup is recommended to estimate the number of parallel clock cycles required.

        We are currently toggling the aligner enable (high–low–high) to reset it after setting the CDR back to LTD. Do you think this approach might extend the alignment time, or would you recommend a faster method?
        [CP] I believe you are referring to rx_enapatternalign. Generally, this should be asserted only when a valid alignment pattern is present to avoid false alignment. Your method should not significantly increase alignment time, but you can confirm via simulation.

        Please let me know if you have further questions. Thank you.
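
For the word-aligner question above, a cycle count taken from simulation can be converted to time once the line rate and the deserializer width are known. The sketch below shows that conversion in Python. The 6.25 Gbps line rate, the 20-bit width, and the 8-cycle count are hypothetical placeholders, not values from this thread.

```python
def parallel_clock_period_ns(line_rate_gbps, serdes_width_bits):
    """Period of the recovered parallel clock: one word per cycle."""
    return serdes_width_bits / line_rate_gbps


def alignment_time_ns(cycles, line_rate_gbps, serdes_width_bits):
    """Convert a simulated cycle count into nanoseconds."""
    return cycles * parallel_clock_period_ns(line_rate_gbps, serdes_width_bits)


def cycles_available(budget_ns, line_rate_gbps, serdes_width_bits):
    """How many parallel clock cycles fit into a given time budget."""
    return budget_ns / parallel_clock_period_ns(line_rate_gbps, serdes_width_bits)


if __name__ == "__main__":
    # ASSUMPTIONS: placeholder configuration, replace with the real ALTGX settings.
    rate_gbps, width_bits, sim_cycles = 6.25, 20, 8
    print(f"{alignment_time_ns(sim_cycles, rate_gbps, width_bits):.1f} ns for {sim_cycles} cycles")
    print(f"{cycles_available(1300.0, rate_gbps, width_bits):.0f} cycles fit in 1.3 us")
```

If the simulated alignment takes only a handful of parallel clock cycles, it would account for tens of nanoseconds at most, which would point back at the CDR or the SFP module for the remainder of the 1.3 µs.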