Stratix 10 Transceiver PHY bonding and multiple TX PLL clock inputs (for DisplayPort TX)
We are using a Stratix 10 L-Tile/H-Tile Transceiver Native PHY to implement a DisplayPort TX and RX. The transceiver is set as Basic (Enhanced PCS), TX/RX Duplex. The TX PMA is set as Non bonded, with 2 TX PLL clock inputs. One of the TX PLL clock inputs is driven by a fPLL and used for rates from 1.62G up to 13.5G. The other clock input is driven by two ATX PLLs (one working as Main PLL, the other as GXT Clock Buffer) and used for 20G rate. This works ok. The problem is that DisplayPort requires controlled skew between the 4 TX channels. That is, we need bonding for the 4 TX channels. If we set the TX PMA as "PMA only bonding" it seems we cannot have anymore multiple TX PLL clock inputs but just a single one. How can we use the Stratix 10 PHY to implement DisplayPort TX rates from 1.62G up to 20G with 4 bonded channels (= 4 lanes)?Solved59Views0likes7CommentsCyclone 10GX PCIe / Raspberry Pi
Hi, We have a PCIe controller using the Cyclone 10Gx with the PCIe hard IP. It works when connected to a Windows system but it isn't getting detected when connected on a Raspberry Pi 5 or CM5 system. On the Pi, i see that the LTSSM transitions from Detect.Active to Polling.Active to Polling.Compliance to link down. I think this suggests that the Pi isn't detecting any device on the other end. I tried isolating the power on sequencing by hooking up an external power supply to the PCIe card, but it didn't help. Any guidance would be much appreciated. Thanks!68Views0likes8CommentsTransceiver data corruption
I am trying to externally loopback a simple data-stream using the GTS on the Agilex 5, over an external QSFP loopback module. The GTS is configured as followed: External clock chip: Outputs 156.25 MHz clock verified using an oscilloscope. System PLL: Outputs 125 MHz to the GTS. GTS: Basic PMA Direct System PLL freq: 125 MHz PMA speed: 1250 Mbps PMA width: 10 TX/RX PLL/CDR: 156.25 MHz TX/RX core interface FIFO: single width TX/RX clock: System PLL clock /1 The RTL used to transfer data over TX: module top( input CPU_RESET_n, input REFCLK, output gts_o_tx_serial_data, output gts_o_tx_serial_data_n, input gts_i_rx_serial_data, input gts_i_rx_serial_data_n ); // gts logic gts_pma_cu_clk_i; logic gts_tx_reset, gts_rx_reset; logic gts_tx_reset_ack, gts_rx_reset_ack; logic gts_tx_ready, gts_rx_ready; logic tx_coreclkin, rx_coreclkin; (* noprune *) logic gts_tx_clkout, gts_rx_clkout; logic gts_rs_grant_i; logic gts_rc_rs_req_o; (* noprune *) logic gts_tx_pll_locked /* synthesis keep */; (* noprune *) logic gts_rx_is_lockedtodata /* synthesis keep */; (* noprune *) logic gts_rx_is_lockedtoref /* synthesis keep */; logic o_refclk2core; (* noprune *) logic [79:0] gts_i_tx_parallel_data /* synthesis keep */; (* noprune *) logic [79:0] gts_o_rx_parallel_data /* synthesis keep */; assign gts_pma_cu_clk_i = srcss_bank1_pma_cu_clk_o; assign tx_coreclkin = gts_tx_clkout; assign rx_coreclkin = gts_rx_clkout; assign gts_rs_grant_i = srcss_bank1_rs_grant_o; // reset sequencer signals logic srcss_bank1_rs_grant_o; logic srcss_bank1_rs_priority; logic srcss_bank1_rc_rs_req; logic srcss_bank1_pma_cu_clk_o; assign srcss_bank1_rs_priority = '0; assign srcss_bank1_rc_rs_req = gts_rc_rs_req_o; // system pll signals logic gts_systempll_refclk_rdy; assign gts_systempll_refclk_rdy = 1'b1; gts_top u0 ( // gts .gts_top_clock_bridge_rx_in_clk_clk (QSFP_REFCLK_p), .gts_top_clock_bridge_tx_in_clk_clk (QSFP_REFCLK_p), .intel_directphy_gts_0_i_pma_cu_clk_clk (gts_pma_cu_clk_i), .intel_directphy_gts_0_i_tx_reset_tx_reset (gts_tx_reset), .intel_directphy_gts_0_i_rx_reset_rx_reset (gts_rx_reset), .intel_directphy_gts_0_o_tx_reset_ack_tx_reset_ack (gts_tx_reset_ack), .intel_directphy_gts_0_o_rx_reset_ack_rx_reset_ack (gts_rx_reset_ack), .intel_directphy_gts_0_o_tx_ready_tx_ready (gts_tx_ready), .intel_directphy_gts_0_o_rx_ready_rx_ready (gts_rx_ready), .intel_directphy_gts_0_i_tx_coreclkin_clk (tx_coreclkin), .intel_directphy_gts_0_i_rx_coreclkin_clk (rx_coreclkin), .intel_directphy_gts_0_o_tx_clkout_clk (gts_tx_clkout), .intel_directphy_gts_0_o_rx_clkout_clk (gts_rx_clkout), .intel_directphy_gts_0_i_src_rs_grant_src_rs_grant (gts_rs_grant_i), .intel_directphy_gts_0_o_src_rs_req_src_rs_req (gts_rc_rs_req_o), .intel_directphy_gts_0_o_tx_serial_data_o_tx_serial_data (gts_o_tx_serial_data), .intel_directphy_gts_0_o_tx_serial_data_n_o_tx_serial_data_n (gts_o_tx_serial_data_n), .intel_directphy_gts_0_i_rx_serial_data_i_rx_serial_data (gts_i_rx_serial_data), .intel_directphy_gts_0_i_rx_serial_data_n_i_rx_serial_data_n (gts_i_rx_serial_data_n), .intel_directphy_gts_0_o_tx_pll_locked_o_tx_pll_locked (gts_tx_pll_locked), .intel_directphy_gts_0_o_rx_is_lockedtodata_o_rx_is_lockedtodata (gts_rx_is_lockedtodata), .intel_directphy_gts_0_o_rx_is_lockedtoref_o_rx_is_lockedtoref (gts_rx_is_lockedtoref), .intel_directphy_gts_0_o_refclk2core_o_refclk2core (o_refclk2core), .intel_directphy_gts_0_i_tx_parallel_data_i_tx_parallel_data (gts_i_tx_parallel_data), .intel_directphy_gts_0_o_rx_parallel_data_o_rx_parallel_data (gts_o_rx_parallel_data), // reset sequencer signals .intel_srcss_gts_0_o_src_rs_grant_src_rs_grant (srcss_bank1_rs_grant_o), .intel_srcss_gts_0_i_src_rs_priority_src_rs_priority (srcss_bank1_rs_priority), .intel_srcss_gts_0_i_src_rs_req_src_rs_req (srcss_bank1_rc_rs_req), .intel_srcss_gts_0_o_pma_cu_clk_clk (srcss_bank1_pma_cu_clk_o), // system pll signals .intel_systemclk_gts_0_i_refclk_rdy_data (gts_systempll_refclk_rdy) ); // syncronise reset logic gts_tx_system_reset; altera_reset_synchronizer #( .ASYNC_RESET (1), .DEPTH (2) ) gts_tx_rst_sync ( .reset_in (~CPU_RESET_n), .clk (gts_tx_clkout), .reset_out (gts_tx_system_reset) ); // generate test data stream logic [7:0] counter; logic [7:0] test_stream; always_ff @(posedge gts_tx_clkout or posedge gts_tx_system_reset) begin if (gts_tx_system_reset) begin counter <= 8'b0; test_stream <= 8'b0; end else begin counter <= counter + 1; case (counter) 8'd0: test_stream <= 8'h3C; 8'd1: test_stream <= 8'h7F; 8'd2: test_stream <= 8'h11; 8'd3: test_stream <= 8'h07; default: test_stream <= 8'h00; endcase end end // detect and transform idle data, and mark control symbols logic [7:0] idle_data_transform; logic control_symbol_detect; always_comb begin idle_data_transform = (test_stream == 8'h00) ? 8'hBC : test_stream; control_symbol_detect = (idle_data_transform == 8'h1C) || (idle_data_transform == 8'h3C) || (idle_data_transform == 8'h5C) || (idle_data_transform == 8'h7C) || (idle_data_transform == 8'h9C) || (idle_data_transform == 8'hBC) || (idle_data_transform == 8'hDC) || (idle_data_transform == 8'hFC) || (idle_data_transform == 8'hF7) || (idle_data_transform == 8'hFB) || (idle_data_transform == 8'hFD) || (idle_data_transform == 8'hFE); end // pipline combinational logic to ensure timings are met logic [7:0] idle_data_transform_r; logic control_symbol_detect_r; always_ff @ (posedge gts_tx_clkout or posedge gts_tx_system_reset) begin if(gts_tx_system_reset) begin idle_data_transform_r <= 8'b0; control_symbol_detect_r <= 1'b0; end else begin idle_data_transform_r <= idle_data_transform; control_symbol_detect_r <= control_symbol_detect; end end // --- 8b/10b Encoding --- // https://libsv.readthedocs.io/en/latest/encoder_8b10b.html logic [9:0] encoded_out; logic code_error; encoder_tx encoder_inst ( .i_clk (gts_tx_clkout), .i_reset_n (~gts_tx_system_reset), .i_en (1'b1), .i_8b (idle_data_transform_r), .i_ctrl (control_symbol_detect_r), .o_10b (encoded_out), .o_code_err (code_error) ); // pipeline encoded outputs to ensure timing is met logic [9:0] encoded_out_r; always_ff @(posedge gts_tx_clkout or posedge gts_tx_system_reset) begin if (gts_tx_system_reset) encoded_out_r <= 10'b0; else encoded_out_r <= encoded_out; end // send data over TX logic data_path_rdy_tx; always_ff @(posedge gts_tx_clkout or posedge gts_tx_system_reset) begin if (gts_tx_system_reset) begin gts_i_tx_parallel_data <= 80'b0; data_path_rdy_tx <= 0; end else begin data_path_rdy_tx <= gts_tx_ready && gts_tx_pll_locked; case (data_path_rdy_tx) 1: gts_i_tx_parallel_data <= {1'b1, 39'b0, 1'b0, 1'b1, 28'b0, encoded_out_r}; 0: gts_i_tx_parallel_data <= 80'b0; endcase end end endmodule This RTL passes timing standalone, but when signal tap is used, it does produce warnings. In SignalTap I take the following measurments: Instance TX: data: gts_i_tx_parallel_data[79:0] clock domain: gts_tx_clkout Instance RX: data: gts_i_rx_parallel_data[79:0] clock domain: gts_rx_clkout The issue I am seeing is intermitted failures upon bitstream-re-configure: On the TX side, after the encoder has encoded, the TX data reads as folowed: (EXPECTED): ... 283, 17C, 283, 17C, 183, 335, 0B1, 347, 283, 17C, 283, 17C, ... This is the expected pattern on the RX side (post-framing) However, in my experiments so far, I have found that it only sometimes works: Here are the framing results after 5 different re-flashes: (FAILURE): ... 283, 17C, 283, 17C, 383, 135, 0B1, 347, 083, 37C, 283, 17C, ... (FAILURE): ... 283, 17C, 283, 17C, 383, 135, 0B1, 347, 083, 37C, 283, 17C, ... (FAILURE): ... 283, 17C, 283, 17D, 183, 335, 0B1, 346, 283, 17C, 283, 17C, ... (SUCCESS): ... 283, 17C, 283, 17C, 183, 335, 0B1, 347, 283, 17C, 283, 17C, ... (FAILURE): ... 283, 17C, 283, 175, 1B1, 307, 083, 37C, 283, 17C, 283, 17C, ... If anyone has any idea of what else to try, it would be much appreciated!120Views0likes7CommentsIO Standard for GTS Transceiver REFCLK
Hi, We use on the Agilex 5 and 3 Transceivers with AC coupling. Agilex 5 requires CML, HCSL in Agilex 5 datasheet. However, Agilex 5 Premium kit uses LVDS Clock for some Transceiver REFCLK. Is it possible to use LVDS as REFCLK for XCVRs when we use with AC Coupling ?Solved27Views0likes1CommentAltera Agilex3 vs Xilinx Zynq & AD9363
Hi I would like to know if using an Altera Agilex 3/5 for an RF design makes sense. Xilinx has many materials that make an engineer's job easier. Are there also materials for Agilex 3/5? See Xilinx (3 projects only, but there is much more...): -> https://www.youtube.com/watch?v=HDEegX9IoUg -> https://github.com/hz12opensource/libresdr -> https://wiki.analog.com/resources/eval/user-guides/adrv936x_rfsom We are evaluating to build something like this: -> https://wiki.analog.com/resources/eval/user-guides/adrv936x_rfsom Thanks!70Views0likes4CommentsBehavior of 10 GX Avalon-MM Interface for PCI Express* IP Core when byteenable=16'h0000
Hi, we are using an Arria® 10 and Cyclone® 10 GX Avalon® Memory-Mapped (Avalon-MM) Interface for PCI Express* IP Core. During our tests we noticed some illegal PCIe packages generated presumable due to a wrong data length. We could tackle down the problem to the following basic setup: avalon_mm_master => 128 bit bus => PCIe-Core When we send the following sequence (two words), we get an illegal/unexpected PCIe transfers/behavior: burstcount = 2, address = address_a, data = some_data, byteenable=16'h0000 burstcount = 1, address = address_a+16, data=some_data, byteenable=16'hFFFF When we only send the second word everything works fine. This sequence originally comes from a qsys autogenerated 256=>128 width change in the interconnect somewhere upstream in our project. My question is: Do we miss something here? Does the IP-Core not allow for a first word to be completely disabled? If so, is there any (automatic) way to tell qsys / the interconnect to discard a leading all_bytes_disabled word? 5.3. 64- or 128-Bit Bursting TX Avalon-MM Slave Signals Best regards, Michael102Views0likes6CommentsMissing documentation for R-Tile Avalon Streaming FPGA IP for PCI Express
I cannot access this documentation anymore. I get sent to this link, which just presents me with an error: https://docs.altera.com/r/docs/683501/current Does anyone know what to do to access the latest documentation?Solved42Views0likes2CommentsHow to Prevent Agilex 7 F-tile PMA Direct PHY TX Lane Skew
Hi, We are implementing a 16-lane custom 1.0125 Gbps TX with F-tile PMA Direct PHY IP. The 16 lanes should be bonded together and they need to have a skew of less than 0.05 UI. However, we simulated the TX and found the serial output of each lane had skew of up to 11 ns, despite the fact that they had sychronized parallel input data. We have followed the suggested settings in user guide to enable system bonding, such as Number of Lanes=16, PMA Width=16, and Selected tx_clkout clock source=Bond Clock. For IP port connection, we use tx_clkout[0] as the source of all 16 tx_coreclkin. We also use it to clock all 16 parallel input data of TX in FPGA core fabric to make sure they are synchronized. Our IP Parameter and waveform is as attched. In testbench we feed parallel input data to all 16 lanes at the same time. We also assert TX PMA Interface Data Valid bit [38]. However, the serial output of each lane seems to start at different point of time, creating skew. Is this model behavior or actual hardware behavior? We tested on development board and our target sink is unable to lock to TX serial output as expected. Is there any way to eliminate the skew? Below is quartus version and our target board. Quartus Version : 25.3 Target Board : AGIC040R39A2E2VR0 Thanks wentsung46Views0likes2Comments