Forum Discussion

New Contributor

2 months ago

Solved

Agilex 7 R-Tile CXL IP: D2H write bandwidth does not scale with dual CAFU AXI-MM ports

Device: Agilex 7 I-Series AGI027
Software: Quartus Prime Pro 24.3
IP Core: CXL Type 2 IP
Issue Description: We are attempting to increase CXL Device-to-Host (D2H) write bandwidth by utilizing bot...

Cxl

Pcie

RongY_altera
1 month ago
Hi,

Although two DCOH slices are configured, traffic from both CAFU AXI-MM ports converges into shared traffic scheduling and routing logic inside the CXL IP. This logic is required by the CXL protocol and is not user configurable, and it ultimately limits D2H write bandwidth. The two AXI-MM ports sit upstream of the shared scheduler, so once the downstream scheduling resources and the link are saturated, adding injection sources cannot increase bandwidth. DCOH slices provide pipeline parallelism and latency hiding. Because slice selection is dynamic and shared, adding DCOH slices improves utilization efficiency but not peak throughput.

Address decoding is not the bottleneck, so address interleaving does not help here.

For Non-cacheable traffic, the system is link-limited, so bandwidth remains flat. The limit is not in the AXI-MM ports or in the number of DCOH slices.
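The saturation argument above can be sketched as a toy model: several injection ports feed one shared stage, and the deliverable bandwidth is the smaller of the total offered load and the shared stage's capacity. The function name and all numbers below are hypothetical and chosen only for illustration; they are not measured values or R-Tile specifications.

```python
def aggregate_bandwidth(port_rates_gbps, shared_cap_gbps):
    """Deliverable bandwidth when several ports feed one shared stage.

    port_rates_gbps: offered load per injection port (hypothetical values).
    shared_cap_gbps: capacity of the shared scheduler/link (hypothetical value).
    """
    offered = sum(port_rates_gbps)
    # The shared stage cannot forward more than its own capacity.
    return min(offered, shared_cap_gbps)


# Hypothetical numbers: one port alone already saturates the shared stage.
SHARED_CAP = 20.0
PORT_RATE = 25.0

single_port = aggregate_bandwidth([PORT_RATE], SHARED_CAP)
dual_port = aggregate_bandwidth([PORT_RATE, PORT_RATE], SHARED_CAP)

print(f"single port: {single_port} GB/s, dual port: {dual_port} GB/s")
```

In this model a second port only helps while the total offered load is still below the shared capacity, which matches the flat bandwidth described for link-limited traffic.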

For Cacheable Owned traffic, enabling both AXI-MM ports adds contention in coherence handling, tag lifetime, and ordering enforcement. This reduces efficiency and causes bandwidth to decrease. The behavior is inherent to the CXL Type-2 IP architecture and cannot be corrected by configuration, port usage, or address interleaving.

With the current architecture using a single Type-2 endpoint, there is no direct way to further increase peak D2H bandwidth. Achieving higher bandwidth would require an architectural change, such as multiple CXL links or multiple Type-2 endpoints. The AXI-MM ports may still be used for traffic separation (for example, separating cacheable and non-cacheable traffic), but not for bandwidth scaling. In short, AFU-side changes alone cannot lift the bandwidth limit.

Regards,

Rong

