Hi,
Although two DCOH slices are configured, traffic from both CAFU AXI‑MM ports converges into shared traffic scheduling and routing logic inside the CXL IP. This logic is required by the CXL protocol and is not user configurable, and it ultimately limits D2H write bandwidth. The two AXI‑MM ports are upstream of the shared scheduler, so once the downstream scheduling resources and the link are saturated, adding more injection sources cannot increase bandwidth. DCOH slices provide pipeline parallelism and latency hiding. Because slice selection is dynamic and shared, adding DCOH slices improves utilization efficiency but not peak throughput.
Address decoding is not the bottleneck, so address interleaving does not help here.
For non‑cacheable traffic, the system is link‑limited, so bandwidth remains flat. The limit is not in the AXI‑MM ports or the number of DCOH slices.
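To make the saturation argument concrete, here is a minimal sketch of the bottleneck model. It assumes the delivered bandwidth is simply the smaller of the total offered load and the shared scheduler/link limit. The function name and all numbers are hypothetical and chosen only for illustration. They are not measured values or parameters of the CXL IP.

```python
def delivered_bandwidth(port_rates_gbps, shared_limit_gbps):
    """Delivered bandwidth when every port feeds one shared stage.

    Simplified model (an assumption, not the IP's actual behavior):
    the shared scheduler/link caps whatever the ports inject in total.
    """
    offered = sum(port_rates_gbps)
    return min(offered, shared_limit_gbps)


# Hypothetical figures for illustration only.
SHARED_LIMIT = 20.0  # cap of the shared scheduler and link
PORT_RATE = 20.0     # what a single AXI-MM port could inject on its own

one_port = delivered_bandwidth([PORT_RATE], SHARED_LIMIT)
two_ports = delivered_bandwidth([PORT_RATE, PORT_RATE], SHARED_LIMIT)

print(f"one port : {one_port} (model units)")
print(f"two ports: {two_ports} (model units)")
```

In this model a second port only helps while the total offered load is still below the shared limit. Once one port can already saturate it, the second port adds nothing, which matches the flat result seen for non‑cacheable traffic.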
For Cacheable Owned traffic, enabling both AXI‑MM ports adds contention in coherence handling, tag lifetime, and ordering enforcement, which reduces efficiency and causes bandwidth to decrease. This behavior is inherent to the CXL Type‑2 IP architecture and cannot be corrected by configuration, port usage, or address interleaving.
With the current architecture using a single Type‑2 endpoint, there is no direct way to further increase peak D2H bandwidth. Achieving higher bandwidth would require an architectural change, such as multiple CXL links or multiple Type‑2 endpoints. The AXI‑MM ports may still be used for traffic separation (for example, separating cacheable and non‑cacheable traffic), but not for bandwidth scaling. In short, the bandwidth limitation cannot be overcome by AFU‑side changes alone.
Regards,
Rong