PHe
2 hours ago

Agilex 5E ES Memory Performance Issues

Setup

We observed significant performance issues during sequential memory reads on the HPS.

The target device is the A5ED065BB32AE6SR0 on the premium dev kit, running the GSRD example.

Sysbench was used to benchmark the memory performance.

Test Results

For comparison, we also ran the test on an STM32 system (dual-core Arm Cortex-A7 at 800 MHz) and on the host PC (Ryzen 7 CPU).

Test                    Agilex 5E ES    STM32MP157F    Host PC, Ryzen 7
T0 (sequential read)    480 MiB/s       290 MB/s       78972 MiB/s
T1 (sequential write)   4058 MiB/s      190 MB/s       44749 MiB/s
T2 (random read)        67 MiB/s        373 MB/s       3461 MiB/s
T3 (random write)       52 MiB/s        372 MB/s       3608 MiB/s

The test cases were executed as follows:

T0: sysbench --num-threads=1 --time=10 memory --memory-block-size=4MiB --memory-total-size=64GiB --memory-access-mode="seq" --memory-hugetlb=off --memory-oper=read run

T1: sysbench --num-threads=1 --time=10 memory --memory-block-size=4MiB --memory-total-size=64GiB --memory-access-mode="seq" --memory-hugetlb=off --memory-oper=write run

T2: sysbench --num-threads=1 --time=10 memory --memory-block-size=4MiB --memory-total-size=64GiB --memory-access-mode="rnd" --memory-hugetlb=off --memory-oper=read run

T3: sysbench --num-threads=1 --time=10 memory --memory-block-size=4MiB --memory-total-size=64GiB --memory-access-mode="rnd" --memory-hugetlb=off --memory-oper=write run
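The four invocations above differ only in the access mode and the operation, so they can be generated from a loop. The following is a minimal sketch, not the script used for the measurements; the `DRY_RUN` variable and the function names `build_cmd` and `run_all` are made up for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: generate the T0..T3 sysbench invocations from the two varying options.
# DRY_RUN is a hypothetical switch: 1 (default) prints only, 0 also executes.
DRY_RUN="${DRY_RUN:-1}"

build_cmd() {
    # $1 = access mode (seq|rnd), $2 = operation (read|write)
    echo "sysbench --num-threads=1 --time=10 memory" \
         "--memory-block-size=4MiB --memory-total-size=64GiB" \
         "--memory-access-mode=$1 --memory-hugetlb=off --memory-oper=$2 run"
}

run_all() {
    i=0
    for mode in seq rnd; do
        for oper in read write; do
            cmd=$(build_cmd "$mode" "$oper")
            echo "T$i: $cmd"
            if [ "$DRY_RUN" = "0" ]; then
                # Intentionally unquoted so the shell splits the string into arguments.
                $cmd
            fi
            i=$((i + 1))
        done
    done
}

run_all
```

Running it with `DRY_RUN=0` executes the four tests in the order T0 to T3.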

Observations

Sequential read vs. write: The sequential read operation on the Agilex 5E ES is about 8.5x slower than the sequential write operation (480 MiB/s vs. 4058 MiB/s). On the other two systems the relationship is reversed: sequential read is the faster operation, at roughly 1.5x to 1.8x the write throughput.
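For reference, the sequential read/write ratios can be recomputed from the T0 and T1 rows of the table. This is only a small arithmetic sketch; the helper name `ratio` is made up for illustration, and it assumes `awk` is available on the system.

```shell
#!/bin/sh
# Sketch: read/write throughput ratio per system, using the T0/T1 table values.
ratio() {
    # $1 = read throughput, $2 = write throughput (same unit for both)
    awk -v r="$1" -v w="$2" 'BEGIN { printf "%.2f\n", r / w }'
}

echo "Agilex 5E ES read/write: $(ratio 480 4058)"
echo "STM32MP157F read/write:  $(ratio 290 190)"
echo "Host PC read/write:      $(ratio 78972 44749)"
```

A ratio below 1 means reads are slower than writes. Only the Agilex 5E ES falls in that range.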

We found two possibly related entries in the errata for the ES devices:

  1. Degraded HPS EMIF performance with 2MB L3 Cache: https://docs.altera.com/r/docs/825514/current/agilextm-5-es-device-errata-and-user-guidelines/degraded-hps-emif-performance-with-2mb-l3-cache
  2. HPS EMIF read throughput less than target: https://docs.altera.com/r/docs/825514/current/agilextm-5-es-device-errata-and-user-guidelines/hps-emif-read-throughput-less-than-target

The workaround for item 1 is to set the L3 cache to a size other than 2 MB. However, this did not improve performance in any way.

For the second erratum, no workaround is listed.

Question

Is the "HPS EMIF read throughput less than target" errata entry the primary cause of the degraded sequential read performance?

If confirmed, is this issue resolved in the production Agilex 5 devices, and what performance improvement can we expect?

1 Reply

  • Hello PHe,

    You are correct that the performance degradation in the ES devices is due to the erratum you mentioned. This erratum has been addressed in the production devices, and the performance is greatly improved. We have included a benchmark example in the 26.1 Golden System Reference Design (HPS Baseline Example Design) that you can use to run standard benchmarks and see the performance.

    Regards,

    Sue