Low CoreMark benchmark score on Cyclone V (ARM Cortex-A9)
Hi,
I'm using a slow speed grade (-C8) Cyclone V with HPS (single-core ARM Cortex-A9).
IC-5C SE B A4 U 19 C 8 S-TMP
Development is bare-metal (FreeRTOS), with MPL as the bootloader. All boot configuration is generated through Quartus.
When running a benchmark (CoreMark) to validate our configuration, we get a very low score of 380.
Running the same test on a dev board fitted with a dual-core part at 800 MHz (Linux) gives a score of 5000,
i.e. a single-core score of about 2500.
Scaling down to our 600 MHz (75%) yields an expected value of 1875.
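To make the estimate above explicit, here is a small sketch of the arithmetic. It assumes the score scales linearly with core count and clock frequency, which is only an approximation; the function names are made up for illustration.

```c
#include <assert.h>

/* Scale a multi-core reference score to a single core at another clock.
 * Assumes linear scaling with core count and frequency (an approximation). */
double expected_score(double ref_score, int ref_cores,
                      double ref_mhz, double target_mhz)
{
    assert(ref_cores > 0 && ref_mhz > 0.0);
    return ref_score / ref_cores * (target_mhz / ref_mhz);
}

/* CoreMark/MHz: a frequency-independent figure for comparing two setups. */
double coremark_per_mhz(double score, double mhz)
{
    assert(mhz > 0.0);
    return score / mhz;
}
```

With the numbers from this post, expected_score(5000, 2, 800, 600) gives 1875, i.e. 3.125 CoreMark/MHz, while the measured 380 at 600 MHz is only about 0.63 CoreMark/MHz. That is roughly a factor of 5 short of the estimate.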
We have reviewed the configuration and everything seems to be OK:
- NEON is on
- MMU is on (flat address space)
- branch prediction is on
- all caches are on
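For reference, a minimal sketch of how the settings in this list could be double-checked at run time rather than from the generated configuration. The bit positions are the ARMv7-A SCTLR ones (M, C, Z, I); the helper names are hypothetical, and read_sctlr() only builds for an ARM target and must run in a privileged mode. The L2 cache controller has its own enable and is not covered by this check, and neither are the cacheability attributes in the MMU tables.

```c
#include <stdbool.h>
#include <stdint.h>

/* ARMv7-A SCTLR bit positions. */
#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* data/unified cache enable */
#define SCTLR_Z (1u << 11)  /* branch prediction enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable */

#if defined(__arm__)
/* Read SCTLR through CP15 (privileged mode only). */
static inline uint32_t read_sctlr(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(v));
    return v;
}
#endif

/* True only if MMU, D-cache, I-cache and branch prediction are all enabled. */
bool sctlr_all_enabled(uint32_t sctlr)
{
    const uint32_t need = SCTLR_M | SCTLR_C | SCTLR_Z | SCTLR_I;
    return (sctlr & need) == need;
}
```

On the target, calling sctlr_all_enabled(read_sctlr()) just before the benchmark starts would show whether those bits are still set at that point, and not only right after boot.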
This particular benchmark is designed to test core performance (mostly arithmetic operations on small data sets). All code and data should therefore fit in cache, which should rule out the L3/L4 interconnect as a bottleneck. Even so, we have reviewed the clocking scheme, which looks like this:
MPU = 600 MHz
L3 MP = 150 MHz
L3 SP = 75 MHz
L4 MP/SP = 100 MHz
DDR = 400 MHz
I must be missing something very important here. Any ideas?