Arria 10 (speed grade 3) has worse timing performance than Cyclone 10 GX (speed grade 5)
I am porting a design from a 10CX085YF67215G (Cyclone 10 GX, speed grade 5) to a 10AX022C4U19E3LG (Arria 10, speed grade 3). I am using Quartus Prime Pro Edition 21.3.
The design meets timing in the old Cyclone 10, but fails timing (worst setup slack = -0.590) in the more expensive Arria 10.
The timing failures are happening on a group of I/O pins (specifically, output pins). There are only setup timing failures (no hold failures, or anything else).
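For anyone following along, setup slack on an output path is the data required time minus the data arrival time, so any extra delay through the output I/O cell comes straight off the slack. Below is a minimal sketch of that arithmetic. The function names and all the numbers are hypothetical, chosen only for illustration. They are not taken from my Timing Analyzer reports.

```python
# Illustrative setup-slack arithmetic for an output path.
# All values are made-up picosecond figures, not real device data.

def arrival_time(launch_edge_ps, clock_delay_ps, clk_to_q_ps, io_cell_delay_ps):
    """Data arrival time at the output pin: launch edge plus every delay on the path."""
    return launch_edge_ps + clock_delay_ps + clk_to_q_ps + io_cell_delay_ps

def setup_slack(required_ps, arrival_ps):
    """Setup slack = data required time - data arrival time (negative means failure)."""
    return required_ps - arrival_ps

required = 5000  # hypothetical data required time at the pin

# Same path, differing only in the delay through the output I/O cell.
fast_io = setup_slack(required, arrival_time(0, 1000, 200, 3000))
slow_io = setup_slack(required, arrival_time(0, 1000, 200, 4500))

print(fast_io, slow_io)
```

With the path otherwise unchanged, a slower I/O cell turns a positive slack into a negative one by exactly the extra delay, which matches the kind of setup-only failure described above.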
I have compared the timing reports generated by Timing Analyzer (for the same process corner "Slow 900mV 100C Model"). It seems like the Arria 10 is just slower everywhere. For example, the screenshots below show worse timing through the I/O cell to the output pin (there are no routing delays - just delays through the primitive I/O cell).
Cyclone 10CX085YF67215G:
Arria 10AX022C4U19E3LG:
Is this expected? One of our reasons for upgrading to Arria 10 (with a better speed grade) was to improve timing performance. Why am I seeing significantly worse timing performance?
In the Arria 10 device, the pins with the timing failures are mapped to I/O bank 2A, which is an LVDS I/O bank.
Therefore, my understanding is that this bank should have decent GPIO timing performance (and certainly not worse than the Cyclone 10 device with speed grade 5).
As confirmed by @RichardTanSY_Altera, this is expected behavior. Although it is totally baffling that a more expensive device family, combined with a superior speed grade, leads to worse timing performance, I am satisfied that my question has been answered.