A larger device will have more clock skew, as the distances to cross become longer. E.g. I see I/O clock skews of around 2.8 ns on a Stratix II GX EP2SGX60F1152C3 device. And unless you work out and assign optimal pin locations yourself, the compiler will do a 'lazy' (and quite often a 'lousy') job. If you back-annotate the pins you can see what the compiler produced.
The fitter works with or around clock skew, whichever is appropriate. Clock skew is inherent to fitting your design into a physical device, so you can't turn it off. By contrast, a functional simulation uses the result of synthesis only: there is no clock skew, but no notion of speed either.
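To make the effect of skew on timing concrete, here is a rough sketch of the usual register-to-register slack arithmetic. The function names and every delay value except the 2.8 ns skew are made up for illustration; skew is taken here as capture clock arrival minus launch clock arrival, and the real numbers come from your timing analyzer report.

```python
# Textbook setup/hold slack arithmetic, not a Quartus API.
# All delays are in nanoseconds and, apart from the 2.8 ns skew,
# are hypothetical values chosen only for illustration.

def setup_slack(period, t_co, t_data, t_su, skew):
    """Setup slack: positive skew (capture clock later) helps setup."""
    return period + skew - (t_co + t_data + t_su)

def hold_slack(t_co, t_data, t_h, skew):
    """Hold slack: positive skew (capture clock later) hurts hold."""
    return t_co + t_data - t_h - skew

# A 100 MHz path with an assumed 6.0 ns of logic and routing delay.
no_skew = setup_slack(period=10.0, t_co=0.5, t_data=6.0, t_su=0.3, skew=0.0)
bad_skew = setup_slack(period=10.0, t_co=0.5, t_data=6.0, t_su=0.3, skew=-2.8)

# A short path where the capture clock arrives 2.8 ns late.
short_path = hold_slack(t_co=0.5, t_data=1.0, t_h=0.2, skew=2.8)

print(f"setup slack, no skew:      {no_skew:.2f} ns")
print(f"setup slack, -2.8 ns skew: {bad_skew:.2f} ns")
print(f"hold slack,  +2.8 ns skew: {short_path:.2f} ns")
```

With these assumed numbers the same 2.8 ns eats most of the setup margin in one direction and produces a hold violation in the other, which is why the fitter sometimes has to route around skew rather than just live with it.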