Two things contribute to those 10 ns, and each needs its own fix.
First, there's a considerable delay from the clock input pin to the flip-flop's register pin, due to the clock distribution tree.
This can be compensated by using a PLL in normal compensation mode: feed the input clock to the PLL and use the PLL's output clock to drive the logic.
Second, there's also a considerable delay from the internal registers' outputs to the output pins.
This can be greatly reduced by using the Fast Output Registers which sit in the IOB itself.
You can get Quartus to use them by setting proper output delay constraints or by using the Assignment Editor.
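
As a rough sketch of what both fixes might look like in the project files: the port names `clk_in` and `data_out`, the 20 ns period, and the delay values below are placeholders I made up for illustration, so substitute your own pins and the numbers from the receiving device's datasheet.

```tcl
# ---- Timing constraints (.sdc file) ----
# Assumed: a 50 MHz board clock arriving on a port named clk_in.
create_clock -name clk_in -period 20.000 [get_ports clk_in]

# Create the generated clocks for the PLL outputs automatically.
derive_pll_clocks

# Assumed external requirement: the downstream device needs the data
# 5 ns before its clock edge (max) and has no hold requirement (min).
set_output_delay -clock clk_in -max 5.0 [get_ports data_out]
set_output_delay -clock clk_in -min 0.0 [get_ports data_out]

# ---- Project assignment (.qsf file) ----
# Ask the fitter to pack the output register into the I/O cell.
# This is what the Assignment Editor writes for "Fast Output Register".
set_instance_assignment -name FAST_OUTPUT_REGISTER ON -to data_out
```

After recompiling, the clock-to-output report in the timing analyzer should show whether the register was actually placed in the I/O cell and how much of the 10 ns is left.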