Actually, if you look at the timing trace you'll see that there is a relatively long time between the interrupt request and your ISR running.
Throw away the Altera interrupt code and write your own.
Even if you have to save/restore the registers it shouldn't take anywhere near the length of time that code takes.
If you can use an 'alternate register set' then you should be able to get interrupt entry/exit down to a few clocks.
(Even without one, you might manage to write an asm ISR that doesn't use any registers that the main code uses.)
I also presume you are not running the JTAG debugger at all - it wouldn't surprise me if that takes some interrupts of its own.
You might also want to look at adding a short FIFO (a couple of items) on the input and output - that way the system can tolerate a longer interrupt latency without losing data.
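To show what the FIFO buys you, here is a rough software model in C. It is only a sketch: the names (sample_fifo_t, fifo_push, fifo_pop), the 16-bit sample type and the depth of 4 are all made up for illustration, and in your design the FIFO would more likely sit in the FPGA logic next to the peripheral. The behaviour is the same either way: the producer can run ahead by up to the FIFO depth before anything is lost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed depth for illustration; must be a power of two for the index mask. */
#define FIFO_DEPTH 4u
static_assert((FIFO_DEPTH & (FIFO_DEPTH - 1u)) == 0, "FIFO_DEPTH must be a power of two");

/* Single-producer / single-consumer FIFO (hypothetical type name).
 * head and tail are free-running counters; their difference is the fill level. */
typedef struct {
    volatile uint32_t head;      /* advanced only by the producer */
    volatile uint32_t tail;      /* advanced only by the consumer */
    uint16_t data[FIFO_DEPTH];
} sample_fifo_t;

/* Producer side: returns false (and drops nothing already stored) when full. */
static bool fifo_push(sample_fifo_t *f, uint16_t value)
{
    if (f->head - f->tail == FIFO_DEPTH) {
        return false;            /* full: the consumer has fallen too far behind */
    }
    f->data[f->head & (FIFO_DEPTH - 1u)] = value;
    f->head = f->head + 1u;      /* publish only after the data is written */
    return true;
}

/* Consumer side: returns false when there is nothing to read. */
static bool fifo_pop(sample_fifo_t *f, uint16_t *out)
{
    if (f->head == f->tail) {
        return false;            /* empty */
    }
    *out = f->data[f->tail & (FIFO_DEPTH - 1u)];
    f->tail = f->tail + 1u;
    return true;
}
```

With a depth of N and one item arriving every T, the consumer can be late by roughly N x T before an item is dropped, so even two or three entries give a useful margin. The volatile counters are only adequate for one producer and one consumer on a single in-order core; anything more complicated would need proper barriers or locking.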