I think the HAL interrupt entry sequence is pretty good. I had a look at it and couldn't do much better than the compiler has[1]. If you want to write interrupt handlers in C, you need to save every register that the C compiler may corrupt; otherwise your system won't work.
I assume you are measuring the time for interrupt 0 since that's fastest. By my calculations the current version should take about 59 cycles (I may be off by a few) to get to the start of the user routine for interrupt 0 (assuming cache hits each time).
So if you are measuring much more (you don't say how you are measuring), then the slowdown is probably due to memory latencies, both for icache fills and for saving registers onto the stack. This is why it's so important to put your exception handler and stack into fast on-chip memories.
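To compare a measurement against the 59-cycle estimate, it can help to convert cycles to time and see how many cycles are left unexplained. The following is only a rough sketch: the helper names are made up for illustration, and the 50 MHz clock and the measured value of 90 cycles mentioned below are assumptions, not figures from your system.

```c
/* Convert a cycle count to nanoseconds for a CPU clock given in Hz. */
double cycles_to_ns(unsigned cycles, double clock_hz)
{
    return (double)cycles * 1e9 / clock_hz;
}

/* Cycles measured beyond the estimate; these are the cycles most likely
 * spent waiting on memory (icache fills, stack writes). Returns 0 if the
 * measurement is at or below the estimate. */
unsigned excess_cycles(unsigned measured, unsigned estimated)
{
    return measured > estimated ? measured - estimated : 0;
}
```

With an assumed 50 MHz clock, cycles_to_ns(59, 50e6) gives 1180 ns. If you measured, say, 90 cycles, excess_cycles(90, 59) reports 31 cycles that are probably memory stalls rather than instructions in the entry sequence.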
Once you've done that it may be possible to rewrite the interrupt handler in assembler to get even lower latencies for one interrupt. Here is my suggestion for a faster handler for interrupt 0. I think it will take 10 cycles to get to the point where you start saving registers. If you do this then you'll also have to change the C handler to handle interrupt 0 differently.
But if you're worrying this much about interrupt latencies then you also need to worry about the time for which interrupts are disabled, either by foreground code or because another interrupt is being handled.
.globl save_r2
// Define in a C file somewhere as: unsigned int save_r2;
alt_irq_entry:
rdctl et, estatus
andi et, et, 1
beq et, zero, software_exception
rdctl et, ipending
andi et, et, 1
beq et, zero, notirq0
// 10 cycles to get here
stw r2, %gprel(save_r2)(gp)
// Save as many other registers as you need. Your assembler code
// is allowed to use the registers you save here and et. If you're
// really careful you could use bt and ba as well but then you won't
// be able to debug this code.
// Your assembler code goes here
// Restore other saved registers
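ldw r2, %gprel(save_r2)(gp)
// This was a hardware interrupt, so back ea up by one instruction
// to make eret re-execute the instruction that was interrupted.
addi ea, ea, -4
eret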
notirq0:
rdctl et, ipending
beq et, zero, software_exception
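
The change to the C handler mentioned above could look roughly like the sketch below. This is a host-side model, not the HAL's actual code: irq_table, FAST_IRQ_MASK, dispatch_slow_irqs and count_call are hypothetical names, and on the target the pending mask would come from the ipending control register. The idea is simply that the C dispatcher must never call a handler for interrupt 0, even if it becomes pending after the assembler check, because the assembler path owns it.

```c
#include <stdint.h>
#include <stddef.h>

#define NUM_IRQS      32
#define FAST_IRQ_MASK 0x1u   /* bit 0: owned by the assembler fast path */

typedef void (*irq_handler_t)(void *context);

struct irq_entry {
    irq_handler_t handler;
    void *context;
};

/* Hypothetical handler table, indexed by interrupt number. */
static struct irq_entry irq_table[NUM_IRQS];

/* Call the registered handler for every pending interrupt except
 * interrupt 0. Returns the mask of interrupts that were dispatched. */
uint32_t dispatch_slow_irqs(uint32_t pending)
{
    uint32_t dispatched = 0;

    pending &= ~FAST_IRQ_MASK;           /* never touch interrupt 0 here */
    for (unsigned i = 1; i < NUM_IRQS && pending != 0; i++) {
        uint32_t bit = (uint32_t)1 << i;
        if ((pending & bit) && irq_table[i].handler != NULL) {
            irq_table[i].handler(irq_table[i].context);
            dispatched |= bit;
        }
        pending &= ~bit;                 /* done with this bit either way */
    }
    return dispatched;
}

/* Demo handler: increments the unsigned counter passed as context. */
void count_call(void *context)
{
    ++*(unsigned *)context;
}
```

If interrupt 0 is still pending when this returns, it is taken again as soon as interrupts are re-enabled and goes through the fast path.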
[1] I did manage to save about 5 clocks so this will be better in a future release.