Perhaps I meant 'save much power' ...
You could also stall CPU execution inside a multi-cycle custom instruction.
That gives you two 32-bit input values to play with, plus a result.
One could be a bit-mask of IRQ bits (or similar), the other a maximum clock count. The result could be the bit(s) that woke the system up.
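As a rough sketch of what that interface might look like from C (not a tested design): the opcode index WAIT_IRQ_N and the names wait_for_irq, wait_for_irq_model and demo_irq_lines are all made up for illustration. The host-side model only mimics the behaviour the custom instruction logic would have to implement; on real hardware the stall happens inside the instruction itself, not in a software loop.

```c
#include <stdint.h>

/* Hypothetical custom instruction index, chosen only for illustration. */
#define WAIT_IRQ_N 0

#if defined(__nios2__)
/* On a Nios II GCC toolchain: dataa = IRQ bit-mask, datab = maximum clock
 * count, result = the bit(s) that ended the stall (0 on timeout).
 * Assumes custom logic with this behaviour exists at index WAIT_IRQ_N. */
static inline uint32_t wait_for_irq(uint32_t irq_mask, uint32_t max_clocks)
{
    return (uint32_t)__builtin_custom_inii(WAIT_IRQ_N, (int)irq_mask,
                                           (int)max_clocks);
}
#endif

/* Callback that reports the raw IRQ lines at a given clock cycle. */
typedef uint32_t (*irq_sampler)(uint32_t cycle);

/* Host-side model of the assumed hardware behaviour: hold the CPU until a
 * masked IRQ bit is set or max_clocks cycles have elapsed. */
static uint32_t wait_for_irq_model(uint32_t irq_mask, uint32_t max_clocks,
                                   irq_sampler sample)
{
    for (uint32_t cycle = 0; cycle < max_clocks; cycle++) {
        uint32_t hit = sample(cycle) & irq_mask;
        if (hit)
            return hit;   /* the bit(s) that woke the system up */
    }
    return 0;             /* clock count expired with no masked IRQ */
}

/* Made-up stimulus: IRQ 0 asserts from cycle 2, IRQ 3 from cycle 5. */
static uint32_t demo_irq_lines(uint32_t cycle)
{
    uint32_t lines = 0;
    if (cycle >= 2)
        lines |= 1u << 0;
    if (cycle >= 5)
        lines |= 1u << 3;
    return lines;
}
```

A zero result doubles as the timeout indication here, which only works if the mask is never expected to match with no bits set; a real design might want a separate status bit.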
Quite possibly both of these stalls suspend the clock to almost all of the Nios CPU, removing the instruction fetch and all the associated signal changes.
(hmmm... if you disable dynamic branch prediction and execute 'foo: bxx foo' from tightly coupled instruction memory then almost nothing is likely to change!)