Unfortunately the "extra" instruction after the ldhuio is not due to the macro.
The gcc compiler shipped with Nios II 1.0 and Nios II 1.0.1 doesn't recognize that
the 8-bit and 16-bit unsigned load instructions (I/O or normal) already zero the
upper bits of the register, so it emits redundant "extra" andi instructions.
I've reported this to our compiler engineer as an enhancement request.
I believe this will be fixed in Nios II 1.1 (due out late November) when we
upgrade to a new version of gcc.
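To illustrate why the andi is redundant, here is a minimal host-side sketch in plain C. It is not Nios II code and not the macro from the original question: read_u16, read_u16_masked and mask_is_redundant are hypothetical names made up for this example. The first function stands in for a zero-extending 16-bit load such as ldhuio, and the second adds the explicit mask that corresponds to the extra andi.

```c
#include <stdint.h>

/* Hypothetical helper: stands in for a 16-bit unsigned load.
   Converting uint16_t to uint32_t zero-extends, so bits 31..16 are 0. */
static uint32_t read_u16(const volatile uint16_t *reg)
{
    return *reg;
}

/* Same read with an explicit mask. The "& 0xFFFF" plays the role of the
   extra andi and can never change the result. */
static uint32_t read_u16_masked(const volatile uint16_t *reg)
{
    return (uint32_t)*reg & 0xFFFFu;
}

/* Exhaustive check over every 16-bit value: returns 1 if the masked and
   unmasked reads always agree, 0 otherwise. */
static int mask_is_redundant(void)
{
    for (uint32_t v = 0; v <= 0xFFFFu; ++v) {
        volatile uint16_t reg = (uint16_t)v;
        if (read_u16(&reg) != read_u16_masked(&reg))
            return 0;
    }
    return 1;
}
```

Because the two reads agree for all inputs, a compiler that tracks zero-extension is free to drop the mask. The extra instruction costs a cycle but does not affect correctness.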
As for the warning in the documentation about not using bit 31 to bypass the data cache,
I had them put that in so that a possible future MMU option for Nios II would be easier to add.
Being the CPU architect, it's my job to plan ahead.
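To make the bit 31 warning concrete, here is a small sketch of the address arithmetic involved. It uses plain integers on a host rather than real pointers, and the names CACHE_BYPASS_BIT, bypass_alias and uses_bypass_bit are hypothetical, chosen only for this example. It shows the trick the documentation discourages, plus a check that could be used to spot addresses that depend on it.

```c
#include <stdint.h>

/* Bit 31 of a 32-bit address (assumed name, for illustration only). */
#define CACHE_BYPASS_BIT 0x80000000u

/* The discouraged trick: form an alias of addr with bit 31 set. */
static uint32_t bypass_alias(uint32_t addr)
{
    return addr | CACHE_BYPASS_BIT;
}

/* Returns 1 if addr has bit 31 set, 0 otherwise. */
static int uses_bypass_bit(uint32_t addr)
{
    return (addr & CACHE_BYPASS_BIT) != 0;
}
```

Code that bakes this alias into its addresses assumes bit 31 will never carry any other meaning, which is exactly the assumption a future MMU option might break. The I/O load and store instructions (ldhuio and friends) bypass the data cache without changing the address, so they avoid the problem.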