Todd,
For a specialized application where you are certain byte access won't be required (code storage, a dedicated data buffer that you ONLY address as 32-bit words, DMA transfers, etc.), you should be okay. However, if you want compiled C code to run reliably, or you want to debug (as you mention), it's best to have byte access to the memory.
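To make the byte-access point concrete, here is a rough sketch of what a single byte store costs when the memory can only be written as whole 32-bit words. The helper name store_byte_word_only is made up for illustration, and the little-endian byte-lane layout is an assumption. A compiler will generally not generate this read-modify-write sequence for you; it simply emits a byte store and expects the memory to honor it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of any vendor API): store one byte into a
 * memory that only supports full 32-bit word accesses.
 * Assumption: little-endian byte lanes, i.e. byte address 0 maps to bits 7..0. */
static void store_byte_word_only(volatile uint32_t *mem, uint32_t byte_addr,
                                 uint8_t value)
{
    assert(mem != 0);

    uint32_t word_index = byte_addr >> 2;        /* which 32-bit word */
    uint32_t shift = (byte_addr & 3u) * 8u;      /* which byte lane in it */

    uint32_t word = mem[word_index];             /* read the whole word */
    word &= ~((uint32_t)0xFFu << shift);         /* clear the target byte */
    word |= (uint32_t)value << shift;            /* merge in the new byte */
    mem[word_index] = word;                      /* write the whole word back */
}
```

Every char or byte-wide struct member access in ordinary C would need this kind of treatment, which is why word-only memory is fine for a dedicated buffer but awkward for general program data.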
If you're REALLY on an I/O budget, the thing that comes to mind is reducing the number of data pins. This will cost you some performance, but it will help. Avalon's dynamic bus sizing feature means that it doesn't matter whether your external RAM is 8, 16, or 32 bits wide: a wide access (for example, a 32-bit access to 8-bit external memory) will be divided into four sequential accesses to the external device automatically by Avalon; see the Avalon Specification for additional detail on how this works. Again, the trade-off is that your throughput is reduced.
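As a software model of what dynamic bus sizing does for a 32-bit read from an 8-bit device, something like the following may help. This is only a sketch: the function name read32_from_8bit is invented, the real splitting is done in hardware by the Avalon fabric, and the byte ordering (lowest address supplies the low-order byte) is an assumption you should check against the Avalon Specification.

```c
#include <assert.h>
#include <stdint.h>

/* Model only: assemble one 32-bit word from four sequential 8-bit reads,
 * the way a wide access to a narrow device gets split up.
 * Assumption: the lowest byte address supplies bits 7..0 of the result. */
static uint32_t read32_from_8bit(const uint8_t *dev, uint32_t byte_addr)
{
    assert(dev != 0);

    uint32_t word = 0;
    for (unsigned i = 0; i < 4; i++) {
        /* one narrow access per iteration: four device cycles per word */
        word |= (uint32_t)dev[byte_addr + i] << (8u * i);
    }
    return word;
}
```

The loop count is where the throughput hit comes from: four device accesses per 32-bit word with 8-bit memory, or two with 16-bit memory.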