I did it this way in a past project:
assign avs_s1_readdata = 24'hz;
(...)
.wb_rst_i (avc_c1_reset),
.wb_adr_i ({2'b0, avs_s1_address}),
.wb_dat_i (avs_s1_writedata),
.wb_dat_o (avs_s1_readdata),
.wb_we_i (avs_s1_write & ~avs_s1_read),
.wb_stb_i (avs_s1_chipselect & (avs_s1_write | avs_s1_read)),
.wb_cyc_i (avs_s1_chipselect),
.wb_ack_o (avs_s1_waitrequest_n),
If I remember correctly, I used the address remapping trick and a fake 32-bit width because of an Avalon bus bug. In fact, Avalon always makes 32-bit accesses, even if you use the 8- or 16-bit macros. For example, IORD_8DIRECT would actually make four 8-bit reads in order to complete a 32-bit bus word.
This can corrupt your data if you have a peripheral with FIFOs (as UARTs usually have) that pops a data byte on every read access to the rx register.
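To make the FIFO side effect concrete, here is a rough C model of the two cases. It is only a sketch and not HAL or fabric code: uart_model, slave_read, bus_read8_packed and bus_read8_remapped are made-up names, and it assumes that a byte read is widened to the whole aligned 32-bit word by issuing four 8-bit slave reads, with the rx register at slave address 0 and a status register at address 1.

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_DEPTH 8

/* Hypothetical 8-bit peripheral: register 0 is an RX FIFO (a read pops
   one byte), register 1 is a plain status register. */
typedef struct {
    uint8_t data[FIFO_DEPTH];
    int head;        /* index of the next byte to pop */
    int count;       /* bytes still queued */
    uint8_t status;  /* contents of register 1 */
} uart_model;

/* One 8-bit read as seen by the slave; reading register 0 has a side effect. */
uint8_t slave_read(uart_model *u, unsigned reg)
{
    assert(reg < 4);
    if (reg == 0) {
        if (u->count == 0)
            return 0;                      /* empty FIFO: nothing to pop */
        uint8_t b = u->data[u->head];
        u->head = (u->head + 1) % FIFO_DEPTH;
        u->count--;
        return b;
    }
    return (reg == 1) ? u->status : 0;
}

/* Assumed behaviour for a slave declared 8 bits wide: the byte read is
   completed as a full 32-bit word, so all four registers in the aligned
   group are read and only the requested lane is returned. */
uint8_t bus_read8_packed(uart_model *u, unsigned byte_addr)
{
    unsigned base = byte_addr & ~3u;
    uint8_t lanes[4];
    for (unsigned i = 0; i < 4; i++)
        lanes[i] = slave_read(u, base + i);
    return lanes[byte_addr & 3u];
}

/* Remapped case: the slave is declared 32 bits wide, so each register
   sits in its own word and one bus read is exactly one slave read. */
uint8_t bus_read8_remapped(uart_model *u, unsigned byte_addr)
{
    return slave_read(u, byte_addr >> 2);
}
```

In this model, reading the status register through bus_read8_packed (byte address 1) also pops one byte from the rx FIFO, while the same read through bus_read8_remapped (byte address 4) leaves the FIFO untouched.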
Cris