Your code isn't really easy to follow, but I see that you in fact have 4 different registers for each port, so you shouldn't read back from the same address that you write to. If I understand this correctly, to write to output 0 you need to write to register 1, but to read back input 0 you need to read register 0.
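To make the addressing concrete, here is a rough C sketch of what I mean. The layout is only my reading of your code: I'm assuming the 4 registers of each port sit at consecutive word offsets, with register 0 as the input and register 1 as the output. The names `reg_offset`, `write_output` and `read_input` are made up for illustration, and `base` stands for wherever your component is mapped.

```c
#include <stdint.h>

/* Assumed layout (not confirmed): 4 registers per port,
   register 0 = input, register 1 = output. */
enum { REGS_PER_PORT = 4, REG_INPUT = 0, REG_OUTPUT = 1 };

/* Word offset of a given register of a given port (hypothetical helper). */
static unsigned reg_offset(unsigned port, unsigned reg)
{
    return port * REGS_PER_PORT + reg;
}

/* Drive an output: writes go to the port's output register. */
static void write_output(volatile uint32_t *base, unsigned port, uint32_t value)
{
    base[reg_offset(port, REG_OUTPUT)] = value;
}

/* Sample an input: reads come from the port's input register,
   which is a different address from the one written above. */
static uint32_t read_input(volatile uint32_t *base, unsigned port)
{
    return base[reg_offset(port, REG_INPUT)];
}
```

With this layout, `write_output(base, 0, v)` touches word offset 1 while `read_input(base, 0)` touches word offset 0, so reading back the address you just wrote would not return the input.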
Anyway, I suggest putting some SignalTap probes in your component to figure out what is happening; it is difficult to tell just by looking at the code.
Is there a reason why you need so many pipelining stages to read back the input ports?