--- Quote Start ---
means that reg_in, in1 and in2 must be same width?
--- Quote End ---
No, that's not required. I first thought it would be, because VHDL wouldn't allow this assignment, but Verilog does. The actual explanation is much simpler: in1 is assigned to the part of reg_in that is never read. You have to double the width of reg_out to see the in1 data.
I'm not sure whether you're familiar with how a concatenation works. With DATA_WIDTH = 4, it creates a bit vector of length 8. If you only read the lower 4 bits, in1 is ignored.
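To make this concrete, here is a minimal simulation sketch. It is not your design: the module name concat_demo and the signals wide and narrow are made up for the example, and I'm assuming the concatenation is written as {in1, in2}, so that in1 lands in the upper half.

```verilog
// Hypothetical testbench, only meant to show the truncation effect.
module concat_demo;
  localparam DATA_WIDTH = 4;

  reg  [DATA_WIDTH-1:0]   in1, in2;

  // Full-width target: holds all 8 bits of the concatenation.
  wire [2*DATA_WIDTH-1:0] wide   = {in1, in2};

  // Narrow target: Verilog silently drops the upper bits (in1).
  wire [DATA_WIDTH-1:0]   narrow = {in1, in2};

  initial begin
    in1 = 4'hA;
    in2 = 4'h5;
    #1 $display("wide=%h narrow=%h", wide, narrow);
    // prints: wide=a5 narrow=5
  end
endmodule
```

The 4-bit target only ever sees in2, which matches what you observe. Most synthesis tools and linters flag this kind of width mismatch with a truncation warning, so it's worth checking the messages.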