The timing analysis doesn't show errors with your code, but that only means the internal timing is correct. The problem lies with the external signals, e.g. read_en. You assert read_en at the falling edge of clk, and the resulting 5 ns setup time relative to the clk input is apparently not sufficient under all conditions.
Primarily, this is a simulation problem rather than a real synthesis problem. In a real design, read_en is either a clk-related signal, in which case it has known timing and can be covered by timing analysis, or it is an unrelated signal, in which case it must be registered before entering the synchronous logic.
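For the simulation side, one option is to let the testbench drive read_en the way a clk-related source would: changing shortly after the rising edge instead of at the falling edge. The following is only a sketch. The entity name, the 10 ns period (inferred from the 5 ns half period mentioned above) and the 2 ns clock-to-output delay are assumptions, not values from your design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical stand-alone testbench fragment; connect clk and read_en
-- to your own design under test.
entity read_en_stimulus_tb is
end entity read_en_stimulus_tb;

architecture sim of read_en_stimulus_tb is
  constant CLK_PERIOD : time := 10 ns;  -- assumed clock period
  constant T_CO       : time := 2 ns;   -- assumed clock-to-output delay of the read_en source
  signal clk     : std_logic := '0';
  signal read_en : std_logic := '0';
begin
  -- Free-running clock; run the simulation for a fixed time.
  clk <= not clk after CLK_PERIOD / 2;

  stimulus : process
  begin
    wait until rising_edge(clk);
    wait until rising_edge(clk);
    -- read_en changes T_CO after the rising edge, which leaves
    -- CLK_PERIOD - T_CO of setup time before the next rising edge.
    read_en <= '1' after T_CO;
    wait until rising_edge(clk);
    read_en <= '0' after T_CO;
    wait;  -- stimulus finished
  end process stimulus;
end architecture sim;
```

With these assumed numbers, read_en gets 8 ns of setup time instead of 5 ns. Whether that is sufficient still depends on the input delays in your device.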
P.S.: I'll try to explain by example why address isn't reset to 0 in this case:
--- Quote Start ---
if(readEN = '1' and first_read = '1') then
address <= "000000000000000";
first_read <= '0';
--- Quote End ---
The above expression (together with more code, of course) is translated into combinational logic that feeds several DFFs. This logic is composed of several logic elements with routing resources in between, and it involves a certain delay. Unfortunately, the delay varies between the individual logic terms. You have e.g. one DFF for first_read and 15 for address (one per bit of the quoted vector). If read_en is late relative to the rising edge of the clock, its assertion may arrive at the first_read DFF before the clock edge and at several or all address DFFs after the clock edge. If this happens, address won't be reset to zero; in the worst case, it is reset only partially and takes an arbitrary value. As far as the quoted condition goes, first_read has already been cleared by then, so the reset isn't attempted again on later clock edges.
If you register (or rather synchronize) read_en, you ensure that it arrives at all DFFs of the process in the same clock cycle, so the observed problem can't occur.
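Registering read_en could look roughly like the following two-stage synchronizer. This is a minimal sketch, not taken from your code: the entity name read_en_synchronizer, the output read_en_s and the internal signals meta and sync are hypothetical names.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-stage synchronizer for an external read_en signal.
entity read_en_synchronizer is
  port (
    clk       : in  std_logic;
    read_en   : in  std_logic;  -- external input, timing unknown relative to clk
    read_en_s : out std_logic   -- registered copy, safe to use in clk-synchronous logic
  );
end entity read_en_synchronizer;

architecture rtl of read_en_synchronizer is
  signal meta : std_logic := '0';  -- first stage, samples the external input
  signal sync : std_logic := '0';  -- second stage, feeds the synchronous logic
begin
  process (clk)
  begin
    if rising_edge(clk) then
      meta <= read_en;
      sync <= meta;
    end if;
  end process;

  read_en_s <= sync;
end architecture rtl;
```

The process with the first_read and address logic would then test read_en_s instead of the raw input. The price is two clock cycles of latency, but first_read and all address DFFs see the change on the same clock edge.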