You have made a classic synchronization error.
You have described a simple enabled flip-flop. The data and the enable are assumed synchronous to the clock.
Now, in your test bench, you create a clock, which is fine. But your test stimulus (the data and the enable) is generated asynchronously to that clock. Because you do a wait for 10 ns; between changes to your data, it looks like they change on the clock edge, but they really don't.
Rather than doing wait for 10 ns; put a wait until rising_edge(clock); before each data change:
    enable_process : process -- keeps the enable signal toggling every clock period
    begin
        wait until rising_edge(clk);
        enable <= '1';
        wait until rising_edge(clk);
        enable <= '0';
        wait until rising_edge(clk);
        enable <= '1';
        wait until rising_edge(clk);
        enable <= '0';
        -- etc
    end process enable_process;

    D_in_process : process -- steps the data input through a sequence of values
    begin
        wait until rising_edge(clk);
        wait until rising_edge(clk);
        wait until rising_edge(clk);
        wait until rising_edge(clk);
        wait until rising_edge(clk);
        D_in <= "001";
        wait until rising_edge(clk);
        D_in <= "010";
        wait until rising_edge(clk);
        D_in <= "011";
        wait until rising_edge(clk);
        D_in <= "100";
        wait until rising_edge(clk);
        D_in <= "101";
        wait until rising_edge(clk);
        D_in <= "110";
        wait until rising_edge(clk);
        D_in <= "111";
    end process D_in_process;
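If the long runs of repeated waits get unwieldy, the same data stimulus can probably be written with loops. Treat this as a sketch rather than a drop-in replacement: the entity name stimulus_loop_sketch, the 10 ns clock, and the done flag that stops the clock are my assumptions, not something taken from your code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical stand-alone test bench, only here to show the loop form.
entity stimulus_loop_sketch is
end entity stimulus_loop_sketch;

architecture sim of stimulus_loop_sketch is
  signal clk  : std_logic := '0';
  signal D_in : std_logic_vector(2 downto 0) := "000";
  signal done : boolean := false;  -- assumed flag to stop the clock
begin
  -- Assumed clock: 10 ns period, runs until the stimulus is finished.
  clk_gen : process
  begin
    while not done loop
      clk <= '0'; wait for 5 ns;
      clk <= '1'; wait for 5 ns;
    end loop;
    wait;  -- no more clock edges, so the simulation can end
  end process clk_gen;

  D_in_process : process
  begin
    -- Idle for five clock edges, as in the hand-written version.
    for i in 1 to 5 loop
      wait until rising_edge(clk);
    end loop;

    -- Drive "001" through "111", one new value per clock edge.
    for n in 1 to 7 loop
      D_in <= std_logic_vector(to_unsigned(n, D_in'length));
      wait until rising_edge(clk);
    end loop;

    done <= true;
    wait;  -- stop instead of repeating the sequence
  end process D_in_process;
end architecture sim;
```

Unlike the hand-written process, this one ends with a plain wait; so the sequence runs once. Drop the done flag and the final wait; if you want it to repeat.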
So your next question: "How is this different?"
Answer: it has to do with how VHDL schedules updates.
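When you do the wait for 10 ns; thing, the process suspends for 10 ns, and then immediately after waking back up it makes the assignment. But what if other processes are doing the same thing? Your clock generator wakes up at that very same simulation time to toggle the clock, and what the flip-flop sees then comes down to how the simulator orders those updates within that instant (the delta cycles). Maybe the clock updates before the data, maybe it updates afterwards. That depends on details of how the clock happens to be generated, and it is not something you want your test to rely on.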
So by waiting until the edge of the clock, rather than just waiting some time that happens to be the clock period, you synchronize all of those various data assignments to that clock. It's just like what you do with code that you expect to synthesize into flip-flops!
Here's what happens with the clock-edge paradigm. Each of the data-assignment processes wakes up on a clock edge and hits something like D_in <= "111"; and D_in changes to "111" just after that edge, as you expect. Then the process suspends until the next rising edge of the clock. Now consider that EACH of those data and enable and whatever assignments does the same thing.
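As noted, in your unit under test, you have the standard synchronous flip-flop description. What happens there? It waits until the rising edge of the clock. At that instant, it looks at all of the right-hand sides of the assignments and evaluates them. After they are all evaluated, only then does the update of the left-hand side occur. Since clock-synchronized stimulus only changes just after the edge, the flip-flop captures the values that were stable before the edge, and the new stimulus values are captured on the next edge.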
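To see the difference side by side, here is a small self-checking sketch. Everything in it is an assumption made for illustration: the entity name sync_stimulus_demo, the signals d_timed, d_synced, q_timed and q_synced, and a clock built from wait for 5 ns; statements. The enable is left out to keep it short. With this particular clock generator, the timed stimulus and the clock edge land in the same delta cycle, so the register picks up the new data on the very edge where it changes. A clock generated some other way could shift that ordering, which is exactly why you shouldn't depend on it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical demo: one register fed by time-based stimulus,
-- one fed by clock-synchronized stimulus.
entity sync_stimulus_demo is
end entity sync_stimulus_demo;

architecture sim of sync_stimulus_demo is
  signal clk      : std_logic := '0';
  signal d_timed  : std_logic_vector(2 downto 0) := "000";
  signal d_synced : std_logic_vector(2 downto 0) := "000";
  signal q_timed  : std_logic_vector(2 downto 0) := "000";
  signal q_synced : std_logic_vector(2 downto 0) := "000";
begin
  -- Assumed clock: rising edges at 5, 15, 25 and 35 ns, then it stops.
  clk_gen : process
  begin
    clk <= '0'; wait for 5 ns;
    clk <= '1'; wait for 5 ns;
    if now >= 40 ns then
      wait;
    end if;
  end process clk_gen;

  -- Two plain flip-flop registers (enable omitted for brevity).
  regs : process (clk)
  begin
    if rising_edge(clk) then
      q_timed  <= d_timed;
      q_synced <= d_synced;
    end if;
  end process regs;

  -- Time-based stimulus: changes at 15 ns, the same time as a clock edge.
  timed_stim : process
  begin
    wait for 15 ns;
    d_timed <= "001";
    wait;
  end process timed_stim;

  -- Clock-synchronized stimulus: changes just after the 15 ns edge.
  synced_stim : process
  begin
    wait until rising_edge(clk);  -- edge at 5 ns
    wait until rising_edge(clk);  -- edge at 15 ns
    d_synced <= "001";
    wait;
  end process synced_stim;

  -- Checks sampled between clock edges.
  check : process
  begin
    wait for 20 ns;
    assert q_timed = "001"
      report "timed data was not captured on the 15 ns edge" severity error;
    assert q_synced = "000"
      report "synced data was captured one edge too early" severity error;
    wait for 10 ns;
    assert q_synced = "001"
      report "synced data was not captured on the 25 ns edge" severity error;
    report "checks finished" severity note;
    wait;
  end process check;
end architecture sim;
```

In the waveform, q_synced should follow d_synced one clock later, the way a real register would, while q_timed appears to change on the same edge as its input.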
Try it and see :)