Don't use out as a clock. Use it as a clock enable to the MRAM instead, and have in (your system clock) feed the MRAM's clock input. You'll get the same effect, in that the MRAM will only be clock-enabled for a certain number of clock cycles.
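To make the idea concrete, here is a minimal cycle-level sketch in Python (a behavioral model, not HDL and not your actual design). It assumes one list entry per rising edge of the system clock; the names simulate_clock_enable and enable_window are made up for illustration, with enable_window standing in for whatever logic currently produces out.

```python
def simulate_clock_enable(data_in, clock_enable, initial=0):
    """Cycle-level model of a register with a synchronous clock enable.

    Each list index represents one rising edge of the single system clock.
    The register captures data_in only on edges where clock_enable is 1;
    on every other edge it holds its previous value.
    """
    q = initial
    trace = []
    for d, ce in zip(data_in, clock_enable):
        if ce:
            q = d          # enabled edge: capture new data
        trace.append(q)    # disabled edge: q is simply held
    return trace


def enable_window(start, length, total):
    """Hypothetical stand-in for 'out': high for `length` cycles from `start`."""
    return [1 if start <= i < start + length else 0 for i in range(total)]


data = [10, 11, 12, 13, 14, 15, 16, 17]
ce = enable_window(start=2, length=3, total=len(data))
print(ce)                               # [0, 0, 1, 1, 1, 0, 0, 0]
print(simulate_clock_enable(data, ce))  # [0, 0, 12, 13, 14, 14, 14, 14]
```

The clock itself never stops or moves in this model. Only the enable decides which edges the register responds to, which is why the clock edges stay aligned with the rest of the design.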
It's very important not to gate your clocks whenever you can avoid it. When every register sees aligned rising edges, the clock skew essentially cancels out, over both min and max timing models. If one clock has extra delay, as a gated clock does from its gating logic, you now have skew. So not only do you have to worry about your data path delay being shorter than the desired clock period, you also have to worry about it being longer than the clock skew, over both fast and slow corner models. You have to monitor not only setup slack, but also hold slack. (If all of that doesn't make sense, that's exactly why to avoid gated clocks, since you then don't have to understand these topics.) Anytime you think of making a modification to the clock, see if it can be done with a PLL or a clock enable instead, since they don't delay the clock edges (assuming all clocks go through the PLL). There are certainly cases where this can't be done, but in many, including this one, you can work around gating your clocks.
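If the setup/hold point is hard to picture, the arithmetic can be sketched roughly as below. This uses the usual textbook slack equations with skew defined as capture-clock arrival minus launch-clock arrival. Every number is invented for illustration (in picoseconds) and does not come from any real device or timing report, and the function names are hypothetical.

```python
def setup_slack(period, skew, t_co, t_data, t_su):
    """Setup slack: data must arrive before the NEXT capture edge.

    Normally checked with slow-corner (max) delays.
    skew = capture clock arrival - launch clock arrival.
    """
    return period + skew - (t_co + t_data + t_su)


def hold_slack(skew, t_co, t_data, t_h):
    """Hold slack: data must not change too soon after the SAME capture edge.

    Normally checked with fast-corner (min) delays.
    """
    return (t_co + t_data) - (t_h + skew)


PERIOD = 10_000  # assumed 100 MHz clock, in ps

# Aligned clocks: skew cancels, both checks pass.
print(setup_slack(PERIOD, skew=0, t_co=1_000, t_data=6_000, t_su=500))  # 2500
print(hold_slack(skew=0, t_co=300, t_data=400, t_h=200))                # 500

# Gated capture clock: assumed extra delay of 1500 ps (slow) / 800 ps (fast).
print(setup_slack(PERIOD, skew=1_500, t_co=1_000, t_data=6_000, t_su=500))  # 4000
print(hold_slack(skew=800, t_co=300, t_data=400, t_h=200))                  # -300
```

In this made-up case the delayed capture clock actually helps setup but pushes hold slack negative, and lowering the clock frequency would not fix that, because the period does not appear in the hold equation.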