The clock enable timing is similar to the timing of the D input of the register. If the register is using the rising clock edge, then the clock enable has to be valid and stable during the setup and hold window around that rising clock edge just like the D input has to be. The timing analysis will check this for you. You don't need to align the clock enable with the opposite edge of the clock. This works the same way whether the register is created with LPM_FF, RTL, or a DFF primitive in a schematic.
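As a rough illustration of the point above, here is a small Python model of the check the timing analyzer performs. It is only a sketch: the `Window` class, the `stable_around_edge` helper, and all the numbers are made up for illustration and are not taken from any device data sheet or tool. The same check applies whether the input is D or the clock enable.

```python
from dataclasses import dataclass


@dataclass
class Window:
    setup_ns: float  # how long before the clock edge the input must already be stable
    hold_ns: float   # how long after the clock edge the input must stay stable


def stable_around_edge(transitions_ns, edge_ns, window):
    """Return True if no input transition falls inside the setup/hold window."""
    start = edge_ns - window.setup_ns
    end = edge_ns + window.hold_ns
    return not any(start < t < end for t in transitions_ns)


# Illustrative numbers only (hypothetical, not from a data sheet).
window = Window(setup_ns=0.5, hold_ns=0.2)
rising_edge_ns = 10.0

# Clock enable launched by the previous rising edge (t = 0) and arriving after
# some clock-to-out and routing delay: it settles long before the window opens.
ena_ok = stable_around_edge([2.3], rising_edge_ns, window)

# Clock enable that changes 0.1 ns before the edge: setup violation.
ena_setup_violation = stable_around_edge([9.9], rising_edge_ns, window)

# Clock enable that changes 0.1 ns after the edge: hold violation.
ena_hold_violation = stable_around_edge([10.1], rising_edge_ns, window)

print(ena_ok, ena_setup_violation, ena_hold_violation)
```

Nothing in this model refers to the falling edge. A clock enable launched from the same rising-edge clock domain is checked against the next rising edge, exactly as a D input would be.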
See the thread I referenced in my previous post for things applicable to your other questions. The originator of the other thread had a ripple clock to do a divide-by-n, but the considerations are the same for a gated clock created by a mux. Even if you have no reported setup or hold violations on cross-domain paths going to or from the mux output clock domain, you ought to consider the data-versus-clock-skew uncertainty that Rysc touched on and that I discussed in the other thread. One of my posts in the other thread says you can make the gated-clock warning go away with a clock setting, but that doesn't eliminate the skew consideration. If the output of the clock mux is on a global clock and there are no synchronous cross-domain paths going to or from that domain, then this skew won't matter.
Some device families (I don't remember whether this includes Stratix I) allow implementing a clock mux in the clock control block with the altclkctrl megafunction. (The clock control block is also the device structure used to get a signal onto global routing.) If you must use a clock mux, it is better to do it with a clock control block than with logic resources, provided the clock control block can handle the number of mux inputs you need. The device handbook documents what muxing the clock control block can provide, and I think there is a user guide for altclkctrl that documents how to use the megafunction.