I'm using Quartus 9.0sp2.
After replicating the register manually, I added these QSF assignments to try to preserve the replicated OE registers:
set_instance_assignment -name PRESERVE_REGISTER ON -to *dat_ena*
set_instance_assignment -name REMOVE_DUPLICATE_REGISTERS OFF -to *dat_ena*
Oddly, the duplicates still got optimized out.
However, the point is now moot: I was able to get the tool to replicate the register for me. I changed the logic back to a single output enable, and it appears to be working correctly now, although I'm not sure what I did to fix it. Perhaps I had a code problem somewhere. I could see the path in TimeQuest, and there was no logic between the register and the pad, so I must have been doing something wrong elsewhere.
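For anyone who lands here with the same symptom, below is an untested sketch of two other assignments that might be worth trying. It assumes the duplicates are being merged during synthesis and that this Quartus version supports both options. The `*dat_ena*` wildcard is the one from my assignments above; `dat[*]` is a hypothetical name for the bidirectional pins, so substitute your own.

```tcl
# Untested sketch, not a confirmed fix for the problem described above.

# Ask synthesis not to merge registers that share the same inputs
# (availability in 9.0sp2 is an assumption).
set_instance_assignment -name DONT_MERGE_REGISTER ON -to *dat_ena*

# Ask the fitter to place the output-enable register in each I/O cell,
# which leaves the per-pin replication to the tool.
# "dat[*]" is a placeholder pin name, not taken from the real design.
set_instance_assignment -name FAST_OUTPUT_ENABLE_REGISTER ON -to dat[*]
```

The second assignment matches the single-output-enable version of the logic, since the replication is left to the fitter instead of being hand-coded and then defended against optimization.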