Well, sounds like you have a lot going on here. Let me see if I can address your concerns.
The output max delay (OMD) and output min delay (OmD) constraint values I mention above are the proper values for any alignment (edge or center), and for SDR or DDR. The slack calculation takes into account the setup and hold relationships between the launch clock (which TimeQuest knows, since it is inside the device) and the latch clock (which you specify in your set_output_delay constraints).
Since you have to meet your 1 ns hold requirement, you may run into an issue using edge-aligned, so yes, center-aligned (or close to that) would give you better timing margin.
Since you are driving the latch clock out of the device with a PLL, you have the ability to balance your setup and hold slack by changing the PLL clock output's offset value. To figure out how much to adjust your offset (from your current 180 degrees), use (hold slack - setup slack)/2. That gives you a time value; to express it as a phase shift, divide by the clock period and multiply by 360.
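To make the arithmetic concrete, here is a minimal Python sketch of that formula. The function name and the slack and period numbers are made up for illustration and do not come from your design. I am also assuming that a positive result means delaying the output clock further, which trades hold slack for setup slack; re-run the timing analysis after the change to confirm the direction.

```python
def balance_offset(setup_slack_ns, hold_slack_ns, clock_period_ns):
    """Return (shift_ns, shift_deg, balanced_slack_ns) for the output clock.

    Hypothetical helper: shifting the latch clock by shift_ns adds that
    amount to the setup slack and removes it from the hold slack, so both
    end up at the average of the two original slacks.
    """
    shift_ns = (hold_slack_ns - setup_slack_ns) / 2.0
    shift_deg = 360.0 * shift_ns / clock_period_ns
    balanced_slack_ns = (setup_slack_ns + hold_slack_ns) / 2.0
    return shift_ns, shift_deg, balanced_slack_ns


# Illustrative values only: 0.5 ns setup slack, 2.5 ns hold slack, 10 ns clock.
shift_ns, shift_deg, balanced_ns = balance_offset(0.5, 2.5, 10.0)
print(f"adjust by {shift_ns} ns ({shift_deg} degrees), "
      f"expected slack on each side: {balanced_ns} ns")
```

With those example numbers the adjustment comes out to 1 ns, or 36 degrees on top of whatever offset you have now.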
If you are using any of Altera's newer device families, the PLL must be fed by a dedicated clock input, period. At least that has been my experience.
If you are not relying on the PLL to change clock frequencies for your source synchronous output clock (which it sounds like you are not), then the best way to invert the clock to drive this interface is to connect it to an ALTDDIO_OUT MegaFunction (1-bit width). Connect the clock to the clock input, and tie the HI data input to GND and the LO data input to VCC. This will not only invert the output clock for you, but will also give you the lowest possible skew between your clock and data outputs. Of course, the disadvantage of not using a PLL is that you lose the ability to balance your setup and hold slack by adjusting the clock offset.
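If it helps to see why that wiring inverts the clock, here is a toy Python model. It is only a behavioral sketch, not the MegaFunction itself: I am assuming the DDR output drives the HI data while the clock is high and the LO data while the clock is low, and the function name is made up.

```python
def ddr_out(clk, data_hi, data_lo):
    """Toy DDR output register: HI data while clk is 1, LO data while clk is 0."""
    return data_hi if clk else data_lo


# HI tied to GND (0) and LO tied to VCC (1), as described above.
clock = [0, 1, 0, 1, 0, 1]
inverted = [ddr_out(c, 0, 1) for c in clock]
print(inverted)
```

The output is low whenever the clock is high and high whenever it is low, which is the inverted clock, and it leaves the device through the same kind of output register as your data.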