Generate multiple clocks, shifted in phase by different amounts. This will reduce the operating frequency of any one clock whilst maintaining the high 'data' rate you require.
So, you need 200kHz * 5000 = 1GHz, i.e. a step size of 1ns. Four PLL clocks at 250MHz, each shifted a further 90 degrees (0, 90, 180 and 270), could be used to generate the PWM you require. This will require a combinatorial final output stage whose timing will be tricky to analyse. If the absolute step size is critical this may be a problem.
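To make the arithmetic concrete, here is a rough Python sketch of the numbers involved. It is only a model of the sums, not HDL. The function names (`phase_plan`, `locate_edge`) are my own invention, and the idea of splitting the duty value into a coarse count plus a phase select is one assumed way of driving the output stage, not the only one.

```python
def phase_plan(pwm_hz, steps, n_phases):
    """Return (edge_rate_hz, clock_hz, step_ns, phase_offsets_deg)."""
    edge_rate_hz = pwm_hz * steps              # effective resolution rate
    clock_hz = edge_rate_hz / n_phases         # frequency of each PLL output
    step_ns = 1e9 / edge_rate_hz               # smallest pulse-width increment
    offsets = [360.0 * k / n_phases for k in range(n_phases)]
    return edge_rate_hz, clock_hz, step_ns, offsets


def locate_edge(duty_steps, n_phases):
    """Split a duty value (in fine steps) into a coarse clock-cycle count
    and the index of the phase-shifted clock that places the edge.
    Assumption: phase k lags phase 0 by k fine steps."""
    return duty_steps // n_phases, duty_steps % n_phases


if __name__ == "__main__":
    rate, clk, step, offsets = phase_plan(200_000, 5000, 4)
    print(rate, clk, step, offsets)
    print(locate_edge(1234, 4))
```

With four phases the coarse counter only has to count 5000 / 4 = 1250 cycles of the 250MHz clock per PWM period; the remainder picks which of the four clocks times the edge.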
The other way you could attack this is by using the transceivers present in Stratix parts. These will serialise data for you. They're (arguably) not intended to be used in such a way, but that's not to say they can't be. I'm not sure of the min/max operating frequencies the transceivers require but, in essence, this would allow you to operate at a greatly reduced clock frequency (e.g. 100MHz), present parallel data (e.g. 10-bit, which at 100MHz gives the same 1Gbit/s) and the hardware does the rest. Ensure you configure the transmitter to bypass the 8B/10B encoder, so that your bit pattern reaches the pin unaltered.
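As an illustration of what the parallel data would look like, here is a small Python model that builds the words for one PWM period. Treat it as a sketch under assumptions: `pwm_words` is a hypothetical helper, the 10-bit width and 5000 steps are just the example figures from above, and I've assumed the serialiser sends the least significant bit first, which you would need to check against the transceiver documentation.

```python
def pwm_words(duty_steps, total_steps=5000, width=10):
    """Build the parallel words for one PWM period.

    The output is high for the first `duty_steps` bit times and low for
    the rest. Assumption: bit 0 of each word is transmitted first.
    """
    if total_steps % width != 0:
        raise ValueError("total_steps must be a multiple of width")
    if not 0 <= duty_steps <= total_steps:
        raise ValueError("duty_steps out of range")

    words = []
    for w in range(total_steps // width):
        # Number of high bits that fall inside this word (0..width).
        high_bits = min(max(duty_steps - w * width, 0), width)
        words.append((1 << high_bits) - 1)   # high bits packed from the LSB up
    return words


if __name__ == "__main__":
    words = pwm_words(1234)
    print(len(words), hex(words[122]), bin(words[123]), words[124])
```

Only one word per period is ever partially filled; every other word is all ones or all zeros, so the logic feeding the transceiver at the parallel clock rate stays simple.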