I haven't looked at the DPA recently, but a 400 Mbps data stream isn't overly fast. Source-synchronous double-data-rate interfaces can run at that rate without any oversampling, i.e. it's all static timing analysis to center the clock on the data over PVT. It's at the really high rates that this tends to fall apart and you need something that can recalibrate. Of course, you're probably not going to get anyone to guarantee anything, but you're probably fine. Since it's an FPGA, you can design it the easier way first and see what happens. It probably depends on how much testing you're going to do, what the application is (e.g. this might not be good enough for flight controls), and what you feel comfortable with. I've seen a number of designs where the designer just tried every setting on the input delay chain (64 delay settings in Stratix II), figured out how big the passing window was, and then centered on that. If they had margin they were comfortable with, they went with it, and I haven't heard of anything going wrong.
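
To make the "try all the settings and center on the window" idea concrete, here is a rough Python sketch of the bookkeeping. Everything in it is an assumption for illustration: `passes_at_tap` is a hypothetical callback standing in for whatever you do on the bench to set a delay tap and check the received data (e.g. a pattern check), and the names `sweep_taps`, `widest_window`, and `center_tap` are made up, not part of any vendor tool. It also treats the taps as evenly spaced, which is only approximately true on real silicon.

```python
from typing import Callable, List, Optional, Tuple

NUM_TAPS = 64  # number of input delay settings quoted above for Stratix II


def sweep_taps(passes_at_tap: Callable[[int], bool],
               num_taps: int = NUM_TAPS) -> List[bool]:
    """Try every delay tap and record whether the data check passed."""
    return [passes_at_tap(tap) for tap in range(num_taps)]


def widest_window(results: List[bool]) -> Optional[Tuple[int, int]]:
    """Return (first_tap, last_tap) of the longest run of passing taps."""
    best: Optional[Tuple[int, int]] = None
    start: Optional[int] = None
    # A trailing False closes a run that extends to the last tap.
    for tap, ok in enumerate(list(results) + [False]):
        if ok and start is None:
            start = tap
        elif not ok and start is not None:
            end = tap - 1
            if best is None or (end - start) > (best[1] - best[0]):
                best = (start, end)
            start = None
    return best


def center_tap(results: List[bool], min_margin: int = 0) -> Optional[int]:
    """Pick the middle of the widest window, or None if margin is too small.

    min_margin is the number of passing taps required on each side of the
    chosen tap; what counts as "enough" is a judgment call.
    """
    window = widest_window(results)
    if window is None:
        return None
    first, last = window
    center = (first + last) // 2
    margin = min(center - first, last - center)
    return center if margin >= min_margin else None


if __name__ == "__main__":
    # Stand-in for real hardware: pretend taps 20..43 capture data cleanly.
    results = sweep_taps(lambda tap: 20 <= tap <= 43)
    print(widest_window(results), center_tap(results, min_margin=8))
```

A sweep like this only tells you about the board, temperature, and voltage you ran it on, so repeating it across a few boards and conditions is what gives you a feel for the real margin.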