It may be the clock glitch issue. The core problem is that a toggle flip-flop in the gray counter gets inverted, and there's no way for it to recover from this. From then on, every time that counter gets a signal to increment, it decrements instead (and the FIFO gray counters should always increment). I would SignalTap those counters and study them. Since they're gray coded, it's not obvious which way they're counting. I SignalTapped a long stretch of them and looked at the sequence while it was working, then realized that once it stopped working the sequence was running in the opposite order.
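If it helps, here is a rough Python sketch for checking the direction offline rather than by eye. It assumes the counter uses standard reflected-binary Gray code and that you have exported the captured counter values as a list of integers. The function names (`gray_to_binary`, `count_directions`) and the sample data are made up for illustration and are not part of any Altera tool.

```python
def gray_to_binary(gray: int) -> int:
    """Convert a reflected-binary Gray code value to plain binary."""
    binary = 0
    while gray:
        binary ^= gray
        gray >>= 1
    return binary


def count_directions(samples, width):
    """Classify each change between consecutive captured counter values.

    samples: Gray-coded counter values in capture order (assumed format).
    width:   counter width in bits (use 2 or more so up/down are distinct).
    Returns a list of 'up', 'down', or 'other', one entry per change.
    Repeated values (counter not enabled) are skipped.
    """
    modulus = 1 << width
    directions = []
    previous = None
    for sample in samples:
        current = gray_to_binary(sample)
        if previous is not None and current != previous:
            step = (current - previous) % modulus
            if step == 1:
                directions.append("up")
            elif step == modulus - 1:
                directions.append("down")
            else:
                directions.append("other")  # skipped counts or a non-Gray step
        previous = current
    return directions


# Hypothetical 3-bit captures: one while working, one after the failure.
working = [0b000, 0b001, 0b011, 0b010, 0b110]
failing = [0b110, 0b010, 0b011, 0b001]
print(count_directions(working, 3))
print(count_directions(failing, 3))
```

A healthy FIFO counter should only ever report "up" (including across the wrap from the top value back to zero). A long run of "down" would match the inverted toggle flip-flop behavior described above.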
By the way, I "think" this is fixed in Q9.1, where the gray counters would do one count in the wrong direction and then recover back to the right direction. This would still corrupt your data stream, but only a word or two rather than a continuous problem. Of course, more fundamentally, having a clock glitch is the bigger problem and should be fixed at the source.
(Is your clock coming from a PLL? If it is, there should be no glitch. The only cases I've seen were ones where the PLL was bypassed and the design ran from a noisy external clock. In one case, adding the PLL cleaned it up...)