--- Quote Start ---
So, even though it now works, I would like to know *why* this fixed it before we go to production. Any clues? Sorta seems like it needs a pull-up, pull-down, or buffer(?).
--- Quote End ---
It fixed it because you changed the design: signals got re-routed and the problem went away. The compiled design will differ from one compile to the next, even with the most minor of changes.
There is no reason to believe the problem won't come back on a later compile, or when the chip heats up or cools down, or when you use a different chip. Without timing specs and timing analysis, you're leaving yourself open to timing problems.
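
To make the point concrete, here is a rough sketch of the setup-slack arithmetic a timing analyzer does for one register-to-register path. All the numbers (clock period, clock-to-out, setup time, routing delays) are made up for illustration and the function name is hypothetical; your real values come from your device and your timing report, not from this.

```python
# Hypothetical numbers for illustration only; not taken from any real device or design.

def setup_slack_ns(clock_period_ns, t_co_ns, routing_delay_ns, t_su_ns):
    """Setup slack for one register-to-register path.

    Positive slack means the data arrives in time; negative means a setup violation.
    """
    data_arrival_ns = t_co_ns + routing_delay_ns    # when data reaches the destination register
    data_required_ns = clock_period_ns - t_su_ns    # latest time the data may arrive
    return data_required_ns - data_arrival_ns

CLOCK_PERIOD_NS = 10.0  # assumed 100 MHz clock
T_CO_NS = 1.0           # assumed clock-to-out of the source register
T_SU_NS = 0.5           # assumed setup time of the destination register

# Same logic, two different compiles: only the routing delay of the path changed.
slack_a = setup_slack_ns(CLOCK_PERIOD_NS, T_CO_NS, 7.0, T_SU_NS)
slack_b = setup_slack_ns(CLOCK_PERIOD_NS, T_CO_NS, 9.0, T_SU_NS)

print("compile A slack (ns):", slack_a)
print("compile B slack (ns):", slack_b)
```

With these assumed values the first compile has positive slack and the second has negative slack, even though the logic is identical. If the tool has no clock constraint, it has no requirement to check the path against, and nothing steers the fitter toward the routing that works.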