--- Quote Start ---
I do understand what you're saying about timing; but if your receiver doesn't have a tHOLD time of 0 then you could get into trouble.
--- Quote End ---
Yes, but by specification it is zero. You should also consider that the clock is generated by the FPGA itself. So from the FPGA's internal point of view, the DCLK and DATA0 I/O cell delays subtract from the setup margin and add to the hold margin of the receiving register. It is hard to imagine how a timing issue could arise here, assuming the internal register has a hold time comparable to other FPGA resources.
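To put rough numbers on that argument, here is a minimal sketch of the margin arithmetic. It assumes a simplified same-edge model and ignores board trace delay. All delay values in the usage note are hypothetical placeholders, not datasheet figures, and the function and parameter names are made up for illustration.

```python
def slack_ns(period, t_dclk_out, t_co_dev, t_data_in, t_su, t_h):
    """Return (setup_slack, hold_slack) in ns, seen from the internal register.

    Simplified model (an assumption, not taken from any datasheet):
    the internal clock edge leaves through the DCLK output cell, the
    configuration device answers after its clock-to-output delay, and
    the data comes back in through the DATA0 input cell.
    """
    # Total delay from the internal clock edge until new data
    # arrives at the receiving register.
    round_trip = t_dclk_out + t_co_dev + t_data_in

    # The round trip eats into the time available before the next edge...
    setup_slack = period - round_trip - t_su
    # ...but keeps the old data stable that much longer after the current edge.
    hold_slack = round_trip - t_h
    return setup_slack, hold_slack
```

With made-up values such as `slack_ns(50.0, 3.0, 8.0, 2.0, 1.0, 0.5)`, the round trip is 13 ns, so the hold slack is 12.5 ns and the setup slack is 36 ns. In this model the hold slack only goes negative if the register's internal hold time exceeds the whole round-trip delay, which is the point made above.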
I also expect an assembly problem to be the most likely cause.