One possibility is that the clock crossing is OK and your bug lies somewhere else.
Does your design meet its timing constraints?
Another possibility is that some of the logic in the dcfifo is running at the limits the FPGA is capable of.
The transfer from clkb to clka only has 2.5 ns to work with.
And when you generate a dcfifo with 2 or more stages, Quartus assumes the clocks are asynchronous and inserts a false path exception, so no timing analysis is performed on the transfer from clkb to clka.
But since the two clocks have a known relative phase, the timing of the clock crossing can be analyzed.
You don't need (and shouldn't use) asynchronous design techniques in this case. Try simply using two registers: reg@clkb --> reg@clka.
TimeQuest will analyze the design and tell you if it's meeting constraints or not.
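To get a feel for where a figure like the 2.5 ns above comes from, here is a rough Python sketch of the launch-to-latch edge calculation a timing analyzer performs for related clocks. The function name and the example numbers (both clocks at 200 MHz, clkb shifted by half a period) are assumptions for illustration only, not values taken from your design, and the sketch ignores clock skew, uncertainty and register setup time.

```python
from math import gcd

def min_setup_relationship_ps(launch_period_ps, latch_period_ps,
                              launch_offset_ps=0, latch_offset_ps=0):
    """Smallest gap from any launch edge to the next latch edge.

    All times are integer picoseconds. Hypothetical helper: it only
    considers ideal rising edges of two phase-related clocks.
    """
    # The edge pattern repeats every least common multiple of the periods.
    hyper = launch_period_ps * latch_period_ps // gcd(launch_period_ps,
                                                      latch_period_ps)
    best = None
    for i in range(hyper // launch_period_ps):
        launch = launch_offset_ps + i * launch_period_ps
        # Distance to the next latch edge strictly after this launch edge.
        gap = (latch_offset_ps - launch) % latch_period_ps
        if gap == 0:
            gap = latch_period_ps
        best = gap if best is None else min(best, gap)
    return best

# Assumed example: clkb (launch) and clka (latch) both 5 ns,
# with clkb delayed by 2.5 ns relative to clka.
print(min_setup_relationship_ps(5000, 5000, launch_offset_ps=2500))  # 2500
```

If the number this gives for your real clock definitions is very small, that is the budget the reg@clkb --> reg@clka path has to meet, and TimeQuest will report the slack against it.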