I did try with -1ns, but that didn't help.
I am observing an interesting thing. I routed the sys_clk signal to a debug output pin and set the PLL phase difference between sys_clk and sdram_chip_clk to -6ns. The scope shows that the actual difference between these two signals is -3.2ns. I didn't check that before, but if the discrepancy is that large, then I believe the real phase difference was much smaller when I used the PLL setting of -3ns. So far, the design is working fine with -6ns in the PLL settings, but as you may know, I still see problems in TQ. Maybe adding max delays will help.
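To put rough numbers on that, here is a small sketch of the arithmetic. It assumes the gap between the PLL setting and the skew seen on the scope is a constant offset (for example from the output pin and board routing), which has not been verified; only the -6ns setting was actually measured. The function and variable names are made up for illustration.

```python
def estimate_actual_skew_ns(pll_setting_ns, offset_ns):
    """Estimate the skew seen at the pins for a given PLL phase setting.

    Assumption: the offset between setting and measurement is constant.
    """
    return pll_setting_ns + offset_ns


# The one data point from the scope: -6 ns set, -3.2 ns measured.
pll_set_ns = -6.0
measured_ns = -3.2

# Offset implied by that single measurement.
offset_ns = measured_ns - pll_set_ns

# Hypothetical estimate for the earlier -3 ns setting (not measured).
estimate_at_minus_3_ns = estimate_actual_skew_ns(-3.0, offset_ns)

print(f"implied offset: {offset_ns:.1f} ns")
print(f"estimated skew at -3 ns setting: {estimate_at_minus_3_ns:.1f} ns")
```

If the constant-offset assumption holds, the -3ns setting would have produced almost no real skew between the two clocks, which would fit with it not working. Measuring the -3ns case on the scope as well would confirm or rule this out.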