Hi @Ash_R_Intel, thanks for the response.
Yes, clk1 is on a GCLK via a CLKCTRL block, and it fans out to core logic as well as to the reference clock input of the PLL. That is as intended, and shouldn't be an issue, unless I'm missing something here (?).
I'm not expecting the PLL to compensate for the delay of the clk1 distribution network. We agree on that. The clk1 distribution network has an uncompensated insertion delay of about 3.8 to 3.9 ns, and that's fine.
What I am expecting is that the PLL will compensate for the delay of the clk2 distribution network only. And as such, clk2 should end up in phase with clk1, since clk1 is the reference clock input of the PLL. In other words, with the PLL compensating properly, there shouldn't be any significant skew between the endpoints of the clk2 network and the endpoints of the clk1 network.
Now, in the timing report snippet I provided, we can see that the total insertion delay of clk1 is 3.908 ns to the clock input of ff4, and 3.784 ns to the reference clock input of the PLL. That's plenty close enough, practically zero clock skew (just 0.124 ns), which is as expected. We'd expect to see very little clock skew between different endpoints of a given clock network, clk1 in this case, and that's what we're seeing. So far so good.
Now what doesn't look right is downstream of that: the compensation loop of the PLL. The reference clock input of the PLL, again, is an endpoint of the clk1 network, and clk1 arrives there at 3.784 ns. So far so good. What I would then expect is that the PLL's compensation loop in this topology makes clk2 arrive at the endpoints of the clk2 network (such as the clock input of ff3) at around 3.784 ns as well, plus or minus very little skew. And that's not at all what we're seeing. What the report shows is that clk2 arrives at the clock input of ff3 at -1.330 ns. That's about 5.1 ns earlier than it should be (3.784 - (-1.330) = 5.114 ns). Not good, and that's what remains unexplained.
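To put numbers on the above, here is a quick sanity check of the arithmetic as a small Python sketch. The three arrival times are the ones from the report snippet; the variable and function names are just mine for illustration, and the "expected" clk2 arrival assumes the PLL aligns the clk2 endpoints to its reference clock input, which is my reading of how the compensation should behave, not something the tool confirms.

```python
# Clock arrival times in ns, taken from the timing report figures quoted above.
CLK1_AT_FF4 = 3.908          # clk1 at the clock input of ff4
CLK1_AT_PLL_REFCLK = 3.784   # clk1 at the PLL reference clock input
CLK2_AT_FF3 = -1.330         # clk2 at the clock input of ff3 (as reported)


def skew_ns(later, earlier):
    """Difference between two arrival times, rounded to report precision (ps)."""
    return round(later - earlier, 3)


# Skew between two endpoints of the same clk1 network: expected to be small.
clk1_internal_skew = skew_ns(CLK1_AT_FF4, CLK1_AT_PLL_REFCLK)

# Assumption: with compensation working, clk2 endpoints should line up with
# the PLL reference clock input, so this difference should be close to zero.
clk2_vs_refclk = skew_ns(CLK1_AT_PLL_REFCLK, CLK2_AT_FF3)

# The skew that actually matters for clk1 <-> clk2 transfers (ff4 vs ff3).
clk2_vs_ff4 = skew_ns(CLK1_AT_FF4, CLK2_AT_FF3)

print(f"clk1 skew, ff4 vs PLL refclk : {clk1_internal_skew} ns")
print(f"clk2 early vs PLL refclk     : {clk2_vs_refclk} ns")
print(f"clk2 early vs clk1 at ff4    : {clk2_vs_ff4} ns")
```

So the clk1 network itself is tight, and the roughly 5 ns discrepancy shows up entirely between the PLL reference input and the clk2 endpoints.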
Does my analysis make sense? Am I missing something?
BTW, about set_max_skew, I don't think that applies here (although I did try it anyway, and it didn't work). As I understand it, set_max_skew pertains to registered paths or ports, not to clocks. Although if I'm wrong or if I missed your point, please correct me and show an example of what you mean.
About generating both clocks as outputs of the same PLL, I understand how that could be a possible work-around. But there are practical reasons why that's not an option in my application, and I don't know of a reason why what I'm doing shouldn't work. I've used this clock topology in Xilinx devices before with great success, and the Altera Arria 10 documentation also seems to suggest that it should work. Quartus isn't complaining about it either; it just produces timing reports that suggest the compensation isn't working right. So I'd very much like to get to the bottom of it, figure out what's wrong, and make it right. Any more thoughts?
Thanks,
-Roee