Your measured duty cycle will be a function of your sampling period, so with a 4 ns sampling clock, your edges will look like they jump around by 4 ns. (They aren't actually moving in hardware; it's just that your sampling makes it look that way.)
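To see why, here is a rough Python sketch of the effect (not a model of your actual tool). It assumes an ideal 40 MHz clock with a 50% duty cycle and a free-running 4 ns sampler; the function name and parameters are made up for illustration. Times are in picoseconds so the arithmetic stays exact.

```python
def measured_high_times(period_ps, high_ps, sample_ps, phase_ps, n_samples):
    """Sample an ideal square wave and return the apparent width (in ps)
    of each complete high pulse, as a sampling tool would display it."""
    # The signal is high during the first high_ps of every period.
    samples = [((i * sample_ps + phase_ps) % period_ps) < high_ps
               for i in range(n_samples)]

    widths = []
    run = 0            # consecutive high samples in the current pulse
    seen_low = False   # skip a pulse that was already in progress at start
    for s in samples:
        if s:
            if seen_low:
                run += 1
        else:
            if run:
                widths.append(run * sample_ps)
            run = 0
            seen_low = True
    return widths


# Assumed values: 40 MHz clock (25 ns period), 12.5 ns high time, 4 ns sampler.
widths = measured_high_times(25_000, 12_500, 4_000, 0, 1_000)
print(sorted(set(widths)))  # distinct apparent high times, in ps
```

Under these assumptions the true high time is always 12.5 ns, but the sampled view only ever reports 12 ns or 16 ns, which is exactly one sample period of apparent edge movement.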
If your PLL only has two outputs, 40 MHz and 80 MHz, is your sampling clock asynchronous to these domains? If so, your data capture is going to be corrupt. In general, I would sample the data with the 80 MHz clock, and know that every tick is one period. There's no real reason to sample the clocks themselves since, as I said, each tick is a clock edge, and static timing analysis makes sure all your data meets setup and hold requirements. If you really were concerned there was something wrong with your clocks, my suggestion would be to route them out and look at them with a scope, which you've already done, and you verified that they look good. Maybe there's some information that you're trying to get that I don't understand.