Hi sstrell,
Thank you very much for your response. The following are my global clock assignments:
set_instance_assignment -name GLOBAL_SIGNAL GLOBAL_CLOCK -to gen_clk_buffs[0].inst_clkbuff|inst_clock_buffer|stratix10_clkctrl_0|outclk
set_instance_assignment -name GLOBAL_SIGNAL GLOBAL_CLOCK -to gen_clk_buffs[1].inst_clkbuff|inst_clock_buffer|stratix10_clkctrl_0|outclk
set_instance_assignment -name GLOBAL_SIGNAL GLOBAL_CLOCK -to gen_clk_buffs[2].inst_clkbuff|inst_clock_buffer|stratix10_clkctrl_0|outclk
set_instance_assignment -name GLOBAL_SIGNAL GLOBAL_CLOCK -to gen_clk_buffs[3].inst_clkbuff|inst_clock_buffer|stratix10_clkctrl_0|outclk
The clock distribution is:
input reference clock -> clock buffer (root gated clock) -> destination
I haven't added a PLL yet. The reason is that it is easier to change the clock frequency using BTS on the Devkit, so that we can get an idea of what core frequency our core resources can run at. The four reference clocks I am using come from the transceiver dedicated clock pins, as recommended by the hardware designer.
You mentioned that if I use the clock control IP (I configured it as a root gate), it will automatically use global routing. Should I see this in the "global & other fast signal details" report?
Interestingly, I can see the high fan-outs at the outputs of the four clock buffers in the report `plan stage -> control signals`, but I cannot see them in the report `place stage -> non global high fan out signals`, although I do see the four clock buffers in the Chip Planner. Since one of the clock buffer outputs has been promoted to a global clock, I expected to see the other three buffered clocks in `non global high fan out signals`. But they do not appear there, nor in the "global & other fast signal details" report. How can I tell whether they are global or not?
My goal is to improve the fmax. We have slices, each of which implements calculation functions and is connected to its corresponding transceiver. There will be 80 slices. A single slice has fmax = 662MHz, and the fmax drops to 441MHz for 80 slices without any further optimization. These results were obtained with a single clock for the entire FPGA, with a root gate on that clock as well. For 80 slices, we found larger clock skews. So we are trying to add more clocks for the core logic to decrease the area covered by each clock, so that we can reduce the clock skew and improve the fmax.
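To put the two fmax figures in period terms, here is a quick sketch of the arithmetic. The variable and function names are just mine for illustration, and the difference it prints is the total extra time per cycle, not a measured skew value; I have not separated how much of it is skew and how much is data path delay.

```python
# Convert the two reported fmax values into clock periods.
# Names are illustrative only; the numbers are the ones quoted above.
FMAX_SINGLE_SLICE_MHZ = 662.0
FMAX_80_SLICES_MHZ = 441.0


def period_ns(fmax_mhz: float) -> float:
    """Return the clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / fmax_mhz


t_single = period_ns(FMAX_SINGLE_SLICE_MHZ)
t_full = period_ns(FMAX_80_SLICES_MHZ)

# Extra time per cycle in the 80-slice build (skew plus any added path delay).
extra = t_full - t_single

print(f"single slice: {t_single:.3f} ns")
print(f"80 slices:    {t_full:.3f} ns")
print(f"difference:   {extra:.3f} ns")
```

So the 80-slice build needs roughly 0.757 ns more per cycle than a single slice (about 2.268 ns versus 1.511 ns).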
Please correct me if I have misunderstood anything, and let me know if you have any suggestions. I would also like to know whether we are heading in the right direction, for example by adding more core clocks, to improve the fmax.
Thanks,
Xin