I'm an ASIC/SoC prototyping user. I have an ASIC design that includes a lot of clock gating implemented with AND gates, OR gates, and latch-based integrated clock gating (ICG) cells.

From the following page, I understand that AND-gate and OR-gate clock gating can be handled with the "Auto gated clock convert to clock enable" feature:

Quartus Prime Help version 15.1 - Gated Clock Should be Implemented According to Altera Standard Scheme (Design Assistant Rule) (intel.com)

How about integrated clock gating (ICG) cells? I only know that latch will use many resources to implement it, and global clocks aren't too much. And might ICG cells cause some clock skew?

Unfortunately, there is a lot of AND-gate clock gating, OR-gate clock gating, and ICG cells in the ASIC design. Could someone please tell me whether there is a better way to deal with those clock gating cells?

Thank you very much.

How to deal with integrated clock gating cell on FPGA?

13 Replies

NuvKFC
Contributor
3 years ago
Sorry, I am correcting a typo.

How about integrated clock gating (ICG) cells?

~~I only know that latch will use many resources to implement it, and global clocks aren't too much.~~

I only know that an FPGA uses many resources to implement a latch.

There aren't enough global clocks to support all of the clocks.

And might ICG cells cause some clock skew?

Unfortunately, there is a lot of "and gate" clock gating, "or gate" clock gating, and "integrated clock gating" cells in the ASIC design.

Could someone please tell me whether there is a better way to deal with those clock gating cells?

Thank you very much.
NuvKFC
Contributor
3 years ago
Could someone please give me some keywords or pointers? Thank you very, very much.
Marco_Intel
New Contributor
3 years ago
You should just use the ALTCLKCTRL IP.
(user guide here: https://www.intel.cn/content/dam/support/us/en/programmable/kdb/pdfs/literature/ug/ug-altclock.pdf)
This allows you to instantiate clock buffers and select the kind of clock routing you want to use (Global, Regional, etc.), and you can include the CLK_ENABLE input, through which you can gate the clock itself.

Please note that this is not the case with all device families (e.g. S10 and Agilex are a bit different from previous devices).
Have a look at the specific device family user guide for more info.

I also see that you are looking at very old documentation (15.1; today we are at 21.1 STD or 22.2 PRO). Make sure to look at the documentation specific to the Quartus version you are targeting.
E.g. for Quartus Prime Pro you can have a look here:
https://www.intel.com/content/www/us/en/docs/programmable/683082/22-2/use-gated-clocks.html

(The easiest way to find the Quartus handbook is to go to this link:
https://www.intel.com/content/www/us/en/support/programmable/support-resources/design-software/user-guides.html )

In terms of timing, if you add a clock buffer you will of course have an inherent delay due to the buffer itself plus the routing needed to bring your clock signal to the buffer. However, all FFs fed by this clock will then have the same delay (so almost negligible skew).

You mention that the design you are taking from the ASIC has a lot of clock signals (with, I assume, different gating). If this is the case, consider using non-dedicated clock routing resources for clocks with low fanout.

In that case you can just AND the clock with an enable signal to gate it. You will not incur any timing penalty, as the local routing connects ALMs (of course you will add to the clock a delay from the signal passing through an ALM, which you would not have without the clock gating).

For a design like this I suggest turning off automatic global clock promotion in the tool and just adding the ALTCLKCTRL IP for the clocks you want to promote to global/regional/etc.
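
A minimal sketch of how one high-fanout ICG might be swapped for a clock control block. It assumes an ALTCLKCTRL variant has been generated under the hypothetical name `clkctrl_global` with `inclk`, `ena`, and `outclk` ports; check the generated file for the actual names. The wrapper name `clk_gate_buf` and the `USE_ALTCLKCTRL` macro are also illustrative, and the `else` branch is only a behavioral stand-in for simulation:

```verilog
// Hypothetical drop-in replacement for one high-fanout ASIC ICG cell.
module clk_gate_buf (
    input  wire clk_in,   // ungated source clock
    input  wire en,       // gating enable from the original design
    output wire clk_out   // gated clock
);
`ifdef USE_ALTCLKCTRL
    // Assumed: an ALTCLKCTRL variant generated as "clkctrl_global".
    // Port names depend on the generated wrapper; verify before use.
    clkctrl_global u_clkctrl (
        .inclk  (clk_in),
        .ena    (en),
        .outclk (clk_out)
    );
`else
    // Behavioral stand-in (simulation only): the enable is captured
    // while the clock is low, so the gated clock cannot glitch.
    reg en_lat;
    always @(*)
        if (!clk_in)
            en_lat = en;
    assign clk_out = clk_in & en_lat;
`endif
endmodule
```

Keeping every gate behind one wrapper like this means the choice between a dedicated clock buffer and local routing can be changed per instance without touching the rest of the ASIC RTL.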
- NuvKFC
  Contributor
  3 years ago
  Hi Marco_Intel
  
  Thank you very, very much, Marco_Intel.

  Yes, there are hundreds of integrated clock gating (ICG) cells in the ASIC design.

  I will try to replace all ICGs with the ALTCLKCTRL IP.

  But I'm not sure whether the clock resources (Global, Regional, etc.) will be enough.

  And as you mentioned, non-dedicated clock routing resources can be used for clocks with low fanout.

  How do I do that?

  Should I change all ICGs to the ALTCLKCTRL IP?

  Or should I change the coding style as in the following example?
  
  Ex.

  ICG:

      always @(*)
        if (~clk)
          clk_en = en;
      assign clock_gating = clk & clk_en;

  Change the ICG to an AND gate, and then use the "Auto gated clock convert to clock enable" feature:

      assign clock_gating = clk & clk_en;
  
  Thank you very much.
Marco_Intel
New Contributor
3 years ago
Hello NuvKFC,

If you are talking about hundreds, then for sure the FPGA (whichever you choose, from any vendor) will not have enough dedicated clock lines for all of them. I encourage you to look at the specific device family user guide.

For the ones with higher fanout, use dedicated lines; for that I suggest using the ALTCLKCTRL IP to have the exact control you want.

The choice of regional or global depends on how widely the logic is spread across the device. E.g. even if the fanout is not very large, there are sometimes other considerations if most of it is block memory or DSP.

Note that in the Chip Planner you can always show the clock regions to get a better understanding of that. You can also force the placement of the logic in a specific part of the device by creating a Logic Lock region.

For the remaining ones you could just change the coding to implement the clock enable synchronously with the data.

As all registers have an enable input, the tool will map this onto it appropriately.

However, as pointed out in the documentation, this does not reduce the power of the clock line, as the clock line is always toggling (the enable is implemented at LAB or FF level depending on the family).
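
A minimal sketch of this clock-enable style, assuming a free-running clock `clk` and the former gating signal reused as `en` (the module, parameter, and signal names are illustrative, not taken from the design under discussion):

```verilog
// Register bank clocked by the ungated clock; the former gating
// signal drives the register clock-enable instead of the clock pin.
module ce_reg #(
    parameter WIDTH = 8
) (
    input  wire             clk,  // free-running source clock
    input  wire             en,   // former clock-gating enable
    input  wire [WIDTH-1:0] d,
    output reg  [WIDTH-1:0] q
);
    always @(posedge clk)
        if (en)
            q <= d;   // q holds its value while en is low
endmodule
```

The register still sees every clock edge, so no gated-clock net is created, but the clock line itself keeps toggling.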

If your aim was to use clock gating to reduce power, then for the clocks with higher fanout you should use the ALTCLKCTRL IP.

The enable implemented there (in most families) turns off the clock network itself, so you will get the most power savings.

Lastly, I want to point out that doing what I am suggesting requires that you already have a good understanding of your design.
If this is not the case, you can just try to synthesize your logic; by default Quartus should recognize clocks and promote them automatically. Sometimes, though, you want better control, and I assume that is your case.

Best regards
- NuvKFC
  Contributor
  3 years ago
  Hi Marco_Intel
  
  Thank you very, very much, Marco_Intel.

  Almost all of the ICG cells are used to reduce power on the clock network and to disable modules that are not in use.

  To match that behavior, I will change the ICGs that have high fanout to the ALTCLKCTRL IP.

  For the rest of the ICGs, the "or gate" gating, and the "and gate" gating, I will first synthesize my design directly and let Quartus promote them automatically.

  Then I will improve performance step by step.

  Thank you very, very much.
SyafieqS
Super Contributor
3 years ago
Hi KF Cc,

May I know if there is any other concern regarding this?
- NuvKFC
  Contributor
  3 years ago
  Hi SyafieqS
  
  Thank you very, very much, SyafieqS.

  For speed, I first removed all clock dividers and most clock gating cells, including "and gate" gating, "or gate" gating, and ICGs.

  All clocks are directly connected to the source clock.

  Do you have any suggestions or suggested steps for migrating those clock gating cells and clock dividers from the ASIC into the FPGA?

  I'm afraid of failing at some complex step, and it may be hard to recover because there are so many clock gating cells.

  Then I would need to try again.

  For example, my guess is that some ICGs with low fanout do not need to be changed.

  Thank you very, very much.
SyafieqS
Super Contributor
3 years ago
Hi Kf Cc,

Regarding the remaining clock gates with low fanout, you can refer to the other recommended clock-gating methods:

https://www.intel.com/content/www/us/en/docs/programmable/683082/22-2/recommended-clock-gating-methods.html

For clock dividers, you may refer to "Basic Clock Divider Using -divide_by":

[1] https://www.intel.com/content/www/us/en/docs/programmable/683081/22-2/basic-clock-divider-using-divide-by.html

[2] https://www.intel.com/content/www/us/en/docs/programmable/683243/21-3/clock-divider-example-divide-by.html
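
As a rough sketch of the approach those pages describe, a divide-by-2 clock can be built from a toggle register and then declared to the timing analyzer with `create_generated_clock -divide_by 2`. The module, port, clock, and pin names below are hypothetical, and the SDC lines shown in the comment would need to be adapted to the names in your own design:

```verilog
// Hypothetical divide-by-2 clock generator built from a toggle register.
module clk_div2 (
    input  wire clk,      // source clock
    input  wire rst_n,    // active-low asynchronous reset
    output reg  clk_div   // clk divided by 2
);
    always @(posedge clk or negedge rst_n)
        if (!rst_n)
            clk_div <= 1'b0;
        else
            clk_div <= ~clk_div;

    // Matching SDC constraints (placed in the .sdc file, not in RTL).
    // The period and the names are assumptions for illustration only:
    //   create_clock -name clk -period 10 [get_ports clk]
    //   create_generated_clock -name clk_div2 -source [get_ports clk] -divide_by 2 [get_pins {clk_div|q}]
endmodule
```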
- NuvKFC
  Contributor
  3 years ago
  Hi SyafieqS
  
  Thank you, SyafieqS, very much. I will study that.
SyafieqS
Super Contributor
3 years ago
We have not received any response from you to my previous reply, so I will set this case to close-pending. Please post a response within the next 15 days to allow me to continue to support you. After 15 days, this thread will be transitioned to community support. The community users will be able to help you with your follow-up questions.

p/s: If any answer from the community or Intel support is helpful, please feel free to mark it as the solution and give Kudos.
- NuvKFC
  Contributor
  3 years ago
  Hi SyafieqS
  
  What should I respond to?
  - NuvKFC
    Contributor
    3 years ago
    I'm trying to port the SoC's complex clock network onto the FPGA.

    But I can't find any methodology that teaches how to do that.

    I have to rely on trial and error.
