derived clock versus clock enable

Question

Hello,

Can anyone please post good links on this problem?

So far I have read that a clock enable is the best solution if I need a divided clock. There is communication between the two (synchronous) clock domains: control signals are crossing.

At the moment I don't see a big difference between the 2 approaches:

1) using a derived (divided) clock and constraining it as a generated clock

2) using a clock enable and setting a multicycle constraint

Both signals can be assigned as global clocks. Will there be different effects on performance if they have a high fan-out?

TIA

Axel

altera_forum · Answer

An advantage of a clock enable over a divided-down clock is that the clock enable does not cause clock skew.

In the Classic Timing Analyzer, the divided-down clock produces messages like these unless you have a clock setting assigned to the divided clock:

--- Quote Start ---

Warning: Found 1 node(s) in clock paths which may be acting as ripple and/or gated clocks -- node(s) analyzed as buffer(s) resulting in clock skew

Info: Detected ripple clock "div_by_2_dff" as buffer

--- Quote End ---

If you assign a clock setting to the divided clock to tell Quartus you did it on purpose, the warning goes away, but the skew issue does not go away.

The recommended choices from most to least preferred:

Do not use ripple or gated clocks. Instead use clock enables or PLLs.

If you do use a ripple or gated clock, have no synchronous data paths going to or from the derived clock domain. Use metastability synchronization registers, handshake signals, etc. to transfer data to or from that domain. Tell Quartus not to analyze timing on the cross-domain paths. Use global routing for the derived clock to minimize skew for paths within the domain.

If you do have synchronous paths to or from the derived clock domain, you probably should add some clock uncertainty. Don't assume you are OK with positive slack on the cross-domain paths if you don't have clock uncertainty.

If you have hold violations going between domains, have the Fitter try to fix them by setting "Optimize hold timing" to "All paths".

If you still have violations and the derived clock uses global routing, try nonglobal routing. This might reduce the skew for the cross-domain paths, but it will cause some skew for paths within the derived-clock domain.
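To make the first recommendation concrete, here is a minimal behavioral sketch in Python (not HDL) of what a clock enable does: every register stays on the fast clock, and a counter asserts an enable for one fast-clock cycle out of every n. The names `ClockEnableDivider`, `tick`, and `run` are hypothetical and exist only for this illustration.

```python
class ClockEnableDivider:
    """Behavioral model: enable is high for one fast-clock cycle out of every n."""

    def __init__(self, n):
        if n < 1:
            raise ValueError("n must be >= 1")
        self.n = n
        self.count = 0

    def tick(self):
        """Advance one fast-clock cycle and return True when the enable is high."""
        enable = self.count == self.n - 1
        # Wrap the counter after the enable cycle, otherwise keep counting.
        self.count = 0 if enable else self.count + 1
        return enable


def run(n, cycles):
    """Clock a 'slow' register on the fast clock, updating it only when enabled."""
    divider = ClockEnableDivider(n)
    slow_reg = 0
    history = []
    for _ in range(cycles):
        if divider.tick():
            slow_reg += 1  # the register changes only on enabled cycles
        history.append(slow_reg)
    return history


print(run(4, 8))  # [0, 0, 0, 1, 1, 1, 1, 2]
```

Because the slow register is still clocked by the fast clock, there is no second clock tree and therefore no cross-domain skew to analyze.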

altera_forum · Answer

--- Quote Start ---

2) using a clock enable and setting a multicycle constraint

--- Quote End ---

What is this multicycle constraint?

I also need to divide a 16 MHz clock down to 500 kHz. Currently I have a 4-bit counter (counting to 16) whose output is connected to the global clock network, and Quartus shows this warning about clock skew and so on.

I tried to use a PLL, but with a PLL I can only lower the clock frequency to 10 MHz :(

What is the right way to do clock division?
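As a quick check of the numbers in this post, the sketch below computes the divide ratio and the counter width a clock-enable counter would need. The helper name `enable_counter_params` is hypothetical, and the sketch assumes the ratio is an exact integer and that the division is done with a single counter.

```python
def enable_counter_params(f_in_hz, f_out_hz):
    """Return (divide ratio, counter width in bits) for an integer divide."""
    if f_out_hz <= 0 or f_in_hz % f_out_hz != 0:
        raise ValueError("this sketch only handles exact integer ratios")
    ratio = f_in_hz // f_out_hz
    # A counter that runs from 0 to ratio-1 needs enough bits to hold ratio-1.
    width = max(1, (ratio - 1).bit_length())
    return ratio, width


ratio, width = enable_counter_params(16_000_000, 500_000)
print(ratio, width)  # 32 5
```

Under those assumptions, going from 16 MHz to 500 kHz is a divide by 32, so the enable would be asserted once every 32 cycles of the 16 MHz clock.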

altera_forum · Answer

@Epis:

See page 6-34 of the Quartus II Version 7.1 Handbook, or search for "multicycle" in that PDF.

@Brad:

Thank you for your answer.

I am still wondering about the create_generated_clock SDC constraint.

Am I wrong if I say this:

I have a divided clock clk_div. I constrain it correctly with TimeQuest. I provide clk_div to the fast global clock network.

1) My design will be faster than the clock enable version because I don't have the delay of the (very) high fan-out of the clock enable signal, which can slow down the design (can it?).

2) My Design is reliable.

(If not, what do I need the generated clock constraint for?)

Can you suggest good literature for this problem?

Best regards,

Axel

altera_forum · Answer

--- Quote Start ---

I have a divided clock clk_div. I constrain it correctly with TimeQuest. I provide clk_div to the fast global clock network.

...

2) My Design is reliable.

Can you suggest good literature for this problem?

--- Quote End ---

Whether you have a generated clock in TimeQuest or a derived-clock setting in the Classic Timing Analyzer assigned to the divided-down clock, the fundamental clock skew issue still exists. I don't know of literature on the subject. I just know you need to be careful when timing analysis is comparing a clock skew to a data path delay.

Global routing is thoroughly characterized, so the timing analysis is very accurate for the small skew within a clock domain on a global. But there are uncertainties in the timing when the clock skew is from something other than global routing. The timing analysis is using all numbers at the slow PVT extreme or all numbers at the fast PVT extreme (depending on whether you choose the slow or fast model for timing analysis), but the numbers probably are not all at the extreme for a given path at your particular process, voltage, and temperature combination. The timing analysis has to compare the clock-path delay to the source register, the clock-path delay to the destination register, and the data-path delay between registers. The clock skew is the difference between clock-path delays. Will the clock skew be a little faster compared to data delay than the extreme numbers say? Will it be a little slower? With today's timing models for FPGAs (not just Altera FPGAs), you don't know. That's beyond the scope of slow-model and fast-model analysis.

Quartus provides clock setup uncertainty and clock hold uncertainty settings for either analyzer. Cross-domain clock skew between a divided-down clock (even if on a global) for one register and another clock for the other register is one of the cases where I recommend adding some uncertainty, but I can't tell you how much. Most people don't bother. Most people don't think about it in the first place. Some people assume guard bands in the timing analysis take care of it, but I don't like that argument because those guard bands are meant to cover other uncertainties--they weren't necessarily intended to cover this one. To be proper you should make an allowance for the skew-vs.-data-delay uncertainty yourself.
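The comparison described above can be written out as simple arithmetic. The following is a rough sketch, not how Quartus computes anything internally: it uses the textbook setup and hold relationships for a single-cycle path, and all function names, parameters, and delay values are made up for illustration.

```python
def clock_skew(clk_to_src_ns, clk_to_dst_ns):
    """Skew is the difference between the two clock-path delays."""
    return clk_to_dst_ns - clk_to_src_ns


def setup_slack(period_ns, clk_to_src_ns, clk_to_dst_ns, data_delay_ns,
                t_co_ns=0.0, t_su_ns=0.0, uncertainty_ns=0.0):
    """Setup: data launched at edge 0 must arrive before the next capture edge."""
    arrival = clk_to_src_ns + t_co_ns + data_delay_ns
    required = period_ns + clk_to_dst_ns - t_su_ns - uncertainty_ns
    return required - arrival


def hold_slack(clk_to_src_ns, clk_to_dst_ns, data_delay_ns,
               t_co_ns=0.0, t_h_ns=0.0, uncertainty_ns=0.0):
    """Hold: new data must not arrive before the same edge has captured old data."""
    arrival = clk_to_src_ns + t_co_ns + data_delay_ns
    required = clk_to_dst_ns + t_h_ns + uncertainty_ns
    return arrival - required


# Hypothetical path from a base-clock register into a divided-clock register:
# the destination clock arrives 1.5 ns later than the source clock.
print(clock_skew(0.5, 2.0))                             # 1.5
print(setup_slack(10.0, 0.5, 2.0, 1.0))                 # 10.5
print(hold_slack(0.5, 2.0, 1.0))                        # -0.5
print(hold_slack(0.5, 2.0, 1.0, uncertainty_ns=0.25))   # -0.75
```

In this made-up case the skew helps setup but breaks hold, and adding uncertainty tightens the check further. If the real skew and data delay do not track the extreme-corner numbers together, a small positive reported slack may not be a real margin.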

It's better to avoid a divided-down clock in the first place unless you make it global and have no synchronous cross-domain paths so that this uncertainty isn't an issue.

--- Quote Start ---

1) My design [with a divided clock] will be faster than the clock enable version because I don't have the delay of the (very) high fan-out of the clock enable signal, which can slow down the design (can it?).

--- Quote End ---

No matter what the n value for a divide by n, the clock enable paths have to operate in a single clock cycle. As I've said at http://www.alteraforum.com/forum/showthread.php?p=2255#post2255, nonglobal routing might be better than global even for a high-fan-out clock enable because there is a big delay associated with the global buffer. Try both global and nonglobal to see which gives better slack.

The clock enable might not be as high a fan-out on a single signal as you would expect from the RTL. Synthesis tools tend to include other logic in the clock enable in addition to what is directly implied by the HDL "if" statement for the RTL clock enable. That's why you often see a large number of clock enable signals in the "Control Signals" table in the Fitter compilation report.

If you do have a timing problem from the fan-out on a clock enable, replicate the source of the clock enable. There are multiple ways to do this ranging from letting the tools do a brute-force replication without regard to where the clock enable destinations need to be placed to a manual replication in the RTL that intentionally groups the destinations according to where they will be placed on the device.

altera_forum · Answer

I have a similar problem. I apologize in advance if the answer is in a previous post, but I'm kinda new at this game so I might not have recognized it. Here's my problem:

-I'm using Quartus 7.1 with Classic Timing Analyzer for a Stratix II design.

-I have a 150 MHz clock which is divided by a simple f/18 state machine, yielding an 8.3 MHz derived clock.

-I have a collection of logic being clocked by the 8.3 MHz. derived clock, with no synchronous data paths into or out.

-The timing analyzer is reporting setup violations based on the assumption that the available setup time is 6.667 ns, while in fact the allowable setup time within this slow clock domain is 18 times that.

-Based on the previous posts, I would be wise to eliminate the derived clock, but assume I can't and must deal with it.

I'm thinking I can fix this with a simple multicycle assignment, but I can't for the life of me figure out how. Can somebody please suggest the proper form of the multicycle assignment that will fix this problem?

Thanks,

Bruce
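The figures in the post above can be checked with a few lines of Python. This only verifies the arithmetic (period of the base clock, frequency and period of the divided clock); it does not say which Quartus assignment resolves the reported violations. The names `period_ns`, `FAST_MHZ`, and `DIVIDE_BY` are hypothetical.

```python
def period_ns(freq_mhz):
    """Convert a frequency in MHz to a period in nanoseconds."""
    return 1000.0 / freq_mhz


FAST_MHZ = 150.0   # base clock from the post
DIVIDE_BY = 18     # f/18 state machine from the post

fast_period = period_ns(FAST_MHZ)        # about 6.667 ns
slow_mhz = FAST_MHZ / DIVIDE_BY          # about 8.333 MHz
slow_period = fast_period * DIVIDE_BY    # about 120 ns

print(round(fast_period, 3), round(slow_mhz, 3), round(slow_period, 3))
```

So the analyzer is checking paths inside the slow domain against roughly 6.667 ns, while the period those registers actually see is about 120 ns.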
