Altera_Forum
Honored Contributor
15 years ago
Clock Skew doubt
Hello,
Clock period = (register to register delay) - (clock skew) + Micro Tco + Micro Tsu
Why are we subtracting "clock skew"?
Thanks, AA

That is the minimum clock period (fmax is then 1/that).
You can picture this formula by drawing two clock edges and imagining two registers receiving them:

-----|------|------
.........1..2.........

1 is micro tco, 2 is micro tsu. The tco comes from the launch register. Clock and data then travel to the latch register, which has a tsu window (data must not transition inside it).

If data and clock arrive with zero difference, the minimum period is 1 + 2 only.
If data is late (normally the case inside an FPGA), add the difference, since the data takes that much longer to reach the tsu window.
If clock is late (not favoured), subtract the difference. This may seem to increase fmax, but it causes hold violations.

The clock should arrive late, right? Because if it arrives early then it is going to capture the wrong data.
Thanks, AA

You need to think of clock and data as two partners going hand in hand from the launch register towards the latch register. Ideally they travel at the same speed through the same natural delay. In practice one may lag behind.
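To put numbers on the sign convention discussed so far, here is a minimal Python sketch. The function names and all nanosecond values are made up for illustration, and clock skew is assumed to mean (clock arrival at latch register) minus (clock arrival at launch register), so a positive value is a late latch clock.

```python
def min_clock_period(t_co, t_su, data_delay, clock_skew):
    """Minimum clock period in ns for one launch/latch pair.

    clock_skew is assumed to be latch clock arrival minus launch
    clock arrival, so a late latch clock gives a positive skew.
    """
    return t_co + data_delay + t_su - clock_skew


def fmax_mhz(period_ns):
    """Convert a period in ns to a frequency in MHz."""
    return 1000.0 / period_ns


# Hypothetical values, all in ns.
no_skew = min_clock_period(t_co=0.5, t_su=0.25, data_delay=4.0, clock_skew=0.0)
late_clock = min_clock_period(t_co=0.5, t_su=0.25, data_delay=4.0, clock_skew=0.25)
early_clock = min_clock_period(t_co=0.5, t_su=0.25, data_delay=4.0, clock_skew=-0.25)

print(no_skew, late_clock, early_clock)  # 4.75 4.5 5.0
print(fmax_mhz(early_clock))             # 200.0
```

A late latch clock shortens the minimum period, which is why the skew term is subtracted. The same lateness eats into the hold margin, which this sketch does not check.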
FPGA vendors design their clock routing so that the clock is never late, and additionally advise users not to gate it. So data is delayed slightly more than the clock. These delays are small relative to the clock period, so reading the data is not a problem; what we are looking at is the register timing window.

I am reading about timing analysis in the documents on the Altera website, and I have a very hard time visualizing all the details. How did you learn it? Can you send me some of the important resources you have collected?
Thanks.

Hi,
The basic concepts are not difficult. What makes them difficult is the documentation and its chunks of numbers and formulae. Unfortunately many posts that pop up here add to the confusion. I have never seen an easy document but have collected plenty of notes; however, I don't want to add further personal docs to the forum and mess it up. It will help you to start step by step, and I don't mind going through that on the forum directly, posting day to day or week to week as convenient. The first step is to know the meaning of the timing concepts. Do you have a clear idea of them?

Hi,
Thanks. I want to master timing analysis. Here's what I know:

1. At a latch register the data must be present at the register's input a certain time before the latch edge [setup time of the register, T(setup)] and must stay stable at the input a certain time after the latch edge [hold time of the register, T(hold)]. When these conditions are met the data will be latched and a metastable condition at the output is prevented.
2. Apart from T(setup) and T(hold) there is T(co) [the time elapsed after the latch edge for the data to move from the register's input to the register's output].
3. In some documents anything specific to the register is prefixed with the word "Micro". In my case Micro T(hold), Micro T(setup), Micro T(co).
4. Since registers are placed apart, the time taken by the clock to arrive at the clock input of the farther register [say register b] will be greater than the time taken by the clock to arrive at the register [say register a] that comes before register b. This is called clock skew.
5. The above four points are used to calculate the "internal Fmax of the system".
6. To calculate the external Fmax of the system I need to consider "T(port -> input pin of the register)" and "T(output pin -> port)".

What is confusing me?

1. I don't understand the + and - associated with the timing parameters in the equations. Sometimes I feel they should have been + instead of -.
2. I don't understand how the setup and hold relationship between the clocks is found.
3. I have difficulty understanding which clock exception ("set_false_path" etc.) I have to use in my timing constraints.

Thanks!! AA

Hi AA,
Much of what you say is correct. I will first put my overview in a nutshell. Later we can go further or discuss conflicts.

timing violations concepts
Whether your tool is the classic timing tool or TimeQuest, the principles are naturally the same. The only difference is that TimeQuest adds features applicable to very high speed issues.

classic requirement:
The tSU/tH requirement, i.e. data should be stable during the timing window at the D input of a clocked register. There is conceptually no difference at all between tSU and tH except for their position relative to the register clock edge. However, because of this opposite orientation, they seem like two separate parameters.

high speed requirements:
Even if you satisfy the classic requirement you may now (with high speed devices) bump into extra restrictions. Among them are the minimum pulse requirement, the minimum period requirement and a tH restriction (here as an extra restriction relative to classic; needs further explanation).

related concepts:
Apart from the above, some other concepts (not necessarily requirements per se) pop up regularly, e.g. tCO, bus skew, clock skew, timing error, sampling error??

tCO is just a parameter representing the time from the clock edge to the Q output stabilising. It is obviously not a requirement, but it is an important parameter when managing the tSU/tH requirement at the latch edge.

Bus skew is a natural event. When we view a bus of multiple bits, it is unlikely that all bits will change together, since each bit has its own register somewhere. This may not matter in itself, but the sampled value can be wrong even though timing is met per bit, e.g. if the bus is read as an address to memory.

Clock delay is also natural. Clock skew means the difference in clock delays between registers. But in timing analysis it is also loosely used to mean the clock being delayed more than the data to a given register.

Sampling error is rarely used in the context of timing. It is used by receiver designers to indicate an ADC sampling point hitting the wrong time (e.g. missing peaks). But I find it useful in FPGAs as well, especially across clock domains, where I believe sampling error is more to blame than an actual tSU/tH violation.

rtl chain:
An RTL chain of registers is the backbone of FPGA design. Logic work is handed from one stage to the next with care at each edge. From a timing perspective, this chain is "broken open" at the first registers and the last registers. If the logic cloud between a pair is too large it may be necessary to dilute it. At the other extreme there might be no cloud at all.

The RTL chain provides a neat framework to control timing requirements. If one pair is controlled then the whole chain may be controlled. Internally, the launch member of a pair provides information about the Q transition, which is the data for the latch member. If such information is not available then it is impossible to control timing. For example, a tool will not report on the first registers unless you provide information on the external register (and board) that joins the chain. Similarly, the tool will not report what will happen at an external latch register unless you provide the required information.

timing window shift:
By far, this is the most important issue. At the mouth of a register, the tSU/tH requirement is intrinsic: tSU is always +ve and in front of the clock edge, tH is always +ve and behind the clock edge. As such it is termed micro tSU/tH or reg tSU/tH. But viewed from a preceding point (launch register or input pins) this window may slide dramatically across the clock edge. This matters to designers more than the micro position because they deal with pins and not the internal micro values (external devices also specify tSU/tH at the pins and not at internal registers). The tool deals with the internal micro values.
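As a small numerical illustration of the window sliding, here is a hedged Python sketch. It assumes the window seen at the pin is the register window offset by (data delay minus clock delay); the function name and every delay value are hypothetical.

```python
def window_at_pin(reg_tsu, reg_th, data_delay, clk_delay):
    """Return (tSU, tH) as seen at the pin, all values in ns.

    Assumption: the register window is shifted by the difference
    between the data path delay and the clock path delay.
    """
    shift = data_delay - clk_delay
    return reg_tsu + shift, reg_th - shift


REG_TSU, REG_TH = 0.25, 0.125  # hypothetical micro tSU/tH in ns

balanced = window_at_pin(REG_TSU, REG_TH, data_delay=1.0, clk_delay=1.0)
slow_data = window_at_pin(REG_TSU, REG_TH, data_delay=2.0, clk_delay=1.0)
slow_clock = window_at_pin(REG_TSU, REG_TH, data_delay=1.0, clk_delay=2.0)

for label, (tsu, th) in [("balanced", balanced),
                         ("slow data", slow_data),
                         ("slow clock", slow_clock)]:
    # The window width tsu + th is the same in every case.
    print(f"{label:10s} tSU={tsu:+.3f} tH={th:+.3f} width={tsu + th:.3f}")
```

With slow data the pin-level tH goes negative, and with a slow clock the pin-level tSU goes negative, yet the width of the window is unchanged.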
The formula for the sliding window in source synchronous systems is this:

tSU (at pin) = reg tSU + (data delay - clk delay)
tH (at pin) = reg tH - (data delay - clk delay)

The same equations apply internally to any launch register's view of the next latch register's tSU/tH. The clock/data delays could make either tSU or tH zero or negative, and the window may appear lengthened or overlapping if the sampling window slides past the clock edge. However, the actual length of the sampling window stays constant, as it is the sum of signed tSU + signed tH.

I truly appreciate the effort you put into your previous post. I understand 80% of what you are talking about, but I am not able to visualize it. I do have a lot of doubts. I think the best way to understand this would be to do extensive reading and then ask the right questions. I will post all my timing related questions in this thread.
Thanks again, AA

Hi AA,
Thanks for your words. I will now add part II so you may focus on further points.

Now, how does a timing tool control tSU/tH inside the FPGA (ignoring io registers, i.e. first fpga register and last external register in chain)?

Note: to visualise this discussion properly, you need to imagine two registers, i.e. the launch/latch registers, as two successive nodes in space. At the same time you need to visualise two clock edges in the time domain at each register, related by a finite delay.

1) tSU violation is avoided by restricting the clock period to a minimum such that the data transition (tCO) of the launch register almost hits the tSU of the latch register. In other words:

Fmax = lowest of 1/[reg tCO + reg tSU + (data delay - clk delay)] across all launch/latch pairs.

This applies to the [edge to next edge] setup relationship, i.e. the setup relationship is between the current launch edge and the following latch edge. It may differ in various systems, e.g. it could be an opposite edge relationship (rise-fall) or involve multicycles.

2) tH violation is avoided at the silicon fabrication stage by making sure the clock is never delayed more than the data (global lines being fast). Except for very fast clocks, fmax has nothing to do with tH: fmax is based on the data transition never hitting the tSU window, so how could it pass across the clock edge and hit the tH window? However, the tool will check tH with respect to the current (not next) edge at the latch register, and normally tCO ensures the data transition is kicked well away from the tH window. In other words, the tH relationship is between the current launch edge and the current latch edge.

The above discussion applies to the classic tSU/tH requirement. With high speed requirements, the pulse/period duration (toggle rate) may obviously have its own restrictions. Moreover, tH violation may now occur despite silicon avoidance, i.e. if the clock is very fast then the data transition (decided by tCO) could be too close to the current edge at the latch register.
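The two checks can be sketched in Python as follows. This is only an illustration under simplifying assumptions: a single delay value per path (no min/max corners), clk_delay taken as latch clock arrival minus launch clock arrival, and hold slack taken as data arrival minus (clk_delay + tH) on the same edge. The Path class, the function names and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    t_co: float        # reg tCO of launch register, ns
    t_su: float        # reg tSU of latch register, ns
    t_h: float         # reg tH of latch register, ns
    data_delay: float  # launch Q to latch D, ns
    clk_delay: float   # latch clock arrival minus launch clock arrival, ns


def min_period(p: Path) -> float:
    """Setup check, launch edge to next latch edge."""
    return p.t_co + p.t_su + (p.data_delay - p.clk_delay)


def hold_slack(p: Path) -> float:
    """Hold check, launch edge to the same latch edge (assumed form)."""
    return (p.t_co + p.data_delay) - (p.clk_delay + p.t_h)


paths = [
    Path("A", t_co=0.5, t_su=0.25, t_h=0.125, data_delay=3.0, clk_delay=0.0),
    Path("B", t_co=0.5, t_su=0.25, t_h=0.125, data_delay=7.5, clk_delay=0.25),
    # Late clock with almost no logic, e.g. behind a gated clock.
    Path("C", t_co=0.5, t_su=0.25, t_h=0.125, data_delay=0.5, clk_delay=1.0),
]

worst = max(paths, key=min_period)         # the slowest pair sets Fmax
fmax_mhz = 1000.0 / min_period(worst)
hold_failures = [p.name for p in paths if hold_slack(p) < 0]

print(worst.name, fmax_mhz)  # B 125.0
print(hold_failures)         # ['C']
```

Path C has the shortest minimum period of the three but fails the hold check, which matches the remark that a late clock looks good for fmax yet causes hold violations.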
When fmax is limited by the high speed restrictions (pulse/period duration, tH) rather than by tSU, it is termed simply "restricted fmax".

Apart from the chain structure of launch/latch pairs, there are cases of feedback, e.g. an accumulator where launch and latch are the same register. Here the same rules apply as if they were a pair.

To improve fmax, one needs to add more registers (pipelining) to break up the cloud of logic between a pair or in a feedback path. Alternatively, one can add more copies of the same register to avoid the pain of rebalancing functionality caused by an extra pipeline stage. In equation terms this reduces the data delay term.

To avoid tH violation, do not gate the clock. If the problem is localised you may try delaying the data involved. For specific sections you may also consider clock phase rotation (using a PLL), but this is more commonly done at io.

Hello Kaz,
Here are some of my doubts:

-----------------Question 1--------------------
Let's say there are two registers [register_a and register_b]. Data goes from register_a to register_b [in space register_a comes before register_b]. Let clock_a and clock_b be their respective clocks. If the two clocks differ, the data is handled by two different clocks and therefore passes between clock domains.

case 1: clock_a is the same as clock_b
Here the TimeQuest Analyzer automatically takes the setup relationship to be one cycle [let's say this is 10ns] and then tries to meet the timing requirement by placing register_b so that the setup time and hold time are not violated.

case 2: they are different clocks
Here the TimeQuest Analyzer takes the setup relationship for the path to be the time between two consecutive rising edges [let's say the time between a rising edge of clock_a and the next rising edge of clock_b is 2ns]. So in this case the setup relationship is 2ns, and the maximum delay the data can have before reaching the input of the destination register is 2ns. Let's assume this cannot be met, in which case I can use:
1. set_false_path
2. set_clock_groups
3. set_multicycle_path [if the rising edges of the two clocks coincide]

my questions:
a) In the paragraph above I described the reason why I would cut the analysis of the unrelated clock domain. Is it the right reason to cut it? :) When I decide to cut the unrelated clock domains, is it just to get fewer timing errors in the TimeQuest Analyzer [TQA] and reduce the strain on the fitter?
b) By using options 1 and 2, I am telling the TQA not to analyze that path. Though it is clocked by two different clocks, someone has to make sure that the rising edge of clock_b doesn't fall where the data at register_b is changing, i.e. that the data stays out of the latching window. Who takes care of this?
c) If this is the right way to use set_false_path then, even before compilation, I will set false paths between all the clock domains that have different frequencies. Is that the right thing to do?
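The 2ns figure in case 2 can be reproduced with a minimal Python sketch. It assumes two ideal, related clocks with rising edges at multiples of their periods (plus an optional offset on clock_b) and searches one common period for the smallest launch-to-latch gap. The function name and the example periods are hypothetical, and this is not a description of how any particular tool implements the search.

```python
from math import gcd


def tightest_setup_relationship(period_a_ps, period_b_ps, offset_b_ps=0):
    """Smallest gap in ps from a rising edge of clock_a (launch) to the
    next strictly later rising edge of clock_b (latch).

    Times are integer picoseconds so the edge arithmetic is exact.
    """
    common_period = period_a_ps * period_b_ps // gcd(period_a_ps, period_b_ps)
    best = None
    for launch in range(0, common_period, period_a_ps):
        # Index of the first clock_b edge strictly after this launch edge.
        k = (launch - offset_b_ps) // period_b_ps + 1
        latch = offset_b_ps + k * period_b_ps
        gap = latch - launch
        best = gap if best is None else min(best, gap)
    return best


# case 1: same 10 ns clock on both registers -> one full cycle
print(tightest_setup_relationship(10_000, 10_000))  # 10000

# case 2: hypothetical 10 ns launch clock and 8 ns latch clock
print(tightest_setup_relationship(10_000, 8_000))   # 2000
```

For the 10 ns and 8 ns pair the gaps over one common period are 8, 6, 4 and 2 ns, so the most restrictive relationship is 2 ns.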
-----------------Question 2--------------------
In the flash presentation "Switching to Time Quest Timing Analysis" [this is an Altera legacy training], slide 16:
a) Since the PLL clock is delayed by 2ns, shouldn't the 1st rising edge of "pll_clk" appear 2ns after the 1st rising edge of "clock"? It's the other way around in the video!!
b) Why do they add +2 while calculating the slack?

Thanks, AA