Forum Discussion

Altera_Forum
Honored Contributor
9 years ago

set_input_delay constraint

Hi,

I have a bidirectional synchronous interface that I want to constrain. Without any constraints the interface doesn't work. With the constraints I tried it works, but Quartus reports that the timing requirements aren't met. I have problems constraining the input signals coming from the external chip. The chip runs on a 100MHz clock (PCLK) which comes from the FPGA and is inverted relative to the FPGA clock (FPGA_CLK) that generates the control and data signals. The inversion is done with an ALTDDIO register. I want to sample the control flags on the rising edge of FPGA_CLK. Everything I have read about the set_input_delay constraint assumes that a clock coming from the external device is aligned with the data. In my case I only have the clocks generated inside the FPGA, so I don't know whether I need a virtual clock here too. I tried it with and without a virtual clock, and in both cases I get the timing requirements not met warning.

Attached you can see a capture of the control signals of the interface. PCLK comes from the FPGA and goes into the external device. Now I want to constrain the FLAGD signal, which needs to be sampled by the FPGA. It arrives with a maximum delay of 8ns (7ns in the attachment) after the rising edge of PCLK and needs to be sampled on the next rising edge of FPGA_CLK. My .sdc file currently looks like this (shortened, output port included):

create_clock -period 20 
derive_pll_clocks
create_generated_clock -name out_clk -source |muxsel}] -divide_by 1 -multiply_by 1 -invert 
derive_clock_uncertainty
set_false_path -to 
# setup and hold for data send to the external interface
set tsu_fx3       2
set th_fx3        0.5
# board delays. I don't have exact values.
set tbd_data_min  0.65
set tbd_data_max  1.26
set tbd_clk_min   0.72
set tbd_clk_max   1.15
set out_max_delay 
set out_min_delay 
set_output_delay -max $out_max_delay -clock out_clk 
set_output_delay -min $out_min_delay -clock out_clk 
set tco_fx3_min   0
set tco_fx3_max   8 
set in_max_delay 
set in_min_delay 
set_input_delay -clock {out_clk} -max $in_max_delay 
set_input_delay -clock {out_clk} -min $in_min_delay 
set_multicycle_path -setup -to  2

Is this a good basis or are my constraints completely wrong? What is the correct value for tco_min? It is not mentioned in the datasheet. Which clock do I need as the reference for my set_input_delay constraint?

Regards

8 Replies

  • Altera_Forum
    Honored Contributor

    I'm just looking at the FLAGD input. At 100MHz, with a launch on falling edge and multicycle setup of 2, you should have a setup relationship of 15ns and hold relationship of 5ns. This makes sense as the difference is 10ns, which is your data rate. So right off the bat, your external max is 9.26ns and your min is 0.65ns, for a difference of 8.61ns. So 8.61ns of your 10ns window is already chewed up before the FPGA delays are even taken into account.

    I can say right now that the max/min spread in the FPGA will be greater than 1.39ns (I'm 98% sure). I just don't think this interface can work with those specs.
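
    The window arithmetic above can be reproduced with a few lines of plain Tcl. This is only a rough sketch using the values from the original post. The names period, ext_max, ext_min, ext_spread and fpga_budget are made up for illustration, and it assumes the external delay is just the FX3 clock-to-out plus the data board delay.

    ```tcl
    # Values from the original post (ns)
    set period       10.0   ;# 100 MHz data rate
    set tco_fx3_min   0.0
    set tco_fx3_max   8.0
    set tbd_data_min  0.65
    set tbd_data_max  1.26

    # External delay seen at the FPGA pin
    # (assumption: FX3 clock-to-out + data board delay only)
    set ext_max [expr {$tco_fx3_max + $tbd_data_max}]
    set ext_min [expr {$tco_fx3_min + $tbd_data_min}]

    # Part of the data window used up outside the FPGA, and what is left
    set ext_spread  [expr {$ext_max - $ext_min}]
    set fpga_budget [expr {$period - $ext_spread}]

    puts [format "external max/min: %.2f / %.2f ns" $ext_max $ext_min]
    puts [format "external spread:  %.2f ns" $ext_spread]
    puts [format "left for FPGA:    %.2f ns" $fpga_budget]
    ```

    Run in tclsh or the TimeQuest Tcl console, this prints 9.26 / 0.65 ns, 8.61 ns and 1.39 ns, the same figures as above. Changing tco_fx3_min shows how quickly the budget opens up once the real minimum is known.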

    The first thing I would check is the tco_fx3_min of 0ns. If the other device has a PLL it's possible to get a Tco close to 0, but then the max wouldn't be 8ns. I'm guessing it doesn't have a PLL, the 8ns max is correct, and the 0ns min is not real. I'd be curious what it is if you measure it. But when an external device chews up 8ns of the data window, it really can't run at 100MHz.

    (If the min increased to 4ns, for example, it still might be tough to meet timing, but closer to possible. You might have to do some tricks like have the latch and launch clocks on different outputs of the PLL and phase shift one or the other. Hard to say without knowing the specs and having something to compile).

    I did that analysis quickly, so let me know if I missed anything. Good luck.
  • Altera_Forum
    Honored Contributor

    Thank you for the reply! Yes, you're right. The tco_max of 8ns is correct and tco_min with 0 is wrong. I set the tco_min now to 6ns and added the clock delays to the min and max values. But the timing still fails.

    Here are my settings:

    
    set tbd_data_min  0.65
    set tbd_data_max  1.26
    set tbd_clk_min   0.72
    set tbd_clk_max   1.15
    set tco_fx3_min   6
    set tco_fx3_max   8 
    set in_max_delay 
    set in_min_delay 
    set_input_delay -clock {out_clk} -max $in_max_delay 
    set_input_delay -clock {out_clk} -min $in_min_delay 
    set_multicycle_path -setup -from  2
    

    With these settings the FLAG is sampled correctly in the FPGA, but the timing report says the timing requirements aren't met. The data path from the FLAGD pin to the register is reported to be 2.383ns, and the timing report shows a slack of -1.024. The interface is the synchronous slave FIFO of an FX3. The specs are in http://www.cypress.com/file/136056/download starting on page 6.

    Are these values already relative to my PCLK clock output pin? I also added "-reference_pin [get_ports PCLK]" but got the warning "Reference pin PCLK is invalid. It is not clocked by the clock specified in set_input_delay/set_output_delay's -clock option".

    Do you have an idea why it still fails?
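
    For reference, here is a rough sketch of what "adding the clock delays" could look like. The expressions in the listing above did not survive, so the formula below (clock trace to the FX3, plus FX3 clock-to-out, plus data trace back) is an assumption, not necessarily what was actually used.

    ```tcl
    # Board and FX3 values from the settings above (ns)
    set tbd_data_min  0.65
    set tbd_data_max  1.26
    set tbd_clk_min   0.72
    set tbd_clk_max   1.15
    set tco_fx3_min   6.0
    set tco_fx3_max   8.0

    # Assumed formula: clock trace out + FX3 clock-to-out + data trace back
    set in_max_delay [expr {$tbd_clk_max + $tco_fx3_max + $tbd_data_max}]
    set in_min_delay [expr {$tbd_clk_min + $tco_fx3_min + $tbd_data_min}]

    puts [format "in_max_delay = %.2f ns" $in_max_delay]
    puts [format "in_min_delay = %.2f ns" $in_min_delay]

    # In the .sdc these would feed the FLAGD constraints (TimeQuest only):
    # set_input_delay -clock {out_clk} -max $in_max_delay [get_ports FLAGD]
    # set_input_delay -clock {out_clk} -min $in_min_delay [get_ports FLAGD]
    ```

    Under that assumption the values come out as 10.41ns and 7.37ns. Against the 15ns setup relationship this would leave roughly 4.6ns for everything inside the FPGA, i.e. the clock path out to PCLK plus the FLAGD path to the register.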
  • Altera_Forum
    Honored Contributor

    I assume the tCO of 6~8 ns comes from the external chip's data sheet and must be relative to the launch edge at the external chip's clock pin. What you are telling TimeQuest is that the 6~8 ns is relative to the FPGA's out_clk. This is possibly the source of the conflict.

    FLAGD is sampled by an internal FPGA clock and TimeQuest relates that to out_clk, so it may see data arriving too early if, for example, the latch clock at the FLAGD register is delayed inside the FPGA by, say, 4ns relative to out_clk.

    So you need either to adjust tCO so that it is relative to the FPGA's out_clk, or to create a virtual clock delayed from out_clk and use it as the reference for set_input_delay.
  • Altera_Forum
    Honored Contributor

    You may just be failing setup altogether. You've got a 15ns round-trip requirement, with ~8ns chewed up externally, leaving 7ns for the FPGA. I would have thought it could make this, but hard to say without the design.

    - Note that the fitter tries to meet both setup and hold, so if it can't meet both, it balances the two, i.e. it might be adding delay to get better slack on a failing hold even though setup fails too (e.g. it's better to have hold fail by -2ns and setup by -2ns than to have hold fail by -4ns and setup meet timing). I doubt this is happening in your case since your hold requirement is 5ns and your min external delay is 6ns, so you've automatically met hold before adding in any FPGA delays. Now that I write it, this should definitely not be a problem. (If I thought it were, I would a) look for hold failures on this path in the fast timing corners, or b) comment out the min delay, so the min is the same as the max, and see if it still fails setup.) Again, I don't think this is a problem.

    - I don't think you feel comfortable with the constraints. Run the following in TimeQuest:

    report_timing -setup -npaths 10 -detail full_path -from [get_ports FLAGD] -panel_name "FLAGD setup" -file "./TQ/FLAGD_setup.txt"

    report_timing -hold -npaths 10 -detail full_path -from [get_ports FLAGD] -panel_name "FLAGD hold" -file "./TQ/FLAGD_hold.txt"

    The setup is the one to be concerned about, but it's always good to look at hold. Anyway, look at the Data Path tab. The Data Arrival Path should show the base clock coming into the FPGA, going to the output clock PCLK, then wrapping back into the FLAGD port to the input register. There should be an iExt delay of 8ns (plus brd_dly). Basically this is the clock coming into the FPGA and creating data that wraps around externally to the latch register. The Data Required Path should then be that same clock coming into the FPGA and going to the latch register's .CLK pin. Once you understand that, you can see how the delays are calculated and add up. It may be that it just doesn't meet timing. If you want, attach the FLAGD_setup.txt to this post.

    If it just can't meet setup, two thoughts:

    - The quick one is to make it launch on the rising edge rather than falling. You would need to change the logic to accommodate this, but you could just do a test run in the .sdc by removing the -invert in the generated clock and compiling. This will make your hold relationship 10ns and setup 20ns. This buys 5ns of delay for setup and it should be able to meet timing. My only concern is that now the hold is not a slam dunk, i.e. the relationship is 10ns, the external delay is ~7ns, so the FPGA needs to add at least 3ns. This probably occurs automatically by what it routes through, but if it doesn't the fitter will need to add delay, which makes setup a little tighter.

    - Another option if you can't thread this needle is to use a PLL and have separate clocks drive the launch and latch edges. This way you can dial the latch clock exactly where you want it, i.e. it could have a +7ns shift or something like that (with edges, you have to be a multiple of 5ns).
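
    The relationships quoted for the two cases can be checked with a small helper. This is a sketch in plain Tcl, not something TimeQuest provides: the proc name relationships and its arguments are invented here, and it only covers a launch and latch clock of the same period with the latch on the rising edge.

    ```tcl
    # Setup/hold relationship for a launch and latch clock of the same period.
    #   period      : clock period in ns
    #   launch_time : time of the launch edge within the period
    #                 (0 = rising edge, period/2 = rising edge of the inverted clock)
    #   mc_setup    : multicycle setup value (1 = default)
    proc relationships {period launch_time mc_setup} {
        # Setup check: first latch rising edge after the launch edge,
        # pushed out by (mc_setup - 1) periods
        set setup [expr {$period - $launch_time + ($mc_setup - 1) * $period}]
        # Default hold check: one latch period before the setup latch edge
        set hold [expr {$setup - $period}]
        return [list $setup $hold]
    }

    lassign [relationships 10.0 5.0 2] setup_inv hold_inv   ;# with -invert
    lassign [relationships 10.0 0.0 2] setup_pos hold_pos   ;# without -invert

    puts [format "with -invert:    setup %.1f ns, hold %.1f ns" $setup_inv $hold_inv]
    puts [format "without -invert: setup %.1f ns, hold %.1f ns" $setup_pos $hold_pos]

    # Minimum FPGA-side delay needed for hold with ~7 ns external minimum
    set ext_min 7.0
    puts [format "FPGA must add at least %.1f ns for hold" [expr {$hold_pos - $ext_min}]]
    ```

    This gives 15/5ns with the inversion and 20/10ns without it, plus the 3ns the FPGA has to contribute for hold in the non-inverted case.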
  • Altera_Forum
    Honored Contributor

    It seems that the data delay is too long with the clock and input delay and it works because the delay is in fact not as long as I told Quartus.

    I think that I have to redesign the whole state machine, including the clocks. After I combined my two FSMs for reading and writing with a FIFO into a loopback, the reported FMAX is lower than my clock and the output delays of some data pins cannot be met.

    Before I start I have some questions:

    The interface has a maximum tCO of 7ns and 8ns (data/flags) and a tCDH (clock to data hold) of 2ns. At the interface clock of 100MHz I only have a data valid window of 5ns around the clock. Currently I have the fpga_clock and the inverted fpga_clock as PCLK and no output registers. Then my setup relationship between launch and latch clock for outgoing data is 5ns, right? Would it be better to add a register to every signal going to the interface? Would that give me a looser setup relationship?

    How can I constrain that I only have a data valid window of 5ns?

    Suppose I have a clk100 generating the data and control signals for writing to the interface, a clk100_shifted clock to meet setup times on the interface side, and another phase-shifted clock for clocking in the data read from the interface. Couldn't I get problems with setup times between clk100 and the clock latching the FLAGS/data from the interface, since I need to read the signals in the clk100 domain?

    Thank you for your detailed answers!
  • Altera_Forum
    Honored Contributor

    Up until now, I haven't looked at the set_output_delay constraints at all. It sounds like you're source-synchronously writing to the device. Anyway yes, registering your outputs should give a faster, more consistent Tco time. You should have a setup relationship of 5ns and hold of -5ns. Looking at your constraints, I think th_fx3 should be negative when calculating the min value:

    set out_min_delay [expr $tbd_data_min - $th_fx3 - $tbd_clk_max]

    (By making the external delay negative, the FPGA delay must be larger to counteract it. For example, if the hold relationship is -5ns and the external delay is -2ns, then the FPGA delay can't be shorter than -3ns or it will fail timing. If you have the hold value as positive, then with a -5ns hold relationship and +2ns external delay, the FPGA delay would have to be less than -7ns to fail timing.)
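
    To put numbers on this for the write direction, here is a rough Tcl sketch with the values from the original post. The out_max_delay expression is an assumption (a common form for a forwarded clock, since the original expression was not shown), and fpga_min / fpga_max are illustrative names for the allowed data delay relative to the forwarded clock.

    ```tcl
    # Values from the original post (ns)
    set tsu_fx3       2.0
    set th_fx3        0.5
    set tbd_data_min  0.65
    set tbd_data_max  1.26
    set tbd_clk_min   0.72
    set tbd_clk_max   1.15

    # Max: assumed form (longest data trace + setup - shortest clock trace)
    set out_max_delay [expr {$tbd_data_max + $tsu_fx3 - $tbd_clk_min}]
    # Min: the form suggested above, with th_fx3 subtracted
    set out_min_delay [expr {$tbd_data_min - $th_fx3 - $tbd_clk_max}]

    # Relationships quoted above for the write direction
    set setup_rel  5.0
    set hold_rel  -5.0

    # Allowed FPGA data delay relative to the forwarded clock
    set fpga_max [expr {$setup_rel - $out_max_delay}]
    set fpga_min [expr {$hold_rel - $out_min_delay}]

    puts [format "out_max_delay %.2f ns, out_min_delay %.2f ns" $out_max_delay $out_min_delay]
    puts [format "FPGA delay window: %.2f ns to %.2f ns" $fpga_min $fpga_max]
    ```

    With these inputs the output delays come out as 2.54ns and -1.00ns, so the data has to leave the FPGA between -4.00ns and 2.46ns relative to the clock output. Flipping the sign of th_fx3 in the min expression shows how much the hold side loosens incorrectly.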

    For the roundtrip that stinks that it's not meeting timing. Can you try removing the -invert and see if that works?

    I would avoid making multiple clock phases until you try everything else, but you probably have few paths going between these different domains and Quartus is pretty good at meeting these requirements internally. You might have to add a multicycle though. For example, say you have PLL_int_clk0 as a regular 10ns clock and PLL_int_clk1 as a 10ns clock with a 1ns shift. Any transfer from clk0 to clk1 will have a default 1ns setup relationship, which might be impossible (a reg to reg transfer might make that), in which case you might have to add a multicycle setup of 2 between those clocks. It should work, it's just another thing to deal with.
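
    The effect of that multicycle can be sketched as follows. The clock names are the example names used above and would have to match whatever derive_pll_clocks actually creates, so treat the commented constraint as a placeholder.

    ```tcl
    # Example numbers from above: two 10 ns clocks, clk1 shifted by +1 ns
    set period 10.0
    set shift   1.0

    # Default clk0 -> clk1 relationships
    set setup_default $shift
    set hold_default  [expr {$shift - $period}]

    # With a multicycle setup of 2, both checks move out by one period
    set setup_mc2 [expr {$setup_default + $period}]
    set hold_mc2  [expr {$hold_default + $period}]

    puts [format "default:      setup %.1f ns, hold %.1f ns" $setup_default $hold_default]
    puts [format "multicycle 2: setup %.1f ns, hold %.1f ns" $setup_mc2 $hold_mc2]

    # The matching constraint in the .sdc (TimeQuest only, placeholder clock names):
    # set_multicycle_path -setup -from [get_clocks PLL_int_clk0] -to [get_clocks PLL_int_clk1] 2
    ```

    The setup relationship goes from 1ns to 11ns. Note that the hold relationship moves along with it, from -9ns to 1ns, so it is worth checking the hold report on those transfers as well.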
  • Altera_Forum
    Honored Contributor

    Without the invert I get no errors for the input ports, and there are fewer failing paths for the outputs. Is Quartus able to meet setup times if I register all my outputs, without the need for a shifted clock?

  • Altera_Forum
    Honored Contributor

    I don't know how it's failing and what you have now (I assume the outputs are registers, they just have combinatorial logic before they go to the pin). If so, registering them would make them have a faster output time. Of course, the fitter can use IO delay chains to slow down the output time too, if it deems it necessary for hold. It is standard practice to register your data outputs, but I don't know the specifics of your situation well enough to say if it will help. It will probably help, and should not hurt.