Altera_Forum
16 years ago

Reduce Clock Setup and Clock Hold times

Hi there,

I have a state machine which performs some math operations. The question is: how can I reduce the clock setup and hold times so that I can meet my timing requirements? Here is the code of the state machine:

module math_synthesis (
        input clk20M,
        input  a, b,
        output reg  result_out
);
//Math registers
reg    dataa_mult, datab_mult,
                 dataa_add, datab_add,
                 dataa_sub, datab_sub,
                 denom_sig, numer_sig;
wire     quotient_sig, remain_sig,
            result_add, result_sub;
wire  result_mult;
//State machine registers:
reg [3:0] currentState;
//Registers initial:
initial begin
        currentState = 4'd0;
end
//States:
parameter  
    STATE_1 = 4'd0,        STATE_2 = 4'd1,
    STATE_3 = 4'd2,        STATE_4 = 4'd3,
    STATE_5 = 4'd4,        STATE_6 = 4'd5,
    STATE_7 = 4'd6,        STATE_8 = 4'd7;
always @(posedge clk20M) begin
    case(currentState)
        STATE_1: begin
                dataa_mult <= a;
                datab_mult <= 16'd977;
                currentState <= STATE_2;
        end
        STATE_2: begin
                numer_sig <= result_mult;
                denom_sig <= 16'd1000;
                
                currentState <= STATE_3;
        end
        STATE_3: begin
                dataa_mult <= quotient_sig;
                datab_mult <= 16'd256;
                currentState <= STATE_4;
        end
        STATE_4: begin
                dataa_sub <= result_mult;
                datab_sub <= 16'd25600;
                currentState <= STATE_5;
        end
        
        STATE_5: begin
                numer_sig <= result_sub;    
                denom_sig <= 16'd100;
                            
                currentState <= STATE_6;
        end
        STATE_6: begin
                result_out <= quotient_sig;            
                currentState <= STATE_1;
        end
    endcase
end
math_mult    math_mult_inst (
    .dataa ( dataa_mult ),
    .datab ( datab_mult ),
    .result ( result_mult )
    );
math_divide    math_divide_inst (
    .denom ( denom_sig ),
    .numer ( numer_sig ),
    .quotient ( quotient_sig ),
    .remain ( remain_sig )
    );
    
math_add    math_add_inst (
    .dataa ( dataa_add ),
    .datab ( datab_add ),
    .result ( result_add )
    );
math_sub    math_sub_inst (
    .dataa ( dataa_sub ),
    .datab ( datab_sub ),
    .result ( result_sub )
    );
    
endmodule

I attached the clock setup and hold times.

If there is any technique which I could use to improve the timings, please share it :)

9 Replies

  • Altera_Forum

    Hi,

    what is your required clock speed? How do you generate the clock (PLL?)

    Kind regards

    GPK
  • Altera_Forum

    Hi,

    My clock speed is 20MHz, but if I want to do more complicated calculations or change the signal widths, it doesn't fit. The clock is generated by an external oscillator.

    Best regards,

    VT
  • Altera_Forum

    Hi VT,

    did you write your arithmetic functions yourself?

    Kind regards

    GPK
  • Altera_Forum

    The math functions are asynchronous, generated by Altera's MegaWizard Plug-In.

    By the way, what does the following warning mean?

        Warning: Synthesized away the following LCELL buffer node(s):
            Warning (14320): Synthesized away node "math_mult:math_mult_inst|lpm_mult:lpm_mult_component|mult_k3n:auto_generated|le10a"
            Warning (14320): Synthesized away node "math_mult:math_mult_inst|lpm_mult:lpm_mult_component|mult_k3n:auto_generated|le10a"
    ...
    
    Thanks,

    VT
  • Altera_Forum

    Hi VT,

    the warning means that the synthesis engine found logic which could be removed without changing the design behaviour.

    As far as I know, all arithmetic functions support so-called pipelining. That means that register stages are inserted in order to improve the timing. Of course, the result will then be available some clock cycles later.

    I have an example of a divider attached. Maybe it could help you.
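
    For readers without the attachment, here is a minimal sketch of what pipelining looks like in plain Verilog. It is not the MegaWizard output: the module name mul_sub_pipe, the 16-bit width and the multiply-then-subtract operation are assumptions chosen only to illustrate the idea. One long combinational path (multiply followed by subtract) is cut into two shorter ones by a register in between.

    ```verilog
    // Hypothetical two-stage pipeline computing y = a*b - c (illustration only).
    module mul_sub_pipe #(
        parameter WIDTH = 16                 // assumed operand width
    ) (
        input                    clk,
        input      [WIDTH-1:0]   a,
        input      [WIDTH-1:0]   b,
        input      [2*WIDTH-1:0] c,
        output reg [2*WIDTH-1:0] y           // updated two clock edges after a, b, c are applied
    );
        reg [2*WIDTH-1:0] prod_q;            // stage-1 register: the product
        reg [2*WIDTH-1:0] c_q;               // c delayed by one clock to stay aligned with prod_q

        always @(posedge clk) begin
            // Stage 1: only the multiplier sits between the inputs and prod_q.
            prod_q <= a * b;
            c_q    <= c;
            // Stage 2: only the subtractor sits between prod_q/c_q and y.
            y      <= prod_q - c_q;
        end
    endmodule
    ```

    Without prod_q the multiplier and subtractor delays would add up inside one clock period. With it, each stage gets a full period of its own, at the cost of one extra clock of latency. A new set of operands can still be applied on every clock.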

    Kind regards

    GPK
  • Altera_Forum

    Thanks for the reply.

    I know I could improve the timing by adding some pipelining, but I want to understand what those timings depend on. Maybe I should read more about the lower levels of synthesis.

    Best regards,

    VT
  • Altera_Forum

    Hi,

    I have attached a small drawing to show you how retiming works.

    Kind regards

    GPK
  • Altera_Forum

    OK, I have one question now :). Does this mean that if I use a PLL as a clock multiplier, for example to get 80MHz, and the timings are OK at 20MHz, they would be OK at 80MHz as well? What I mean is: is the synthesis adjusted to the clock speed, so that the connection timings scale in proportion to the clock?

    Best regards,

    VT
  • Altera_Forum

    Hi,

    as long as you have enough register stages defined, it should work. Of course there is a limit, because more pipelining also means higher device utilization. I have attached a small divider example.
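
    As a rough check of why timing that passes at 20MHz does not automatically pass at 80MHz, the usual textbook setup relation can be written out. This is a simplified sketch that ignores clock skew and uncertainty. The symbols are generic (clock-to-output delay, register-to-register data delay, setup time), not names from the Quartus report, and the 30 ns data delay is a made-up number for illustration.

    ```latex
    % Setup condition for a register-to-register path (skew ignored)
    T_{clk} \ge t_{co} + t_{data} + t_{su}
    \qquad
    \text{slack}_{setup} = T_{clk} - \left(t_{co} + t_{data} + t_{su}\right)

    % Clock periods
    T_{clk}(20\,\text{MHz}) = 50\,\text{ns}
    \qquad
    T_{clk}(80\,\text{MHz}) = 12.5\,\text{ns}

    % Hypothetical path with t_{co} + t_{data} + t_{su} = 30\,\text{ns}
    \text{slack}_{setup}(20\,\text{MHz}) = 50 - 30 = +20\,\text{ns}
    \qquad
    \text{slack}_{setup}(80\,\text{MHz}) = 12.5 - 30 = -17.5\,\text{ns}
    ```

    The delay through the logic and routing is a property of the device and the placement, so it does not shrink when the clock gets faster. Only the available period shrinks, which is why a path that is comfortable at 20MHz can fail at 80MHz unless it is split by extra registers.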

    By the way, I mixed up some terms:

    Pipelining: you or the tool put additional register stages into your design. Your latency changes: the result is available some clock cycles later.

    Retiming: an additional feature to speed up your design. With this feature enabled, the synthesis tool tries to move registers through your logic in order to improve the clock speed. The latency does not change. Hopefully you are not too confused now. Sorry.
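
    Because pipelining changes the latency, a state machine that uses a pipelined arithmetic block has to wait before it samples the result. A minimal sketch of such a wait counter follows. The module name wait_for_pipe, the port names and the default LATENCY of 3 are assumptions for illustration and would have to match the pipeline depth actually chosen for the arithmetic block.

    ```verilog
    // Hypothetical wait-state helper (illustration only).
    // Pulses result_valid LATENCY clock edges after the edge on which start is accepted.
    module wait_for_pipe #(
        parameter LATENCY = 3          // assumed pipeline depth, 1..16 for the 4-bit counter
    ) (
        input      clk,
        input      start,              // high for one clock when operands are applied
        output reg busy,               // high while the pipeline is still working
        output reg result_valid        // one-clock pulse: the result may be sampled
    );
        reg [3:0] count;

        initial begin
            busy         = 1'b0;
            result_valid = 1'b0;
            count        = 4'd0;
        end

        always @(posedge clk) begin
            result_valid <= 1'b0;                  // default: no pulse
            if (!busy) begin
                if (start) begin
                    busy  <= 1'b1;
                    count <= LATENCY - 1;          // remaining edges after this one
                end
            end else if (count == 4'd0) begin
                busy         <= 1'b0;
                result_valid <= 1'b1;              // pipeline output is registered on this edge
            end else begin
                count <= count - 1'b1;
            end
        end
    endmodule
    ```

    With retiming alone no such change is needed, since the number of register stages between input and output stays the same.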

    Kind regards

    GPK