Forum Discussion

Altera_Forum's avatar
Altera_Forum
Icon for Honored Contributor rankHonored Contributor
18 years ago

Strange State Machine Compilation - Compiler Bug?

Hi @all,

I'm using Quartus 7.2 and I'm wondering about the implementation of FSMs which are coded with one segment (process) style and havn't a default assignment for the state variable/signal.

If the traditional 3 segment (process) coding style is used it isn't possible to omit this default assignment because the "next_state" must be fully declared inside the process which describes the combinational circuit for the state transitions.

But inside the one segment style I can omit the statement.

example:

process(clk,reset)
	variable reset_timer_var : std_logic;
begin
	if reset = '1' then
		output <= "00";
		reset_timer_var := '0';
		state <= init;
	elsif clk'event and clk = '1' then
		reset_timer_var := '0';
		if trafficLight_ena = '0' then
			reset_timer_var := '1';
			state <= init;
		else
			case state is
				when init =>
					output <= "00";
					if trafficLight_ena = '1' then
						reset_timer_var := '1';
						state <= red;
					-- missing:
					--else
					--	state <= init;
					end if;
				when green=>
					output <= "01";
					if timer_max = '1' then
						state <= yellow;
					-- missing:
					--else
					--	state <= green;
					end if;
				when yellow=>
					output <= "10";
					reset_timer_var := '1';
					state <= red;
				when red=>
					output <= "11";
					if timer_max = '1' then
						reset_timer_var := '1';
						state <= green;
					-- missing:
					--else
					--	state <= init;
					end if;
			end case;
		end if;
		reset_timer <= reset_timer_var;
	end if;
end process;

The State Machine Viewer shows an identical transition table and encoding if the else statements are commented out.

But the synthesis result is different:

If I omit the default assignment the transition from state to itself is implemented with negation of all other states which bloats the state machine and the used FPGA resources.

I havn't figured out if this is a compiler bug or if this implementation is intended by the written code (I don't think so).

Can anyone explain this behavior?

TIA

Axel

9 Replies

  • Altera_Forum's avatar
    Altera_Forum
    Icon for Honored Contributor rankHonored Contributor

    Hello Axel,

    I can't verify resource usage differences using or not using the else statements. I guess the last else should be state <= red, but that also makes no significant difference. Could it be that additional non-default synthesis settings are in effect in your dessign? Also, some defaults may depend on used device, which is unknown.

    In constrast, when using the "safe" FSM coding style, the design needs a few more resources, but two LE less when the else statements are included, actually no big effect.

    If anything causes the reported behaviour, it isn't apparently in the shown code.

    When not coding a safe FSM, I would expect the two cases to produce identical gate level netlists, cause the compiler is free to minimize the logic as far as possible. As an example, before Altera introduced the "safe" option with Quartus, an additional others case, causing a transition to the init state always had been kicked out during synthesis.

    Regards,

    Frank
  • Altera_Forum's avatar
    Altera_Forum
    Icon for Honored Contributor rankHonored Contributor

    Hello Frank,

    thanks for your fast reply.

    You aren't right. I had forgotten that it only happens in greater FSMs. In the following example I have added one signal to the state transition conditions.

    With else:

    14 LEs, 11 Registers, fmax: 556 Mhz

    Without else:

    15 LEs, 11 Registers, fmax: 373 Mhz

    I use identical synthesis and fitter options. (safe state machine encoding is off; standard one-hot encoding)

    You say "I would expect the two cases to produce identical gate level netlists, cause the compiler is free to minimize the logic as far as possible." - I also expect the same. But it doesn't result in the same Post-Mapping and thats the reason for this thread...

    
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;
    entity performancetest_1_fsm2conditions is
    	port(
    		clk              : in std_logic;
    		reset            : in std_logic;
    		trafficLight_ena : in std_logic;
    		input            : in std_logic;
    		output           : out std_logic_vector(1 downto 0)
    	);
    end entity performancetest_1_fsm2conditions;
    architecture performancetest of performancetest_1_fsm2conditions is
    	type STATES is (init, green, yellow, red);
    	signal state : STATES;
    	signal reset_timer : std_logic;
    	signal timer_max   : std_logic;
    	
    component timer
    	port(
    		clk         : in std_logic;
    		reset       : in std_logic;
    		reset_timer : in std_logic;
    		timer_max   : out std_logic
    	);
    end component;
    begin
    timer1:timer 
    	port map (
    	clk         => clk,
    	reset       => reset,
    	reset_timer => reset_timer,
    	timer_max   => timer_max
    	);
    process(clk,reset)
    	variable reset_timer_var : std_logic;
    begin
    	if reset = '1' then
    		output <= "00";
    		reset_timer_var := '0';
    		state <= init;
    	elsif clk'event and clk = '1' then
    		reset_timer_var := '0';
    		if trafficLight_ena = '0' then
    			reset_timer_var := '1';
    			state <= init;
    		else
    			case state is
    				when init =>
    					output <= "00";
    					if trafficLight_ena = '1' then
    						reset_timer_var := '1';
    						state <= red;
    					end if;
    				when green =>
    					output <= "01";
    					if timer_max = '1' and input = '0' then
    						state <= yellow;
    					end if;
    				when yellow =>
    					output <= "10";
    					reset_timer_var := '1';
    					state <= red;
    				when red =>
    					output <= "11";
    					if timer_max = '1' and input = '1' then
    						reset_timer_var := '1';
    						state <= green;
    					end if;
    			end case;
    		end if;
    		reset_timer <= reset_timer_var;
    	end if;
    end process;
    end performancetest;
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;
    entity timer is
    	port(
    		clk         : in std_logic;
    		reset       : in std_logic;
    		reset_timer : in std_logic;
    		timer_max   : out std_logic
    	);
    end entity timer;
    architecture timercnt of timer is
    begin
    timecnt:process(clk, reset)
    	variable timer : integer range 0 to 7;
    begin
    	if reset = '1' then
    		timer := 0;
    	elsif clk'event and clk = '1' then
    		if reset_timer = '1' then
    			timer := 0;
    		else
    			timer := timer + 1;
    		end if;
    		if timer = 7 then
    			timer_max <= '1';
    		else 
    			timer_max <= '0';
    		end if;
    	end if;
    	
    end process;
    end timercnt;
    
  • Altera_Forum's avatar
    Altera_Forum
    Icon for Honored Contributor rankHonored Contributor

    Hello Axel,

    I can verify the one LE difference you reported and see, that postfit netlists are different. But I don't see a timing difference, if I would, as long as timing isn't a critical application parameter, I would regard it a minor implementation detail. Usually other parts of a design will be more an object to speed optimization.

    I think, the test doesn't show much more than arbitrary results of the ambiguous "one state hot" coding. In contrast, if you define "sequential" coding style, the two test cases result in identical post mapping netlists (as far as I see) because the unique coding has no degree of freedom (at least with 2^n states).

    Thus I I don't see clear conclusions for FSM coding from the test, e. g. if the redundant "holding" transitions are meaningful somehow. I'll continue to omit them simply for writing economy, the same reason why I generally prefer what you call "one segment" form.

    Thank you for the interesting discussion,

    Frank
  • Altera_Forum's avatar
    Altera_Forum
    Icon for Honored Contributor rankHonored Contributor

    The two descriptions (with and without the explicit else) may have equivalent behavior but they don't create equivalent initial netlists during elaboration. In the explicit else case, the next state logic is fed by constants. Omit the else and the logic is fed by the current state register, creating feedback in the data path for the next state logic. Think of each IF statement as creating a MUX. In the first case, the MUX has two constant inputs. In the second case, the MUX has one constant input and one non-constant input - the current state register.

    Whenever you start from two different netlists, you aren't guaranteed identifcal results. The synthesis engine could be happier optimizing constant inputs. If your netlist has a feedback in the data path, it may trigger a different set of optimizations that produce a different result, possibly better or possibly worse or possibly equivalent.

    If you notice potential for significantly optimizing a real example, file a support request with Altera.
  • Altera_Forum's avatar
    Altera_Forum
    Icon for Honored Contributor rankHonored Contributor

    Hello,

    You need a "feedback in datapath" anyway to hold a present state. Either as input to an "IF" MUX, or as selector for a "CASE" MUX. If you imagine that the compiler merges the two levels (if and case) at the state register input when creating the "initial netlist", the logic would be identical for both cases.

    Regards,

    Frank
  • Altera_Forum's avatar
    Altera_Forum
    Icon for Honored Contributor rankHonored Contributor

    True, the current state register necessarily feeds the control logic of the selector that likely implements the case. I'm arguing that the feedback in the data lines probably makes a difference. Even if the compiler collapses the select + muxes, you probably won't end up with "identical" logic - it all depends on the smarts behind the algorithm. What matters here isn't functional equivalence but structural equivalence. In many tools, you can get a totally different result simply by switching a couple gates around in a functionally equivalent way. It's just the undeniable nature of heuristic algorithms. Maybe if the compiler collapsed using BDDs...

    In any case, my point was simply that just because two descriptions have the same behavior does not mean they must necessarily have the same QoR in a synthesis tool.
  • Altera_Forum's avatar
    Altera_Forum
    Icon for Honored Contributor rankHonored Contributor

    Hello Axel,

    I can contribute another interesting observation regrading your code example. In the detail below, you see that the if trafficlight_ena = '1' then condition is always true cause it is evaluated after the preceeding if trafficlight_ena = '0' then.

    if trafficLight_ena = '0' then
    	reset_timer_var := '1';
    	state <= init;
    else
    	case state is
    	when init =>
    		output <= "00";
    		if trafficLight_ena = '1' then -- always true, but...
    			reset_timer_var := '1';
    			state <= red;
    		end if;

    Whatever the compiler may synthezise from the redundant IF statement, it produces a third netlist variant if you omit it and a possible else state <= init statement. Try yourself. Furthermore, the coding differences from additional else in green and red state are vanishing. Somewhat surprizing.

    I stumbled on the "dead code", when I tryed to understand, if the different netlists would cause identical behaviour.

    Regards,

    Frank
  • Altera_Forum's avatar
    Altera_Forum
    Icon for Honored Contributor rankHonored Contributor

    Hello Frank,

    thanks for your interesting observation.

    I continued the investigation and try to summarize my results. I think the state machine viewer has a bug and the compiler inserts redundant logic in special cases:

    I used the code which I posted on January 11th, 2008 05:03 PM.

    I have 4 different versions:

    architecture performancetest of performancetest_1_fsm2conditions_2 is
    	type STATES is (init, green, yellow, red);
    	signal state : STATES;
    	signal reset_timer : std_logic;
    	signal timer_max   : std_logic;
    	
    component timer
    	port(
    		clk         : in std_logic;
    		reset       : in std_logic;
    		reset_timer : in std_logic;
    		timer_max   : out std_logic
    	);
    end component;
    begin
    timer1:timer 
    	port map (
    	clk         => clk,
    	reset       => reset,
    	reset_timer => reset_timer,
    	timer_max   => timer_max
    	);
    process(clk,reset)
    	variable reset_timer_var : std_logic;
    begin
    	if reset = '1' then
    		output <= "00";
    		reset_timer_var := '0';
    		state <= init;
    	elsif clk'event and clk = '1' then
    		reset_timer_var := '0';
    		if trafficLight_ena = '0' then
    			reset_timer_var := '1';
    			state <= init;
    		else
    			case state is
    				when init =>
    					if trafficLight_ena = '1' then -- removed at version 3 & 4
    						output <= "00";
    						reset_timer_var := '1';
    						state <= red;
    					else               -- removed at version 1 & 3 & 4
    						state <= init; -- removed at version 1 & 3 & 4
    					end if;
    				when green =>
    					output <= "01";
    					if timer_max = '1' and input = '0' then
    						state <= yellow;
    					else                -- removed at version 1 & 3
    						state <= green; -- removed at version 1 & 3
    					end if;
    				when yellow =>
    					output <= "10";
    					reset_timer_var := '1';
    					state <= red;
    				when red =>
    					output <= "11";
    					if timer_max = '1' and input = '1' then
    						reset_timer_var := '1';
    						state <= green;
    					else              -- removed at version 1 & 3
    						state <= red; -- removed at version 1 & 3
    					end if;
    			end case;
    		end if;
    		reset_timer <= reset_timer_var;
    	end if;
    end process;
    end performancetest;

    1.: "without else" with redundant if statement

    -> removed redundant else statements "state <= actual state";

    2.: "with else" with redundant if statement = the posted version

    3.: "without else" without redundant if statement

    -> removed redundant condition "if trafficLight_ena = '1' then" inside state init

    (= dead code because there is a "global" if statement outside the case)

    -> removed redundant else statements "state <= actual state";

    4.: "with else" without redundant if statement:

    -> removed redundant condition "if trafficLight_ena = '1' then" inside state init

    (= dead code because there is a "global" if statement outside the case)

    If I use sequential state machine encoding all 4 versions are identical (technology map viewer post fit) and the state machine viewer displays the expected transitions.

    12 LEs; 9 Reg; 494,56 MHz

    
    init	0	0
    green	0	1
    yellow	1	0
    red	1	1
    init	init	(!trafficLight_ena)
    init	red	(trafficLight_ena)
    green	init	(!trafficLight_ena)
    green	green	(!timer:timer1).(trafficLight_ena) + (timer:timer1).(input).(trafficLight_ena)
    green	yellow	(timer:timer1).(!input).(trafficLight_ena)
    yellow	init	(!trafficLight_ena)
    yellow	red	(trafficLight_ena)
    red	init	(!trafficLight_ena)
    red	green	(timer:timer1).(input).(trafficLight_ena)
    red	red	(!timer:timer1).(trafficLight_ena) + (timer:timer1).(!input).(trafficLight_ena)
    

    If I use one hot state machine encoding I get completely different results and I'm quite sure now that something is wrong!

    state machine encoding is equal for all versions:

    init 0 0 0 0

    green 0 0 1 1

    yellow 0 1 0 1

    red 1 0 0 1

    without else = without redundant default state assignment

    1.: "without else" with redundant if statement

    =============================

    15 LEs; 11 Reg; 380,37 Mhz

    - normal state transitions are displayed inside state machine viewer

    - at first glance redundant conditions for state transitions

    -> state.red <= state.red*tmp + !tmp*trafficLight_en*!(timer_max*(input*state.red) + !input*state.green))

    tmp = (trafficLight_en*(!timer_max + !state.green*!input + !state.red*input)*state.init*!state.yellow)

    -> tmp comprises all states! so I'm not sure If the compiler generates redundant logic, but tmp is also used for state transition to state.green

    - if I use optimization option speed I get the identical result as in version 2

    2.: "with else" with redundant if statement = the posted version

    =========================================

    13 LEs; 10 Reg; 551,27 Mhz

    - normal state transitions are displayed inside state machine viewer

    - state.init register optimized away

    - optimization area, balanced and speed are identical

    3.: "without else" without redundant if statement

    ===============================

    14 LEs; 11 Reg; 459,77 Mhz

    - normal state transitions are displayed inside state machine viewer

    - the transition (the data input for register) for state.red "is as coded"

    - optimization area, balanced and speed are identical

    4.: "with else" without redundant if statement

    =============================

    14 LEs; 11 Reg; 459,77 Mhz

    - BUG in state machine viewer:

    all transitions to state.init are missing in the table and in the chart

    - but result is identical to version 3

    If I compare the gate level logic for setting state.red register

    version 1

    state.red <= state.red*tmp + !tmp*trafficLight_en*!(timer_max*(input*state.red) + !input*state.green))

    tmp = (trafficLight_en*(!timer_max + !state.green*!input + !state.red*input)*state.init*!state.yellow)

    with version 2:

    state.red <= trafficLight_ena*!state.green*!(state.red*timer_max*input)

    or version 3 / 4:

    state.red <= traffic_ena*((state.red*!(timermax*input)) + !(!state.yellow*state.init))

    I'm confused about the results, because I coded and intended the same hardware.

    In fact version 1 and 2 have identical RTL Viewer and State machine viewer output. But the results are different. I tested the impact of the missing default state assignments at another bigger component and the effect was the same.

    I would be happy if someone can verify the State Machine Viewer Bug and halp to investigate the impact of (redundant/non redundant) if conditions outside the case statement in correlation with redundant default state assignments.

    TIA,

    Axel
  • Altera_Forum's avatar
    Altera_Forum
    Icon for Honored Contributor rankHonored Contributor

    Hello Axel,

    I don't know, if I'll have the time to follow the problem in detail. First I have to continue some customer projects. I started to compare gate level implementation, but found it somewhat annoying. I didn't see an explicite bug yet. Basically, all transitions from illegal states must be considered undefined, unless you don't have a "safe" FSM. At some of these illegal states, the state machine could be stuck forever, if accidently reaching it. As minimal reason, a timing violation on any input signal could cause this situation.

    I expect the implementation differences to show only for illegal states, if this guess is true, you can't say one implementation is right and the other wrong. They're all correct as long as the defined behaviour is exactly implemented. I didn't pay much attention to the FSM viewer, I probably would have missed a bug here.

    By the way: Safe FSM option had been introduced with Quartus with V 7.0, I believe, that's rather fresh. Before the option was included, there was a good chance that Quartus would optimize respectively eliminate any try to implement safe FSM logic. Apparently, FSM support has been neglected for a period of time.

    Regards,

    Frank