Forum Discussion

Scarlet
1 year ago

Strange issues on FPGA

Hi All,

I wrote a simple finite state machine in Verilog and ran it on the FPGA, but it never runs stably.

My environment:

- MAX10 10m08 EVB

- Quartus Prime Lite 23.1.1

My Verilog Code:

module top (
	input wire clk,            // Clock signal
	input wire rst,            // Reset signal
	
//	input wire enable_1,
//	input wire enable_2,
	
	input wire DIO_Tick,
	output reg Tick_FPGA_1,
//	output reg Tick_FPGA_2,
	
//	input wire rx_1,
//	input wire rx_2,
	input wire data_valid,
	input wire [11:0] data_in,
	output reg [11:0] data_out,
	
//	output wire gpio_1,
//	output wire gpio_2,
	output reg led_1, led_2, led_3, led_4
);
	
	reg [2:0] state;
	reg [3:0] counter;
	
	parameter IDLE = 3'b000;
	parameter START = 3'b001;
	parameter WAIT = 3'b010;
	parameter BUSY_1 = 3'b011;


	always @(posedge clk or negedge rst) begin
		
		if(!rst) begin
			Tick_FPGA_1 <= 1;
//			Tick_FPGA_2 <= 1;
			state <= IDLE;
		end
		
		else begin
			case(state)
				
				IDLE: begin
					led_1 <= 0;
					led_2 <= 1;
					led_3 <= 1;
					led_4 <= 1;
					if(!DIO_Tick) begin
						state <= START;
					end					
				end
				
				START: begin
				
					Tick_FPGA_1 <= 0;
//					counter <= counter + 4'h1;
					led_1 <= 1;
					led_2 <= 0;
					led_3 <= 1;
//					if(counter == 4'h4) begin
//						counter <= 0;
						state <= WAIT;
//					end					
				end
				
				WAIT: begin
					state <= BUSY_1;
				end
				
				BUSY_1: begin
					led_1 <= 1;
					led_2 <= 1;
					led_3 <= 0;
					Tick_FPGA_1 <= 1;
					
					if(!data_valid) begin
						data_out <= data_in;
						state <= IDLE;
					end
				end
			endcase
		end
	end

endmodule

It encounters two issues:

1. Tick_FPGA cannot return to a high level; based on the LED status, it does not correctly transition to the BUSY_1 state.

2. It does not correctly receive data_valid, causing it to get stuck in the BUSY_1 state and unable to return to the IDLE state.

I've tested many methods, such as:

For Issue 1, I originally used a counter to maintain Tick_FPGA = 0 for a while before transitioning states, but I changed it to not use a counter.

For Issue 2, I extended the duration of data_valid, switched to edge detection, and implemented debouncing to wait for stability, but it still cannot run stably.

I'm out of options and need your help.

15 Replies

  • FvM

    Hi Scarlet,
    it's quite simple: any signal originating outside the FSM clock domain and used to advance the state must be synchronized to the FSM clock.
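
    As a rough sketch of that rule (not the poster's actual fix), both DIO_Tick and data_valid could pass through a two-flop synchronizer, with the FSM testing only the synchronized copies. The module name fsm_inputs_sync and the _s outputs are made up for illustration; the reset values assume both inputs idle high, as the active-low tests in the original code suggest.

    ```verilog
    // Hypothetical helper: two-flop synchronizers for the asynchronous FSM inputs.
    module fsm_inputs_sync (
        input  wire clk,
        input  wire rst,          // active low, as in the original top module
        input  wire DIO_Tick,     // asynchronous, assumed to idle high
        input  wire data_valid,   // asynchronous, assumed to idle high
        output wire DIO_Tick_s,   // synchronized copy for the FSM
        output wire data_valid_s  // synchronized copy for the FSM
    );
        reg [1:0] dio_ff;
        reg [1:0] dv_ff;

        always @(posedge clk or negedge rst) begin
            if (!rst) begin
                dio_ff <= 2'b11;
                dv_ff  <= 2'b11;
            end else begin
                // Stage 0 may go metastable; stage 1 is what the FSM reads.
                dio_ff <= {dio_ff[0], DIO_Tick};
                dv_ff  <= {dv_ff[0], data_valid};
            end
        end

        assign DIO_Tick_s   = dio_ff[1];
        assign data_valid_s = dv_ff[1];
    endmodule
    ```

    In the FSM, if(!DIO_Tick) would then become if(!DIO_Tick_s), and likewise for data_valid. The 12-bit data_in bus is not synchronized bit by bit here; this assumes the sender holds it stable while data_valid is low.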

    @anonimcs wrote

    It's always good to add a default state to prevent (and more importantly recover from) these as much as possible.

    Unfortunately, a default state will not help to recover from illegal states. If there's no condition advancing to a certain state, it will be simply discarded in synthesis.

    To achieve this, you need to specify safe state machine encoding with a synthesis attribute, for example in VHDL:

    type state_type is (s0, s1, s2, s3);
    signal state : state_type := s0;
    attribute syn_encoding : string;
    attribute syn_encoding of state_type : type is "safe";

    Review "State Machine HDL Guidelines" in Quartus Design Manual for details.
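
    Since the original design is in Verilog, a rough equivalent of the VHDL attribute above might look like the sketch below. The module, its states, and the go_sync input are hypothetical, and it assumes Quartus recognizes the state register as a state machine.

    ```verilog
    // Hypothetical three-state FSM used only to show where the attribute goes.
    module safe_fsm_sketch (
        input  wire clk,
        input  wire rst,      // active low, as in the original top module
        input  wire go_sync,  // assumed to be already synchronized to clk
        output reg  busy
    );
        localparam IDLE = 2'b00, RUN = 2'b01, DONE = 2'b10;

        // Quartus synthesis attribute requesting safe state machine encoding
        (* syn_encoding = "safe" *) reg [1:0] state;

        always @(posedge clk or negedge rst) begin
            if (!rst) begin
                state <= IDLE;
                busy  <= 1'b0;
            end else begin
                case (state)
                    IDLE: begin
                        busy <= 1'b0;
                        if (go_sync) state <= RUN;
                    end
                    RUN: begin
                        busy  <= 1'b1;
                        state <= DONE;
                    end
                    DONE: begin
                        busy  <= 1'b0;
                        state <= IDLE;
                    end
                    default: state <= IDLE;
                endcase
            end
        end
    endmodule
    ```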



  • anonimcs

    Hi,

    Firstly, could you post closer-up captures? It's really hard to see the details, and therefore hard to debug. Regarding the issues:

    1) If Tick_FPGA is staying low, your FSM might be stuck in the IDLE state, waiting for DIO_Tick. In the second snippet I see a rapid transition of DIO_Tick to 0 and then back to 1; maybe this transition happens when the FSM is not in the IDLE state but in another state. But since there's no state signal in the waveforms, I cannot be sure. Maybe you can add that one and the counter signals to the waveform?

    2) What is DATA in A4? data_in or data_out? If it's data_in, it could be the same as 1), with the data_valid transition to 0 arriving in another state. But if it's data_out, the FSM should indeed be stuck in the BUSY_1 state, which shouldn't be the case given that data_valid is wide enough for the clock to catch.

    My idea is that you're stuck in the reset state most of the time, and when the FSM is out of that, it barely has time to do anything (I assume A4 is data_in).
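
    If the pulse really can arrive while the FSM is in another state, one possible approach is to latch the falling edge in a flag that stays set until the FSM consumes it. A minimal sketch, with hypothetical names (event_latch, pending, clear) and assuming an active-low signal that idles high:

    ```verilog
    // Hypothetical helper: remembers a falling edge until the FSM clears it.
    module event_latch (
        input  wire clk,
        input  wire rst,        // active low, as in the original top module
        input  wire sig_async,  // e.g. DIO_Tick or data_valid (active low)
        input  wire clear,      // pulsed by the FSM once the event is handled
        output reg  pending     // high while an unhandled event is waiting
    );
        // sync[0] and sync[1] form the synchronizer; sync[2] keeps the
        // previous synchronized value for edge detection.
        reg [2:0] sync;

        always @(posedge clk or negedge rst) begin
            if (!rst) begin
                sync    <= 3'b111;
                pending <= 1'b0;
            end else begin
                sync <= {sync[1:0], sig_async};
                if (sync[2] & ~sync[1])
                    pending <= 1'b1;   // falling edge seen (set wins over clear)
                else if (clear)
                    pending <= 1'b0;
            end
        end
    endmodule
    ```

    The FSM would then wait on pending instead of the raw input and pulse clear when it leaves the waiting state.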

    • FvM
      I agree with the above post that the timing recordings shown don't give much information to debug the issue, except for the simple fact that the state machine is stuck.

      That's not strange but a well-known effect of state machines reading asynchronous input signals without the necessary synchronizer chain. If the input changes simultaneously with the clock edge, the FSM can jump to an illegal state and possibly never leave it.
      • Scarlet

        @FvM wrote:
        I agree with the above post that the timing recordings shown don't give much information to debug the issue, except for the simple fact that the state machine is stuck.

        I've added additional explanations in the previous response.


        @FvM wrote:
        That's not strange but a well-known effect of state machines reading asynchronous input signals without the necessary synchronizer chain. If the input changes simultaneously with the clock edge, the FSM can jump to an illegal state and possibly never leave it.

        Yes, I learned after working with FPGA that if the input changes simultaneously with the clock edge, it can cause errors. That's why I tried adding synchronization to wait for the external signal to stabilize, but it seems to not resolve the issue of not receiving the external data_valid signal.

        		data_valid_sync1 <= data_valid;
        		data_valid_sync2 <= data_valid_sync1;
    • Scarlet

      Let me explain the process of this FPGA:

      1. IDLE: After receiving the DIO_Tick signal, it transitions to the START state.
      2. START: It sends a low-level Tick_FPGA signal, then enters a delay state (originally using a counter), and transitions to the BUSY state.
      3. BUSY: It returns the Tick_FPGA signal to high level and waits for the data_valid signal, before going back to IDLE.

      (CNVST, BUSY, CS, SCK, and DATA are mainly used to confirm whether the local side has received the Tick_FPGA signal and can be ignored.)

      The following is the correct timing:

      (This data_valid signal has been modified in both sending and receiving)


      @anonimcs wrote:

      Hi,

      Firstly, could you post closer-up captures? It's really hard to see the details, and therefore hard to debug. Regarding the issues:

      1) If Tick_FPGA is staying low, your FSM might be stuck in the IDLE state, waiting for DIO_Tick. In the second snippet I see a rapid transition of DIO_Tick to 0 and then back to 1; maybe this transition happens when the FSM is not in the IDLE state but in another state. But since there's no state signal in the waveforms, I cannot be sure. Maybe you can add that one and the counter signals to the waveform?




      It encounters two issues:

      1. Tick_FPGA cannot return to a high level:

      Overview of the timing diagram:

      Zoom in (LEDs routed to I/O to observe the state):

      Here, the Tick_FPGA signal suddenly goes low.

      According to the LEDs I set for each FSM state, it has jumped to an illegal state.

      2. FPGA does not correctly receive data_valid

      It can be seen that the preceding timing is correct, but sometimes data_valid is not received, causing the FPGA to get stuck.

      According to the FSM's LED indicators, it is stuck in the BUSY state waiting for data_valid.

      I have tried adding synchronization to wait for the external signal to stabilize, but it seems to not resolve the issue:

      		data_valid_sync1 <= data_valid;
      		data_valid_sync2 <= data_valid_sync1;

      @anonimcs wrote:

      2) What is DATA in A4? data_in or data_out? If it's data_in, it could be the same as 1), with the data_valid transition to 0 arriving in another state. But if it's data_out, the FSM should indeed be stuck in the BUSY_1 state, which shouldn't be the case given that data_valid is wide enough for the clock to catch.

      My idea is that you're stuck in the reset state most of the time, and when the FSM is out of that, it barely has time to do anything (I assume A4 is data_in).


      -> I'm sorry for the confusion. DATA is the signal sent by this FPGA to another FPGA using the Tick_FPGA signal to drive the ADC's SPI DATA.

      This FPGA's data is transmitted in parallel (originally, I intended to use serial, but the code I wrote didn't meet expectations; that's another issue). Due to the large number of channels required for 12 bits, I used SPI's DATA for confirmation.

      	input wire [11:0] data_in,
      	output reg [11:0] data_out,

      I really can't find a way to solve these two problems, so I'm hoping to get some help.

      • FvM

        Hi,
        we see that the unexpected state is entered on a DIO_Tick edge. Where does this signal originate? I guess it's also asynchronous and needs synchronization.

  • Scarlet

    I have a question: why can the FSM jump to an illegal state if the input changes simultaneously with the clock edge?

    • FvM

      Good question. The main reason is clock and data path delay skew between the multiple state variable registers, so that one register "sees" the old input state and another sees the new one. A state transition from "1000" to "0100" can result in either the "0000" or the "1100" illegal state. Depending on the implemented logic, the state machine may be able to recover only by a reset. The safe state machine logic mentioned above enforces a transition from an illegal state to a legal one, usually the initial state.
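
      The "1000" to "0100" example can be reproduced in a small simulation-only sketch. It assumes one-hot next-state logic in which bit 3 clears and bit 2 sets when an input named go (hypothetical) is high, and it lets each bit see its own copy of go to mimic the skew:

      ```verilog
      // Simulation-only illustration; skew_demo, next_state and go are made-up names.
      module skew_demo;
          // Next-state logic for the "1000" -> "0100" transition.
          // Each state bit gets its own view of the input 'go'.
          function [3:0] next_state(input [3:0] s,
                                    input go_seen_by_bit3,
                                    input go_seen_by_bit2);
              begin
                  next_state    = 4'b0000;
                  next_state[3] = s[3] & ~go_seen_by_bit3; // stay while go is low
                  next_state[2] = s[3] &  go_seen_by_bit2; // advance when go is high
              end
          endfunction

          initial begin
              $display("both see new : %b", next_state(4'b1000, 1'b1, 1'b1)); // 0100
              $display("both see old : %b", next_state(4'b1000, 1'b0, 1'b0)); // 1000
              $display("bit3 new only: %b", next_state(4'b1000, 1'b1, 1'b0)); // 0000
              $display("bit2 new only: %b", next_state(4'b1000, 1'b0, 1'b1)); // 1100
          end
      endmodule
      ```

      The last two cases are the illegal states: no bit set, or two bits set at once. A synchronizer in front of the FSM ensures that every state bit sees the same value of the input.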

  • Scarlet

    I have another question. Originally, I had a Tx/Rx transmission module receiver, but I encountered a deadlock issue during testing, so I switched to parallel transmission. This time, I deeply realized that when receiving external signals, it’s important to process and wait for stability first.

    I modified the module receiver using the same concept, and although it doesn't lead to illegal states causing deadlock, it sometimes fails to correctly receive data from Rx.

    I am currently testing and inferring that the issue is also caused by the input changing simultaneously with the clock edge.

    My Tx/Rx architecture:

    My Tx transmission format is:

    Start bit + 12bit data(LSB to MSB) + Stop bit = total 14bit, refer to:

    My Receiver Code:

    module receiver.v

    Result

    On the left is the SPI data, which is sent through Tx after reversing the MSB, and on the right is the data received by Rx:

    I have confirmed that the timing and state are both normal:

    First data:

    Start bit | Tx Data        | Stop bit | Rx Data
    0         | 0110 0000 0010 | 1        | 0x406

    Sixth data:

    Start bit | Tx Data        | Stop bit | Rx Data
    0         | 1110 0000 0010 | 1        | 0x007
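
    Assuming the Tx Data column lists bits in transmission order (LSB first, as in the frame format described earlier), the expected Rx values can be checked with a small simulation-only module. The names frame_check and from_lsb_first are made up here.

    ```verilog
    // Simulation-only check of the two frames listed above.
    module frame_check;
        // 'sent' holds the 12 data bits in transmission order, first bit on the left.
        // The first bit sent is the LSB, so the order is reversed to get the value.
        function [11:0] from_lsb_first(input [11:0] sent);
            integer i;
            begin
                for (i = 0; i < 12; i = i + 1)
                    from_lsb_first[i] = sent[11 - i];
            end
        endfunction

        initial begin
            $display("first : 0x%h", from_lsb_first(12'b0110_0000_0010)); // 0x406
            $display("sixth : 0x%h", from_lsb_first(12'b1110_0000_0010)); // 0x407
        end
    endmodule
    ```

    Under that assumption the first frame matches the received 0x406, while the sixth frame should decode to 0x407, so the received 0x007 differs only in bit 10.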

    Observing the execution time of the shift_reg based on the changes in GPIO.

    			if(current_state == RECEIVE) begin
    				shift_reg <= {RX_sync, shift_reg[11:1]};
    				bit_count <= bit_count + 1;
    				gpio <= ~gpio;
    			end

    Due to the use of a synchronization register, the GPIO will be delayed by one clock cycle.

    	//* ----- Synchronization Register ----- *//
    	reg [1:0] rx_sync;
    
    	// Registers are updated on every rising clock edge
    	always @(posedge clk) begin
    		 // Synchronize the external signal step by step to the FPGA internal clock
    		 rx_sync <= {rx_sync[0], rx};
    	end
    
    	// Sync Signal
    	wire RX_sync = rx_sync[1];

    I initially had no way to solve this issue, but referring to a recent discussion[1], if I use oversampling during Rx reception, it should help avoid the problem of the input Rx changing simultaneously with the clock edge.

    So generally, when designing a receiver, oversampling is used to solve this issue?

    [1] SYNCHRONIZED RS232 DATA CORRUPTED
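
    For what it's worth, a common receiver structure samples each bit near its middle by counting clock cycles at a multiple of the bit rate. The sketch below assumes clk runs at exactly OVERSAMPLE times the bit rate, an idle-high rx line, and the frame described above (start bit, 12 data bits LSB first, stop bit). The module and port names are hypothetical, and it has not been tested on hardware.

    ```verilog
    // Hypothetical oversampling receiver sketch.
    module rx_oversample #(
        parameter OVERSAMPLE = 16,  // clk cycles per bit (assumption)
        parameter DATA_BITS  = 12
    )(
        input  wire                 clk,
        input  wire                 rst_n,      // active-low reset
        input  wire                 rx,         // asynchronous serial input
        output reg  [DATA_BITS-1:0] data,
        output reg                  data_ready  // one-clock pulse per good frame
    );
        localparam IDLE = 2'd0, START = 2'd1, DATA = 2'd2, STOP = 2'd3;

        reg [1:0]           rx_sync;
        reg [1:0]           state;
        reg [7:0]           sample_cnt;
        reg [3:0]           bit_cnt;
        reg [DATA_BITS-1:0] shift_reg;
        wire rx_s = rx_sync[1];     // synchronized copy of rx

        always @(posedge clk or negedge rst_n) begin
            if (!rst_n) begin
                rx_sync    <= 2'b11;
                state      <= IDLE;
                sample_cnt <= 0;
                bit_cnt    <= 0;
                shift_reg  <= 0;
                data       <= 0;
                data_ready <= 1'b0;
            end else begin
                rx_sync    <= {rx_sync[0], rx};
                data_ready <= 1'b0;
                case (state)
                    IDLE: begin
                        sample_cnt <= 0;
                        bit_cnt    <= 0;
                        if (!rx_s) state <= START;      // possible start bit
                    end
                    START: begin
                        // Move to the middle of the start bit and re-check it.
                        if (sample_cnt == OVERSAMPLE/2 - 1) begin
                            sample_cnt <= 0;
                            state      <= rx_s ? IDLE : DATA;
                        end else
                            sample_cnt <= sample_cnt + 1'b1;
                    end
                    DATA: begin
                        // One full bit time later is the middle of the next bit.
                        if (sample_cnt == OVERSAMPLE - 1) begin
                            sample_cnt <= 0;
                            shift_reg  <= {rx_s, shift_reg[DATA_BITS-1:1]};
                            if (bit_cnt == DATA_BITS - 1) state <= STOP;
                            else bit_cnt <= bit_cnt + 1'b1;
                        end else
                            sample_cnt <= sample_cnt + 1'b1;
                    end
                    STOP: begin
                        if (sample_cnt == OVERSAMPLE - 1) begin
                            sample_cnt <= 0;
                            state      <= IDLE;
                            if (rx_s) begin             // valid stop bit
                                data       <= shift_reg;
                                data_ready <= 1'b1;
                            end
                        end else
                            sample_cnt <= sample_cnt + 1'b1;
                    end
                    default: state <= IDLE;
                endcase
            end
        end
    endmodule
    ```

    Because the sampling point sits about half a bit away from any rx transition, an edge that coincides with a clock edge only shifts the sample point by one clk cycle instead of corrupting the sampled bit.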

  • RichardT_altera

    Hi,


    May I know if the issue has been resolved, or if you still need assistance with this case?


    Regards,

    Richard Tan


    • Scarlet

      Yes, the issue has been resolved.
      Thanks.

  • RichardT_altera

    Thank you for the confirmation.


    I will now transition this thread to community support. If you have any further questions or concerns, please don't hesitate to reach out. Please log in to ‘https://supporttickets.intel.com’, view the details of the desired request, and post a response within the next 15 days to allow me to continue to support you. After 15 days, this thread will be transitioned to community support.

    The community users will be able to help you on your follow-up questions.


    Thank you and have a great day!


    Best Regards,

    Richard Tan