State machine crashes (Cyclone II) - no idea why. How do I debug?
I have implemented bit-mapped VGA graphics using a CycloneII and fast 12ns external SRAM, connected to an MC68000. I am trying to get EMUTOS to run on this hardware, and it does boot to the desktop. However, I have a problem which I have now confirmed to be the SRAM arbiter state machine crashing. When this happens I lose the picture.
When I make changes to the code, sometimes it works and sometimes it doesn't. It feels like the hardware equivalent of a wayward software pointer stomping over memory 😐 . I have had this problem for months, ever since I first wrote the code. It happened on my mk1 144-pin QFP CycloneII homebrew board, and now it happens on my mk2 208-pin CycloneII board. I now have enough pins for 8 diagnostic LEDs, which helped confirm the suspicion (I bring out the state variable to the 8 LEDs). I have also now captured the event on my scope.
Timing is clean. The clock to the state machine is from the 50MHz Xtal. The video pixel clock is 25MHz which is simply the Xtal divided by 2. The 68k runs at 10MHz. I am very familiar with the issue of signals crossing clock domains. The state machine is an arbiter which allows access to either the video generator or the 68k.
Most of the time it simply sits in the idle state, but every 160ns there is one cycle to fetch the next pixel.
I have carefully written the state machine to avoid any missing next state values.
reg [7:0] r_arb_state;
localparam VRAM_ARB_STATE_IDLE = 8'b00000000;
localparam VRAM_ARB_STATE_READPIXEL = 8'b00000001;
localparam VRAM_ARB_STATE_S1 = 8'b00000011;
localparam VRAM_ARB_STATE_S2 = 8'b00000111;
localparam VRAM_ARB_STATE_S3 = 8'b00001111;
localparam VRAM_ARB_STATE_WRITEIDLE = 8'b00011111;
localparam VRAM_ARB_STATE_S4 = 8'b11000011;
localparam VRAM_ARB_STATE_S5 = 8'b11000111;
localparam VRAM_ARB_STATE_S6 = 8'b11001111;
localparam VRAM_ARB_STATE_ILLEGAL = 8'b11111111;
In the picture below the state machine crash can be seen.
No - adding a picture does not seem to work with my browser and the Intel pop up window 😦
Instead of staying in the IDLE state (00000000), it changes to 11000000, which is an undefined state. From there the "default" branch takes it to the ILLEGAL state (11111111), where it stays (and video is lost). The blue low-going signals are the SRAM's active-low chip select.
always @ (posedge w_vram_clk)
begin
    if (IO_RSTN==1'b0) begin
        r_arb_state <= VRAM_ARB_STATE_IDLE;
        r_pixelread_done <= 1'b0;
        r_writeeven_done <= 1'b0;
        r_writeodd_done <= 1'b0;
        VRAM_addr <= vga_addr;
        VRAM_dataout <= vga_dataout;
        VRAM_cs <= vga_cs;
        VRAM_we <= vga_we;
    end else begin
        case (r_arb_state)
            VRAM_ARB_STATE_IDLE:
            begin
                r_pixelread_done <= 1'b0;
                r_vram_writeaddreven <= #3 {r_VideoRamOffset,1'b0}; // 14:0 = 32k bytes
                r_vram_writeaddrodd <= #3 {r_VideoRamOffset,1'b1}; // 14:0 = 32k bytes
                // does the video generator need to read pixel data? (occurs at regular intervals)
                if (vga_cs==1'b0) begin
                    VRAM_addr <= vga_addr;
                    VRAM_dataout <= vga_dataout;
                    VRAM_cs <= vga_cs;
                    VRAM_we <= 1'b1; // was vga_we - but not used outside of reset
                    r_arb_state <= VRAM_ARB_STATE_READPIXEL;
                end else begin
                    VRAM_cs <= 1'b1;
                    VRAM_we <= 1'b1;
                    VRAM_addr <= VRAM_addr;
                    VRAM_dataout <= VRAM_dataout;
                    if ((r_writeeven_done==1'b0)&&(r_write_uds==1'b1)) begin
                        r_arb_state <= VRAM_ARB_STATE_S1;
                    end else begin
                        if ((r_writeodd_done==1'b0)&&(r_write_lds==1'b1)) begin
                            r_arb_state <= VRAM_ARB_STATE_S4;
                        end else begin
                            r_arb_state <= VRAM_ARB_STATE_IDLE;
                            if (r_write_uds==1'b0) begin
                                r_writeeven_done <= 1'b0; // acked
                            end else begin
                                r_writeeven_done <= r_writeeven_done;
                            end
                            if (r_write_lds==1'b0) begin
                                r_writeodd_done <= 1'b0; // acked
                            end else begin
                                r_writeodd_done <= r_writeodd_done;
                            end
                        end
                    end
                end
            end
I can't understand why the state machine is crashing. Timing is good. 50MHz is not excessive for this chip. I have pondered whether it might be a power supply glitch, but the +5V is from a good Rigol PSU and this is regulated to 1.2V by a regulator on the PCB. I have plenty of decoupling caps.
I am desperate for ideas on how to diagnose the problem. I am pulling my hair out, so I am reaching out to this forum for help!
State machines usually crash because they have entered an illegal, undefined state.
This can happen when a sampled input signal is either asynchronous to the state machine clock, or poorly synchronized to it.
Such a signal fans out to two separate parts of the state machine's transition logic, and when it changes close to a clock edge it can be interpreted as high in one part and low in the other.
This can then cause a transition to an illegal state.
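As a rough illustration of what "properly synchronized" means here, the usual approach is to pass each asynchronous input through two flip-flops clocked by the state machine's clock, and to let the state machine see only the output of the second one. The sketch below is generic: the module and signal names are hypothetical and are not taken from your design.

```verilog
// Two-flop synchronizer sketch. All names are hypothetical.
module sync_2ff (
    input  wire clk,       // destination clock (the clock the state machine runs on)
    input  wire async_in,  // signal generated in another clock domain
    output wire sync_out   // version that is safe to test in state machine conditions
);
    // Power-up value of 0 is an assumption; pick whatever is inactive for your signal.
    reg [1:0] r_sync = 2'b00;

    always @(posedge clk) begin
        // Stage 0 samples the raw input and may go metastable;
        // stage 1 gives it a full clock period to settle.
        r_sync <= {r_sync[0], async_in};
    end

    assign sync_out = r_sync[1];
endmodule
```

The important point is that only one register ever samples the raw signal, so every branch of the next-state logic sees the same value in any given cycle. Whether inputs such as vga_cs, r_write_uds or r_write_lds already go through something like this cannot be judged from the code posted.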
You don't show enough of your code to know how this might apply in your case. The module header, and how all input signals are generated, is necessary to know.
Quartus will on occasion re-encode the state machine in another form (usually one-hot) where each defined state is implemented as being encoded with just one state bit set. You need to look in your report files to see if this was done (or not). Doing this can be disabled by user control.
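As a sketch of the user control mentioned above: to my knowledge Quartus accepts a syn_encoding synthesis attribute on the state register, and there is also a "State Machine Processing" option in the Analysis & Synthesis settings. The small state machine below is a made-up example, not your arbiter, and it is worth checking the attribute values against the Quartus documentation for your version.

```verilog
// Hypothetical three-state machine showing the syn_encoding attribute.
module fsm_encoding_example (
    input  wire clk,
    input  wire rst_n,
    input  wire go,
    output wire busy
);
    localparam ST_IDLE = 2'b00;
    localparam ST_WORK = 2'b01;
    localparam ST_DONE = 2'b11;

    // "user" asks Quartus to keep the encoding given by the localparams
    // instead of re-encoding the machine (for example as one-hot).
    (* syn_encoding = "user" *) reg [1:0] r_state;

    always @(posedge clk) begin
        if (!rst_n) begin
            r_state <= ST_IDLE;
        end else begin
            case (r_state)
                ST_IDLE: r_state <= go ? ST_WORK : ST_IDLE;
                ST_WORK: r_state <= ST_DONE;
                ST_DONE: r_state <= ST_IDLE;
                default: r_state <= ST_IDLE; // unused code 2'b10 recovers to idle
            endcase
        end
    end

    assign busy = (r_state != ST_IDLE);
endmodule
```

If the machine has been re-encoded, the bit patterns seen on LEDs or a scope will not match the localparam values, and a "default" branch may be optimized away, so it helps to confirm the encoding in the state machine section of the synthesis report before interpreting a captured pattern.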
As you mention, synchronous clock timing can also be a cause, but Quartus should be able to tell you which paths (if any) did not meet your 50MHz timing.