Altera_Forum
13 years ago

Help with metastability problem

Hi,

I'm working on a vision-processing project at uni using a custom board built around a Cyclone III, and I have to modify some VHDL code written by previous years' students.

A method for transmitting data from the board was implemented previously, but it is very static and hard to modify. So, just to test my code, I decided to 'hijack' an area that already writes data out.

In this area, data is written to an instantiated RAM block (of type altsyncram) acting as a buffer. When the buffer has been filled, a ready signal is asserted and the contents of the RAM block are transmitted via an FTDI interface.

So I set up a block that, for the time being, fills the RAM block with hardcoded values (2 values that alternate on each clock cycle) while a valid frame is being read from the camera. When the frame is over, I drive the ready signal high, which triggers the upload process.

The data I am sending is 48 bits wide and has the form:

8-bits : for a color label

10-bits : for x1 coordinate

10-bits : for y1 coordinate

10-bits : for x2 coordinate

10-bits : for y2 coordinate

So I send the following hardcoded alternating data (color_label, x1, y1, x2, y2):

data1 : (4, 1, 1, 1, 1)

data2 : (4, 1, 1, 1, 2)
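
The 48-bit layout above can be sketched as a small packing function. This is only an illustration: the package name obj_pack_sketch and the function pack_obj are hypothetical and are not part of obj_extraction_pkg. The bit positions assume the same order as the concatenation used in the code further down (color label in the top 8 bits, y2 in the bottom 10 bits).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical helper package, for illustration only.
package obj_pack_sketch is
  function pack_obj(color, x1, y1, x2, y2 : natural) return std_logic_vector;
end package;

package body obj_pack_sketch is
  -- Packs one object into a 48-bit word: 8-bit label followed by four 10-bit coordinates.
  function pack_obj(color, x1, y1, x2, y2 : natural) return std_logic_vector is
    variable word : std_logic_vector(47 downto 0);
  begin
    word(47 downto 40) := std_logic_vector(to_unsigned(color, 8));
    word(39 downto 30) := std_logic_vector(to_unsigned(x1, 10));
    word(29 downto 20) := std_logic_vector(to_unsigned(y1, 10));
    word(19 downto 10) := std_logic_vector(to_unsigned(x2, 10));
    word(9 downto 0)   := std_logic_vector(to_unsigned(y2, 10));
    return word;
  end function;
end package body;
```

With this layout, pack_obj(4, 1, 1, 1, 1) and pack_obj(4, 1, 1, 1, 2) differ only in the two least significant bits of the word, since y2 = 1 is "01" and y2 = 2 is "10" in those bits.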

On receiving the data I get random values of (4,1,1,1,0), (4,1,1,1,1), (4,1,1,1,2) or (4,1,1,1,3). This leads me to think that I have a metastability problem, and I believe it has something to do with the RAM block (altsyncram): if I pass the values continuously to the uploader (bypassing the RAM), I get the expected values. However, that is not a viable solution outside of test conditions.

I have attached a picture of my block that is setting the hardcoded values and the RAM block I am writing to.

The code of my block is as follows:

-- INPUTS

FVAL_IN : indicates a valid frame from the camera

DVAL_IN : indicates valid data from the camera

VALID_IN : indicates valid data into this block (currently unused)

buffer_lock : indicates the data in the RAM is being uploaded, so can't write to RAM

LINE_OBJ : the data to write out (currently unused, values are hardcoded for testing)

-- OUTPUTS

buffer_lock_out : used to block the data that used to be written to the RAM (ignore this)

buffer_rdy : the ready signal that starts the upload process

wren : write enable signal to the RAM block

wr_addr : the address to write to RAM

obj_count : the data count written to RAM (ignore this, for external purposes)

wr_data : the data to write to RAM

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.obj_extraction_pkg.all;
entity OBJ_RamWriter is
  port(
    -- Clock Input
    CLK : in std_logic;
    
    -- Inputs
    buffer_lock     : in std_logic  := '0';
    VALID_IN        : in std_logic  := '0';
    DVAL_IN         : in std_logic  := '0';
    FVAL_IN         : in std_logic  := '0';
    LINE_OBJ        : in std_logic_vector(obj_wd-addr_wd-1 downto 0);
    
    -- Outputs
    buffer_lock_out : out std_logic := '0';
    buffer_rdy      : out std_logic := '0';
    wren            : out std_logic := '0';
    wr_addr         : out unsigned(6 downto 0);
    obj_count       : out unsigned(6 downto 0);
    wr_data         : out std_logic_vector(obj_wd-addr_wd-1 downto 0)
  );
end entity;
architecture rt1 of OBJ_RamWriter is
  -- Data Registers
  signal lock_reg   : std_logic := '0';
  signal v_reg      : std_logic := '0';
  signal fvalid_reg : std_logic := '0';
  signal line_reg   : std_logic_vector(obj_wd-addr_wd-1 downto 0);
  
  -- Internal States
  signal count        : unsigned(6 downto 0)  :=  to_unsigned(0,7);
  signal addr         : unsigned(6 downto 0)  :=  to_unsigned(0,7);
  signal rdy_reg      : std_logic := '0';
  signal rdy_reg_temp : std_logic := '0';
  signal odd          : std_logic := '0';
  
  -- Output registers
  signal lock_out_reg   : std_logic := '0';
  signal valid_out_reg  : std_logic := '0';
  
BEGIN
  process (CLK)
    BEGIN
      if (rising_edge(CLK)) then
        -- Store inputs
        lock_reg <= buffer_lock;
        v_reg <= '1';--VALID_IN;
        if (DVAL_IN = '1' and FVAL_IN = '1') then
          fvalid_reg <= '1';
        else
          fvalid_reg <= '0';
        end if;
        if (odd = '0') then
          odd <= '1';
          line_reg <= std_logic_vector(to_unsigned(4, 8) & to_unsigned(1,10) & to_unsigned(1,10) & to_unsigned(1,10) & to_unsigned(1,10));--LINE_OBJ;
          --wr_data <= std_logic_vector(to_unsigned(4, 8) & to_unsigned(1,10) & to_unsigned(1,10) & to_unsigned(1,10) & to_unsigned(1,10));--LINE_OBJ;
        else
          odd <= '0';
          line_reg <= std_logic_vector(to_unsigned(4, 8) & to_unsigned(1,10) & to_unsigned(1,10) & to_unsigned(1,10) & to_unsigned(2,10));
          --wr_data <= std_logic_vector(to_unsigned(4, 8) & to_unsigned(2,10) & to_unsigned(2,10) & to_unsigned(2,10) & to_unsigned(2,10));--LINE_OBJ;
        end if;
      end if;
  end process;
  
  process (lock_reg, v_reg, fvalid_reg, line_reg)
    BEGIN
      if (lock_reg = '0') then  -- Buffer not being read by uploader
        -- Prevents any other output but lines
        lock_out_reg <= '1';
        if (fvalid_reg = '1') then  -- Frame Data to be processed exists 
          rdy_reg_temp <= '0';
          if (rdy_reg = '1') then -- Buffer upload complete, reset
            addr <= to_unsigned(0,7);
            if (v_reg = '1') then -- Valid object ready to be written to buffer
              valid_out_reg <= '1';
              count <= to_unsigned(1,7);
            else  -- No valid object
              valid_out_reg <= '0';
              count <= to_unsigned(0,7);
            end if;
          else  -- Normal writing state
            if (v_reg = '1' and count < to_unsigned(127,7)) then -- Valid object ready to be written to buffer
              addr <= addr + 1;
              count <= count + 1;
              valid_out_reg <= '1';
            else  -- No valid object
              addr <= addr;
              count <= count;
              valid_out_reg <= '0';
            end if;
          end if;
        else  -- No valid frame data left, start upload
          rdy_reg_temp <= '1';
          addr <= addr;
          count <= count;
          valid_out_reg <= '0';
        end if;
      else  -- Buffer being read by uploader
        lock_out_reg <= '1';
        count <= count;
        addr <= addr;
        rdy_reg_temp <= '1';
        valid_out_reg <= '0';
      end if;
  end process;
  
  rdy_reg <= rdy_reg_temp;
  buffer_lock_out <= lock_out_reg;
  buffer_rdy <= rdy_reg_temp;
  wren <= valid_out_reg;
  wr_addr <= addr;
  obj_count <= count;
  wr_data <= line_reg;   
end rt1;

I have a feeling that I may be violating the setup or hold times of the RAM block, but I do not know how to verify this or how to fix it. Any ideas or suggestions would be greatly appreciated, and I would be happy to provide any additional information.

Thanks,

Mat.
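
A note on the code above: the second process has no clock, yet it computes addr and count from their own previous values (addr <= addr + 1). In hardware that is a combinational feedback loop rather than a counter, and its sensitivity list also omits rdy_reg, addr and count, so simulation and synthesis can disagree. That may matter more here than setup or hold times on the RAM. A minimal sketch of the same idea with every state update on the clock edge is shown below. The entity name ram_writer_sketch, the frame_valid input (standing in for DVAL_IN and FVAL_IN both high) and the reduced port list are assumptions for illustration, not the original design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical, simplified writer: all state changes happen on rising_edge(CLK).
entity ram_writer_sketch is
  port(
    CLK         : in  std_logic;
    buffer_lock : in  std_logic;            -- '1' while the uploader reads the RAM
    frame_valid : in  std_logic;            -- assumed: DVAL_IN and FVAL_IN both high
    buffer_rdy  : out std_logic;
    wren        : out std_logic;
    wr_addr     : out unsigned(6 downto 0)
  );
end entity;

architecture rtl of ram_writer_sketch is
  signal addr    : unsigned(6 downto 0) := (others => '0');
  signal rdy_reg : std_logic := '1';  -- starts high so the first frame begins at address 0
  signal wr_reg  : std_logic := '0';
begin
  process (CLK)
  begin
    if rising_edge(CLK) then
      if buffer_lock = '0' and frame_valid = '1' then
        rdy_reg <= '0';
        if rdy_reg = '1' then          -- first word of a new frame
          addr   <= (others => '0');
          wr_reg <= '1';
        elsif addr < 127 then          -- room left in the 128-entry buffer
          addr   <= addr + 1;
          wr_reg <= '1';
        else                           -- buffer full, stop writing
          wr_reg <= '0';
        end if;
      else                             -- frame over or buffer locked
        rdy_reg <= '1';
        wr_reg  <= '0';
      end if;
    end if;
  end process;

  -- Outputs come straight from registers, so address and write enable change together.
  buffer_rdy <= rdy_reg;
  wren       <= wr_reg;
  wr_addr    <= addr;
end architecture;
```

In a fuller version the write data would be registered in the same clocked process, so that wr_data, wr_addr and wren all reach the RAM aligned to the same clock edge.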

15 Replies

  • Altera_Forum

    So I should run the simulation with my clock at the frequency used on the FPGA, and that should tell me whether it will run at that speed? Or will I have to do some other timing analysis as well?

    #Edit#

    I tried the simulation. The system clock is currently 36.15 MHz because it is synchronized to the camera's clock, which runs at that frequency. My previous simulations used a much faster clock, so I adjusted the simulation clock to

    (1/36150000) / 2 ~= 14 ns between every edge (rising and falling)
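
    The half period above can be sketched as a testbench clock generator. This is a minimal, stand-alone example: the entity name clk_36m15_tb and the signals clk and sim_done are made up for illustration, and it assumes the simulator's time resolution is 1 ps or finer so that 27.66 ns is not rounded.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical stand-alone testbench that only generates the clock.
entity clk_36m15_tb is
end entity;

architecture sim of clk_36m15_tb is
  -- 1 / 36.15 MHz is about 27.66 ns, so each half period is about 13.83 ns.
  constant CLK_PERIOD : time := 27.66 ns;
  signal clk      : std_logic := '0';
  signal sim_done : boolean   := false;
begin
  -- Toggle the clock every half period until the stimulus process finishes.
  clk_gen : process
  begin
    while not sim_done loop
      clk <= '0';
      wait for CLK_PERIOD / 2;
      clk <= '1';
      wait for CLK_PERIOD / 2;
    end loop;
    wait;
  end process;

  -- Placeholder stimulus: run for 200 clock cycles, then stop the clock.
  stimulus : process
  begin
    wait for 200 * CLK_PERIOD;
    sim_done <= true;
    wait;
  end process;
end architecture;
```

    The design under test would be instantiated in this architecture with its CLK port connected to clk.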

    and everything works as expected in simulation. Is this enough for timing analysis, or do I need to do more?

    Thanks,

    Mat.
  • Altera_Forum

    If you are doing a functional simulation (i.e. just testing your code to make sure it works), clock speeds are mostly unimportant. I will generally just use a 100 MHz clock regardless of my final clock, because it's easier to work out how many clock cycles have occurred between two points when I put two cursors up.

    BUT: if you have more than one clock in the system, it is very important to get the ratio of the clocks as close to the real ratio as possible, to ensure data rates are correct and FIFOs etc. don't overfill.

    If you are doing a gate-level simulation, then yes, you need to use the real clock speeds, as this should point out any timing problems. But usually most problems are picked up at the functional stage, after which you move on to synthesis and timing analysis.
  • Altera_Forum

    So what would my next best step be?

    Should I try gate-level simulation, or should I move on to synthesis and timing analysis with my new design? I have never done either gate-level simulation or timing analysis before, so a pointer in the right direction would also be appreciated :)

    Thanks,

    Mat.
  • Altera_Forum

    I honestly have never done a post-P&R simulation. With good design practice, a good testbench and good timing analysis specs, you shouldn't need to do one with a fully synchronous design.

    The gate-level sim is only really needed when you need to test external interfaces or where you have asynchronous logic. A fully synchronous design shouldn't normally need one.
  • Altera_Forum

    Thanks for all the help. I haven't been able to get anything out of the board yet, but I'm meeting with a lecturer who knows VHDL, so hopefully he'll be able to help, as he can take a look at the actual system.

    Thanks again to everyone for helping me out.

    Mat.