Forum Discussion

Altera_Forum
14 years ago

Verilog - how to write parameterized RAM that infers a RAM block?

Does anyone know a way to write a RAM in Verilog with parameterized width and byte enables that causes Quartus to infer a RAM block?

Since Altera seems to be strongly discouraging direct use of altsyncram by removing all documentation, I am trying to find a way to directly code RAM. I do not need special features like initialization from a file, but I do need basic parameterization.

I have written a simple test which instantiates two RAM blocks. The first is inferred correctly but is not parameterized. The second is properly parameterized but is not inferred. For the second, Quartus outputs the following:

Info (276007): RAM logic "ram2:m2|ram" is uninferred due to asynchronous read logic

Any solutions or ideas to try would be greatly appreciated.

** edit for clarification

Quartus does infer an altsyncram megafunction from the second RAM instantiation, but uses lcell registers instead of a RAM block.

** /edit


// top level, just instantiate two test cases
module ram_test#(
    parameter data_width = 32,
    parameter addr_width = 6,
    parameter bena_width = data_width / 8 
) (
    input clk,
    input [addr_width-1:0] addr,
    input wena1,
    input wena2,
    input [data_width-1:0] wdata,
    input [bena_width-1:0] bena,
    output [data_width-1:0] q1,
    output [data_width-1:0] q2
);
    ram1# (.data_width(data_width), .addr_width(addr_width)
    ) m1 (.clk(clk), .addr(addr), .wena(wena1), .wdata(wdata), .bena(bena),
            .q(q1)
    );
    ram2# (.data_width(data_width), .addr_width(addr_width)
    ) m2 (.clk(clk), .addr(addr), .wena(wena2), .wdata(wdata), .bena(bena),
            .q(q2)
    );
endmodule
//---------------------------------------------------------------------------
// Hard coded width, infers correctly
//---------------------------------------------------------------------------
module ram1# (
    parameter data_width = 32,  // works only for 32 bit
    parameter addr_width = 10,
    parameter bena_width = data_width / 8 
) (
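    input clk,
    input [addr_width-1:0] addr,
    input wena,
    input [data_width-1:0] wdata,
    input [bena_width-1:0] bena,
    output reg [data_width-1:0] q
);
    localparam numwords = 2**addr_width;
    // each word is stored as bena_width packed bytes
    reg [bena_width-1:0][7:0] ram [0:numwords-1];
    always_ff@(posedge clk) begin
        if(wena) begin
            // one hard-coded line per byte lane: this is what ties the body to 32 bits
            if(bena[0]) ram[addr][0] <= wdata[7:0];
            if(bena[1]) ram[addr][1] <= wdata[15:8];
            if(bena[2]) ram[addr][2] <= wdata[23:16];
            if(bena[3]) ram[addr][3] <= wdata[31:24];
        end
        q <= ram[addr];
    end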
endmodule
//---------------------------------------------------------------------------
// Parameterized width, does not infer a RAM block
//---------------------------------------------------------------------------
module ram2# (
    parameter data_width = 32,  // can be any multiple of 8
    parameter addr_width = 10,
    parameter bena_width = data_width / 8
) (
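    input clk,
    input [addr_width-1:0] addr,
    input wena,
    input [data_width-1:0] wdata,
    input [bena_width-1:0] bena,
    output reg [data_width-1:0] q
);
    localparam numwords = 2**addr_width;
    reg [data_width-1:0] ram [0:numwords-1];
    // create full write enable bit mask
    wire [data_width-1:0] wmask;
    genvar bytelane;
    generate
        for(bytelane=0; bytelane < bena_width; bytelane++) begin : lpbl
            assign wmask[bytelane*8 +: 8] = wena ? {8{bena[bytelane]}} : 8'b0;
        end
    endgenerate
    // RAM
    always_ff@(posedge clk) begin
        // read-modify-write: ram[addr] is read combinationally on the right-hand side
        ram[addr] <= (wmask & wdata) | (~wmask & ram[addr]);
        q <= ram[addr];
    end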
endmodule

20 Replies

  • Altera_Forum

    Maybe I'm missing something obvious here, and I have no experience in Verilog, but what is preventing you from putting ifs in the for loop instead of this combinatorial assignment?

    As for the size when using separate 8-bit blocks, the only limitation I see is that Quartus will use a minimum of bena_width blocks to infer the memory. Yes it can be a problem if you are using large memory blocks and only need a small buffer.
  • Altera_Forum

    --- Quote Start ---

    Can you please give an example in which regard the recommended hdl style uses more RAM than required according to the hardware properties?

    --- Quote End ---

    The recommended HDL style does not use more RAM than required. The suggestion by Daixiwen to make Quartus infer a separate 8-bit wide RAM block for each byte enable uses more RAM than required when the RAM size needed is less than the (block size) * (number of byte enables).
  • Altera_Forum

    --- Quote Start ---

    Maybe I'm missing something obvious here, and I have no experience in Verilog, but what is preventing you from putting ifs in the for loop instead of this combinatorial assignment?

    As for the size when using separate 8-bit blocks, the only limitation I see is that Quartus will use a minimum of bena_width blocks to infer the memory. Yes it can be a problem if you are using large memory blocks and only need a small buffer.

    --- Quote End ---

    I think the for loop cannot be inside of the always_ff block. I will try it tomorrow and report back. If it cannot, then the only way to get the ifs in the for loop is to put the always_ff block in the for loop. This results in your original suggestion to create a separate memory for each byte enable.
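
    For reference, SystemVerilog does allow a procedural for loop inside an always_ff block, so the per-byte ifs can live in a single process. A minimal sketch follows. The module name ram3 is hypothetical and the packed-byte array layout is assumed to match ram1. Whether Quartus maps this to a single RAM block rather than one block per byte lane is not verified here and may depend on the tool version.

    ```systemverilog
    // Hypothetical module: byte-enable RAM with the ifs in a procedural for loop.
    module ram3 #(
        parameter data_width = 32,  // assumed to be a multiple of 8
        parameter addr_width = 10,
        parameter bena_width = data_width / 8
    ) (
        input                       clk,
        input      [addr_width-1:0] addr,
        input                       wena,
        input      [data_width-1:0] wdata,
        input      [bena_width-1:0] bena,
        output reg [data_width-1:0] q
    );
        localparam numwords = 2**addr_width;
        // One word = bena_width packed bytes (layout assumed, as in ram1)
        reg [bena_width-1:0][7:0] ram [0:numwords-1];

        always_ff @(posedge clk) begin
            if (wena) begin
                // Procedural loop with a constant bound: one if per byte lane
                for (int i = 0; i < bena_width; i++) begin
                    if (bena[i]) ram[addr][i] <= wdata[i*8 +: 8];
                end
            end
            // Registered read; on a same-cycle write this returns the old word
            q <= ram[addr];
        end
    endmodule
    ```

    The loop bound is a parameter, so the body does not need editing when data_width changes. The resource report still has to be checked to see how many blocks are actually used.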

    The issue of RAM block waste makes the separate-block implementation suitable only "sometimes". For example, if I need 1 kB on a 64-bit bus and my device has M10K blocks, it uses 8 RAM blocks when it should use 1. The reason I want a RAM implementation with working parameters is so that I can write a generic module for a library that uses RAM as one of many components, without having to use different code depending on the parameter values.

    I think the best workaround for now may be to write a set of specific width RAM modules and use generate/if to select the module to use based on the width requested.
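
    A rough sketch of that workaround, assuming only 16-bit and 32-bit widths need to be covered. The module name ram_sel and the generate block labels are hypothetical, and each branch simply repeats the hard-coded style of ram1 for its own width.

    ```systemverilog
    // Hypothetical wrapper: picks a hard-coded body according to bena_width.
    module ram_sel #(
        parameter data_width = 32,  // only 16 and 32 are handled in this sketch
        parameter addr_width = 10,
        parameter bena_width = data_width / 8
    ) (
        input                       clk,
        input      [addr_width-1:0] addr,
        input                       wena,
        input      [data_width-1:0] wdata,
        input      [bena_width-1:0] bena,
        output reg [data_width-1:0] q
    );
        localparam numwords = 2**addr_width;
        reg [bena_width-1:0][7:0] ram [0:numwords-1];

        generate
            if (bena_width == 2) begin : w16
                always_ff @(posedge clk) begin
                    if (wena) begin
                        if (bena[0]) ram[addr][0] <= wdata[7:0];
                        if (bena[1]) ram[addr][1] <= wdata[15:8];
                    end
                    q <= ram[addr];
                end
            end else if (bena_width == 4) begin : w32
                always_ff @(posedge clk) begin
                    if (wena) begin
                        if (bena[0]) ram[addr][0] <= wdata[7:0];
                        if (bena[1]) ram[addr][1] <= wdata[15:8];
                        if (bena[2]) ram[addr][2] <= wdata[23:16];
                        if (bena[3]) ram[addr][3] <= wdata[31:24];
                    end
                    q <= ram[addr];
                end
            end else begin : unsupported
                // Simulation-time check only; a synthesis tool may ignore it
                initial $fatal(1, "ram_sel: no hard-coded body for this data_width");
            end
        endgenerate
    endmodule
    ```

    Every additional width needs another branch, which is the scaling problem with this approach.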

    P.S. Judging from the templates, I think VHDL has the same issue. The Altera VHDL templates for RAM with byte enables also require the body to be changed if the data width is changed.
  • Altera_Forum

    --- Quote Start ---

    The recommended HDL style does not use more RAM than required. The suggestion by Daixiwen to make Quartus infer a separate 8-bit wide RAM block for each byte enable uses more RAM than required when the RAM size needed is less than the (block size) * (number of byte enables).

    --- Quote End ---

    Daixiwen mentioned the byte enable example in the "Recommended HDL Style" document. You say that the recommended HDL style does not use more RAM than required. So what's the problem?

    Did you try the method suggested in the document?

    P.S.:

    --- Quote Start ---

    It "works", but the parameterization is broken in that the module body must be changed to support a different data width.

    --- Quote End ---

    What do you mean by module body? The memory module or the module that instantiates it? Accessing the RAM in bytes as required by the recommended coding style involves a change in the memory module. But it can be completely hidden inside the module.
  • Altera_Forum

    --- Quote Start ---

    Judging from the templates, I think VHDL has the same issue. The Altera VHDL templates for RAM with byte enables also require the body to be changed if the data width is changed.

    --- Quote End ---

    No, I'm pretty sure that in VHDL I could make a generic memory block with byte enables. Something like this:
    process(clk)
    begin
      if(rising_edge(clk)) then
        if(we = '1') then
          for block_num in 0 to bena_width-1 loop
            if(be(block_num) = '1') then
              ram(waddr)(block_num) <= wdata((block_num*8)+7 downto (block_num*8));
            end if;
          end loop;
        end if;
        q_local <= ram(raddr);
      end if;
    end process;
    I haven't tested it, but it should be recognized properly by the synthesizer.
  • Altera_Forum

    --- Quote Start ---

    Daixiwen mentioned the byte enable example in the "Recommended HDL Style" document. You say that the recommended HDL style does not use more RAM than required. So what's the problem?

    Did you try the method suggested in the document?

    --- Quote End ---

    Yes, the "ram1" module in the code in my original post is the method suggested in the document. The problem is that the "data_width" parameter doesn't work.

    --- Quote Start ---

    What do you mean by module body? The memory module or the module that instantiates it? Accessing the RAM in bytes as required by the recommended coding style involves a change in the memory module. But it can be completely hidden inside the module.

    --- Quote End ---

    I mean the memory module, and yes, it can be hidden inside the module with a set of hard-coded memories selected by generate blocks. That seems to be what I need to do. I usually try to avoid that approach because it does not scale well if more than one parameter needs to be handled this way, but in this case it seems to be necessary.
  • Altera_Forum

    --- Quote Start ---

    No, I'm pretty sure that in VHDL I could make a generic memory block with byte enables. Something like this:

    I haven't tested it, but it should be recognized properly by the synthesizer.

    --- Quote End ---

    Interesting. I don't know VHDL, but that looks like it would do what I need. Because of differences in language structure, the analogous construct in SystemVerilog causes the synthesizer to generate a RAM block per byte lane.
  • Altera_Forum

    --- Quote Start ---

    Interesting. I don't know VHDL, but that looks like it would do what I need. Because of differences in language structure, the analogous construct in SystemVerilog causes the synthesizer to generate a RAM block per byte lane.

    --- Quote End ---

    That really sounds like a bug. I didn't expect it. Thanks for clarifying.
  • Altera_Forum

    How do I write my outputs to a file in Verilog? Can you please explain with an example? Also, how do I read from a file so that my module responds to the data?

  • Altera_Forum

    Use something like this instead.

    Beware: I'm not sure I got the index order correct.

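        // assumes ram is declared as: reg [bena_width-1:0][7:0] ram [0:numwords-1];
        generate
            for(bytelane=0; bytelane < bena_width; bytelane++) begin : lpbl
                // one process per byte lane
                always_ff @ (posedge clk) begin
                    if (wena && bena[bytelane])
                        ram[addr][bytelane] <= wdata[bytelane*8 +: 8];
                    q[bytelane*8 +: 8] <= ram[addr][bytelane];
                end
            end
        endgenerate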