Forum Discussion

Altera_Forum
13 years ago

Faster: reg mem[] or M9K?

I'm not meeting timing on my design. It takes over 8 ns to pull data out of a 'reg'-based, 128-deep mem[] (see attached TimeQuest report). I'm considering ways to speed things up, and using M9K blocks came to mind. I tried the 'ramstyle' attribute, but QII does not accept it unless my mem[] accesses conform to Altera's coding style.

Will use of M9K RAM blocks significantly speed up the design compared to storing data in regular registers (i.e. 'reg')?

5 Replies

  • Altera_Forum

    Looks like there is a long combinatorial path. How about increasing the pipelining?

  • Altera_Forum

    --- Quote Start ---

    Will use of M9K RAM blocks significantly speed up the design compared to storing data in regular registers (i.e. 'reg')?

    --- Quote End ---

    The huge mux chain involved with a register-based memory array will considerably slow down data access. Block RAM is faster in this situation.

    RAM inference coding "style" primarily reflects hardware requirements. The first thing to find out is whether the intended design topology can be implemented with block RAM, or can be modified accordingly.
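
    To make the "hardware requirements" point concrete, here is a minimal sketch of a memory description that is shaped like a block RAM. The module name, port names and the 128 x 32 organisation are assumptions for illustration (the depth and width are guessed from the original question), not the poster's actual code. The comments list restrictions that commonly prevent inference; whether a particular design maps to M9K should be confirmed in the synthesis report.

    ```verilog
    // Hypothetical example: names and sizes are illustrative assumptions.
    module mem_128x32_sync
    (
    	input             clk,
    	input             we,
    	input      [6:0]  wr_addr,
    	input      [6:0]  rd_addr,
    	input      [31:0] wr_data,
    	output reg [31:0] rd_data
    );
    	// 128 words x 32 bits. The ramstyle attribute is only a request;
    	// it is honoured when the accesses below look like a block RAM.
    	(* ramstyle = "M9K" *) reg [31:0] mem [0:127];

    	// Things that typically keep a memory in registers instead:
    	//  - a reset that clears every element of mem
    	//  - more than one write and one read location per clock
    	//  - a combinational (unclocked) read such as: assign q = mem[addr];
    	always @ (posedge clk)
    	begin
    		if (we)
    			mem[wr_addr] <= wr_data;
    		// Clocked read: data is valid one clock after rd_addr is applied.
    		// A read of the address being written returns the OLD data.
    		rd_data <= mem[rd_addr];
    	end
    endmodule
    ```

    With a single clocked write and a single clocked read, the read mux tree of the register-based version disappears and address decoding happens inside the RAM block.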
  • Altera_Forum

    --- Quote Start ---

    The huge mux chain involved with a register based memory array will considerably slow down the data access. Block RAM is faster in this situation.

    --- Quote End ---

    Yes, this is exactly what I suspected, but I needed to have it confirmed before putting in the work to convert my reg-based mem array design to use block RAM.

    Can you tell me what I need to do to make my Verilog "compliant" with the QII requirements so that I can convert my mem[]-based design to use M9K storage? Would using an HDL template like the one shown below make it use M9K blocks?

    // Quartus II Verilog Template
    // Simple Dual Port RAM with separate read/write addresses and
    // single read/write clock
    module simple_dual_port_ram_single_clock
    # (parameter DATA_WIDTH=8, parameter ADDR_WIDTH=6)
    (
    	input [(DATA_WIDTH-1):0] data,
    	input [(ADDR_WIDTH-1):0] read_addr, write_addr,
    	input we, clk,
    	output reg [(DATA_WIDTH-1):0] q
    );
    	// Declare the RAM variable
    	reg [DATA_WIDTH-1:0] ram[2**ADDR_WIDTH-1:0];
    	always @ (posedge clk)
    	begin
    		// Write
    		if (we)
    			ram[write_addr] <= data;
    		// Read (if read_addr == write_addr, return OLD data).	To return
    		// NEW data, use = (blocking write) rather than <= (non-blocking write)
    		// in the write assignment.	 NOTE: NEW data may require extra bypass
    		// logic around the RAM.
    		q <= ram[read_addr];
    	end
    endmodule

    Tricky, I can't pipeline my design further because the failing path shown by the TimeQuest screenshot is actually just to get the data out of the array and into the target register (i.e. reg [31:0] reg_rx_tlp_bus <= mem[index]; )

    EDIT: I'm looking into using a MegaWizard-generated "Simple Dual-port RAM" in my design. It should work fine, since my memory actually needs to write 64 bits but read 32 bits. I will have to do a fair bit of redesign and re-testing, but it should be worth the work if it results in significantly better performance. Previously, I had to pipeline my design just to get the data out of my array-based memory in one cycle. I'm hoping to gain one or two clocks of performance per read in the new design.
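
    For comparison with the MegaWizard route, the 64-bit write / 32-bit read arrangement can also be sketched in plain Verilog by storing 64-bit words and selecting one half after the clocked read. This is only a rough sketch under assumptions: the module and signal names are hypothetical, the depth is guessed, and the half ordering (rd_addr[0] = 0 selects bits 31:0) is an arbitrary choice. It is not the generated altsyncram, which handles the different port widths inside the RAM block itself.

    ```verilog
    // Hypothetical sketch: 64-bit write port, 32-bit read port, one clock.
    // Names, depth and half ordering are assumptions for illustration.
    module ram_w64_r32
    # (parameter WR_ADDR_WIDTH = 6)          // 64 x 64 bits = 128 x 32 bits
    (
    	input                      clk,
    	input                      we,
    	input  [WR_ADDR_WIDTH-1:0] wr_addr,  // addresses 64-bit words
    	input  [63:0]              wr_data,
    	input  [WR_ADDR_WIDTH:0]   rd_addr,  // addresses 32-bit words
    	output [31:0]              rd_data
    );
    	reg [63:0] ram [0:(1<<WR_ADDR_WIDTH)-1];
    	reg [63:0] rd_word;
    	reg        rd_sel;

    	always @ (posedge clk)
    	begin
    		if (we)
    			ram[wr_addr] <= wr_data;
    		// Clocked read of the whole 64-bit word (OLD data on a collision)
    		rd_word <= ram[rd_addr[WR_ADDR_WIDTH:1]];
    		// Remember which 32-bit half was requested
    		rd_sel  <= rd_addr[0];
    	end

    	// 2:1 select after the RAM: low address bit picks the half
    	assign rd_data = rd_sel ? rd_word[63:32] : rd_word[31:0];
    endmodule
    ```

    The trade-off is a small 2:1 mux on the read data after the RAM, which is far shallower than the 128:1 selection needed by a register array.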
  • Altera_Forum

    Yes, the Quartus templates are a good starting point for writing block-RAM-compatible code.

  • Altera_Forum

    FYI: I ended up using a MegaWizard-generated 2-port altsyncram module backed by M9K blocks. This works very well.

    Thanks.