Forum Discussion

Geert
New Contributor
5 years ago

Inference of DSP block with accumulator does not work

I'm trying to infer a DSP block with accumulator for Arria 10, using Quartus Prime 17.0.

The high-level functionality I need is:

if rising_edge(clk) then
  if sload = '1' then
    out <= a * b;
  else
    out <= out + a * b;
  end if;
end if;

I started from the template provided in Quartus: VHDL/Full Designs/Arithmetic/Signed Multiply-Accumulate, but this does not work: it uses a DSP block for the multiplier, but it does not use the accumulator function.

Instead, for small word sizes, it creates a loopback path through the second multiplier's inputs to bring the output back to the adder.

When I increase the accumulator width to 48, the accumulator is implemented entirely in LUTs.

Any ideas on how to force use of the DSP block accumulator (preferably using inference)?

Thanks, Geert
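For reference, the high-level functionality in the question could be written as a complete entity along the following lines. This is only a sketch: the entity name `mac_direct`, the port names, and the generic defaults are placeholders that do not come from the thread, and the accumulator is called `acc` because `out` is a reserved word in VHDL. It has not been run through Quartus, so whether the accumulator ends up in the DSP block still has to be checked in the fitter report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity mac_direct is
  generic (
    A_WIDTH   : natural := 16;  -- placeholder width, not taken from the thread
    B_WIDTH   : natural := 16;  -- placeholder width, not taken from the thread
    ACC_WIDTH : natural := 40   -- placeholder; must be >= A_WIDTH + B_WIDTH
  );
  port (
    clk   : in  std_logic;
    sload : in  std_logic;
    a     : in  signed(A_WIDTH - 1 downto 0);
    b     : in  signed(B_WIDTH - 1 downto 0);
    acc   : out signed(ACC_WIDTH - 1 downto 0)
  );
end entity mac_direct;

architecture rtl of mac_direct is
  -- Internal copy of the accumulator, because an "out" port cannot be
  -- read back inside the architecture in VHDL-93.
  signal acc_reg : signed(ACC_WIDTH - 1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if sload = '1' then
        -- Load: restart the accumulation with the current product
        acc_reg <= resize(a * b, ACC_WIDTH);
      else
        -- Accumulate: add the current product to the running sum
        acc_reg <= acc_reg + resize(a * b, ACC_WIDTH);
      end if;
    end if;
  end process;

  acc <= acc_reg;
end architecture rtl;
```

The product `a * b` is `A_WIDTH + B_WIDTH` bits wide, so `resize` sign-extends it to the accumulator width before the addition. The widths are generics so that different input and accumulator sizes can be tried without touching the architecture.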

4 Replies

  • SengKok_L_Intel
    Regular Contributor

    Hi Geert,


    If you are using an independent multiplier, could you please increase the input data width to more than 19 bits? With inputs narrower than 18 bits, the design will not fit into the hard accumulator.


    Regards -SK Lim


    • Geert
      New Contributor

      Hi,

      Thanks for your answer.

      I have tried several different input sizes. Indeed, with two 16-bit multiplier inputs the inference fails (the accumulator is implemented in LUTs), while with one 16-bit and one 24-bit input I get the expected implementation (hard accumulator).

      Could you explain what the exact criterion is? Does the multiplier result need to have a minimum width, or is it sufficient that one of the multiplier inputs is wider than 18 bits?

      regards,

      Geert

  • SengKok_L_Intel
    Regular Contributor

    Hi,


    I used the template below with the width changed to 27, and the accumulator fits into the hard block. Alternatively, you could try the ALTERA_MULT_ADD IP core to configure the accumulator in the mode that you need.

    Please refer to Table 25 for the accumulator function:

    https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/hb/arria-10/a10_memory.pdf



    // Quartus Prime Verilog Template
    // Unsigned multiply-accumulate

    module unsigned_multiply_accumulate
    #(parameter WIDTH=27)
    (
        input clk, aclr, clken, sload,
        input [WIDTH-1:0] dataa,
        input [WIDTH-1:0] datab,
        output reg [2*WIDTH-1:0] adder_out
    );

        // Declare registers and wires
        reg [WIDTH-1:0] dataa_reg, datab_reg;
        reg sload_reg;
        reg [2*WIDTH-1:0] old_result;
        wire [2*WIDTH-1:0] multa;

        // Store the results of the operations on the current data
        assign multa = dataa_reg * datab_reg;

        // Store the value of the accumulation (or clear it)
        always @ (adder_out, sload_reg)
        begin
            if (sload_reg)
                old_result <= 0;
            else
                old_result <= adder_out;
        end

        // Clear or update data, as appropriate
        always @ (posedge clk or posedge aclr)
        begin
            if (aclr)
            begin
                dataa_reg <= 0;
                datab_reg <= 0;
                sload_reg <= 0;
                adder_out <= 0;
            end
            else if (clken)
            begin
                dataa_reg <= dataa;
                datab_reg <= datab;
                sload_reg <= sload;
                adder_out <= old_result + multa;
            end
        end

    endmodule
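Since the original question was asked in VHDL, the structure of the Verilog template above (registered operands, a registered `sload`, and a feedback value that is zeroed on load) might translate roughly as follows. This is a hand-written sketch, not an official Quartus template: the entity name `unsigned_mac_vhdl` and the internal signal `sum_reg` are made up for illustration, and the 16-bit and 24-bit defaults simply mirror the width pair reported to work earlier in the thread for a signed design. It has not been compiled in Quartus, so the DSP block mapping should be confirmed in the fitter report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity unsigned_mac_vhdl is
  generic (
    A_WIDTH : natural := 16;  -- assumption: width pair mentioned in the thread
    B_WIDTH : natural := 24
  );
  port (
    clk, aclr, clken, sload : in std_logic;
    dataa     : in  unsigned(A_WIDTH - 1 downto 0);
    datab     : in  unsigned(B_WIDTH - 1 downto 0);
    adder_out : out unsigned(A_WIDTH + B_WIDTH - 1 downto 0)
  );
end entity unsigned_mac_vhdl;

architecture rtl of unsigned_mac_vhdl is
  signal dataa_reg  : unsigned(A_WIDTH - 1 downto 0);
  signal datab_reg  : unsigned(B_WIDTH - 1 downto 0);
  signal sload_reg  : std_logic;
  signal multa      : unsigned(A_WIDTH + B_WIDTH - 1 downto 0);
  signal old_result : unsigned(A_WIDTH + B_WIDTH - 1 downto 0);
  signal sum_reg    : unsigned(A_WIDTH + B_WIDTH - 1 downto 0);
begin
  -- Product of the registered operands
  multa <= dataa_reg * datab_reg;

  -- Feedback value: zero when a load was requested, else the running sum
  old_result <= (others => '0') when sload_reg = '1' else sum_reg;

  -- Clear or update the registers, as appropriate
  process (clk, aclr)
  begin
    if aclr = '1' then
      dataa_reg <= (others => '0');
      datab_reg <= (others => '0');
      sload_reg <= '0';
      sum_reg   <= (others => '0');
    elsif rising_edge(clk) then
      if clken = '1' then
        dataa_reg <= dataa;
        datab_reg <= datab;
        sload_reg <= sload;
        sum_reg   <= old_result + multa;
      end if;
    end if;
  end process;

  adder_out <= sum_reg;
end architecture rtl;
```

Because `sload` is registered on the same clock edge as the operands, asserting it together with a sample restarts the accumulation with that sample's product, and the result appears on `adder_out` two clock edges after the inputs were presented.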


  • SengKok_L_Intel
    Regular Contributor

    If further support is needed in this thread, please post a response within 15 days. After 15 days, this thread will be transitioned to community support. The community users will be able to help you with your follow-up questions.