Hi everyone, I have a question about multipliers. I am new to FPGAs, and I wonder what the difference is between a multiplier constructed with * in HDL and one generated by an IP core. What is the most important difference between these two kinds of multipliers? Speed, or something else? Thanks.

What is the difference between * and an IP core multiplier?

12 Replies

Altera_Forum
Honored Contributor
14 years ago
I assume by IP core you mean "lpm_mult"?

Using lpm_mult gives you more control over the multiplier unit being targeted. When you type '*' into your code, you are at the mercy of whatever the synthesis engine picks for you. So '*' is the better approach for portable code, but when you are tuning your design for performance or area you might need to resort to lpm_mult. I recommend starting with '*' and, if that doesn't give you the results you are looking for, replacing it with an instantiation of lpm_mult.
Altera_Forum
Honored Contributor
14 years ago
Thanks for your suggestion.
You mean that I can use * for the initial design, and if the synthesis result can't meet the target, I can then use lpm_mult instead in a second pass to improve the design?
Altera_Forum
Honored Contributor
14 years ago
An example of when to switch to lpm_mult would be if you find that features of the DSP/embedded multiplier block you were counting on are not being used when the multiplier is inferred. Sometimes the synthesis engine cannot map all the features of the hardware multipliers, so using lpm_mult gives you the ability to do this.

In the Quartus II handbook there is a chapter called something like "HDL coding guidelines". It probably does a better job explaining this under the section about multiplication.

I try to avoid using the LPMs whenever possible since I often create IP for different FPGA families and the hard block characteristics sometimes differ. So options of lpm_mult may vary between families which makes your implementation less portable as a result (you may not care about portability though).
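
One way to keep both options open is to hide the multiplier behind a single module boundary and choose the implementation with a parameter. The following is only a sketch under stated assumptions: the module name mult_wrap and the parameters WIDTH and USE_LPM are made up for illustration, and the lpm_mult parameter values are just one plausible setting for a signed multiply with two pipeline stages. Check the lpm_mult documentation for your device family before relying on it.

```verilog
// Hypothetical wrapper: one interface, two interchangeable implementations.
// USE_LPM = 0 infers the multiplier from "*"; USE_LPM = 1 instantiates lpm_mult.
module mult_wrap #(
    parameter WIDTH   = 32,
    parameter USE_LPM = 0
) (
    input                       clk,
    input  signed [WIDTH-1:0]   a,
    input  signed [WIDTH-1:0]   b,
    output signed [2*WIDTH-1:0] p
);
    generate
        if (USE_LPM) begin : g_lpm
            // Altera LPM multiplier, two internal pipeline stages (assumed setting)
            lpm_mult #(
                .lpm_widtha         (WIDTH),
                .lpm_widthb         (WIDTH),
                .lpm_widthp         (2*WIDTH),
                .lpm_widths         (1),         // sum port is not used
                .lpm_representation ("SIGNED"),
                .lpm_pipeline       (2)
            ) u_mult (
                .clock  (clk),
                .dataa  (a),
                .datab  (b),
                .result (p)
            );
        end else begin : g_infer
            // Portable version: input registers, "*", then an output register
            reg signed [WIDTH-1:0]   a_d1, b_d1;
            reg signed [2*WIDTH-1:0] p_d1;
            always @(posedge clk) begin
                a_d1 <= a;
                b_d1 <= b;
                p_d1 <= a_d1 * b_d1;
            end
            assign p = p_d1;
        end
    endgenerate
endmodule
```

Both branches are intended to have two clock cycles of latency, so the rest of the design should not need to change when USE_LPM is flipped. Only the g_lpm branch is vendor-specific, which keeps the portability cost in one place.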

Altera_Forum
Honored Contributor
14 years ago
This post is timely, because I have been having an issue related to it. For larger multipliers, lpm_mult creates logic that is much faster. In my case of a signed 32x32 multiply, lpm_mult is double the speed of using "*" in Verilog. For reference, here is my code:


module mult_test(
	input                    CLK,
	input  signed [31:0]     IN_A,
	input  signed [31:0]     IN_B,
	output reg signed [63:0] OUT_C
);

	// Verilog version: registered inputs, combinational multiply, registered output
	reg  signed [31:0] IN_A_d1;
	reg  signed [31:0] IN_B_d1;
	wire signed [63:0] mult_result;

	assign mult_result = IN_A_d1 * IN_B_d1;

	always @(posedge CLK) begin
		IN_A_d1 <= IN_A;
		IN_B_d1 <= IN_B;
		OUT_C <= mult_result;
	end

	/*
	// Altera LPM Megafunction version
	// Created as signed 32x32 -> 64-bit multiply with 2 cycles of latency
	// (to use this version, remove the Verilog version above and declare
	// OUT_C as a plain "output signed [63:0]" instead of "output reg")
	wire signed [63:0] mult_result;
	assign OUT_C = mult_result;

	megafunction_mult	megafunction_mult_inst (
		.clock (CLK),
		.dataa (IN_A),
		.datab (IN_B),
		.result (mult_result)
	);*/
endmodule

I get 90 MHz fmax with the SystemVerilog version, and 179 MHz with the lpm_mult version. I would rather use "*" for code portability, but the 50% speed cut is unbearable in my application.

Altera_Forum
Honored Contributor
14 years ago
I think it has to do with pipelining. In the inferred case you cannot pipeline inside the multiplier as you did with lpm_mult, assuming a dedicated multiplier was generated in both cases.
Altera_Forum
Honored Contributor
14 years ago
I should mention that I copied the code format from "Example 10–2. Verilog HDL Signed Multiplier with Input and Output Registers (Pipelining = 2)" in the Quartus II handbook.
Altera_Forum
Honored Contributor
14 years ago
To improve speed further, you had better register the inputs and outputs of the multiplier block and also insert registers between the block and the fabric.
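
As a rough illustration of that suggestion, the inferred multiplier can be followed by extra register stages that the tools are free to place between the multiplier block and the surrounding fabric. This is a sketch only: the module name mult_pipe and the STAGES parameter are hypothetical, and whether the synthesis engine actually moves these registers into or next to the DSP block depends on the tool settings and the device family, so it may not close the gap to lpm_mult.

```verilog
// Hypothetical example: inferred signed multiply with additional pipeline
// registers after the product. Total latency is 2 + STAGES clock cycles.
module mult_pipe #(
    parameter WIDTH  = 32,
    parameter STAGES = 2    // extra registers after the first product register
) (
    input                       clk,
    input  signed [WIDTH-1:0]   a,
    input  signed [WIDTH-1:0]   b,
    output signed [2*WIDTH-1:0] p
);
    reg signed [WIDTH-1:0]   a_d1, b_d1;        // input registers
    reg signed [2*WIDTH-1:0] pipe [0:STAGES];   // product register chain
    integer i;

    always @(posedge clk) begin
        a_d1    <= a;
        b_d1    <= b;
        pipe[0] <= a_d1 * b_d1;                 // registered product
        for (i = 1; i <= STAGES; i = i + 1)
            pipe[i] <= pipe[i-1];               // extra stages for the tools to place
    end

    assign p = pipe[STAGES];
endmodule
```

The extra latency has to be accounted for in the surrounding design, which is the same trade-off made when lpm_mult is configured with more pipeline stages.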
Altera_Forum
Honored Contributor
14 years ago
Code portability for a slow, problematic design is never a good idea and defeats its purpose... common sense?
Altera_Forum
Honored Contributor
14 years ago
Try writing it the same way as shown in the multiplier template under the Edit menu. You can find it here under the templates: Verilog HDL --> Full Designs --> Arithmetic --> Multipliers --> Signed Multiply with Input and Output Registers.
Altera_Forum
Honored Contributor
14 years ago
--- Quote Start ---
Try writing it the same way as shown in the multiplier template under the Edit menu. You can find it here under the templates: Verilog HDL --> Full Designs --> Arithmetic --> Multipliers --> Signed Multiply with Input and Output Registers.
--- Quote End ---

I just tried that, and it gave me the same result as my own hand-written Verilog.

So, the current fmax summary:
lpm_mult with two cycle latency = 180 MHz
Verilog * operation with input and output registers = 90 MHz
Quartus II Verilog signed multiply with I/O registers template = 90 MHz

My .sdc file is set up to target 200 MHz in every case.
