
pstvrv (New Contributor)
12 days ago

tennm_mac in m18x18_sumof2 mode on Agilex 7 - spurious systolic register inserted in Quartus 25.3

Device: Intel Agilex 7
Quartus version: Quartus Prime Version 25.3.0 Build 109 09/24/2025 SC Pro Edition
Atom: tennm_mac
Mode: m18x18_sumof2 with pre-adder (operand_source_may = "preadder")

Description

I am directly instantiating tennm_mac atoms in m18x18_sumof2 mode for high-throughput multiplication. The design uses the pre-adder on both multipliers (az/bz ports) and has all four pipeline stages enabled (ax_clken, input_pipeline_clken, second_pipeline_clken, and output_clken are all set to "0").

I observe an incorrect result at the multiplier output. The Resource Property Viewer confirms that systolic registers have been placed on the A* path inside the DSP block, even though:

  1. input_systolic_clken is explicitly set to "no_reg"
  2. Systolic mode is not selected — the operation mode is m18x18_sumof2
  3. The systolic register is architecturally present only on the A* datapath; the B* datapath has no equivalent register, so the skew is asymmetric by design.

The result is an incorrect sum-of-products output on hardware: the coefficient and data samples presented to the multiplier are misaligned by one clock cycle.

Minimal defparam set that reproduces the issue:

tennm_mac u_mac (
    .ax(ax), .ay(ay), .az(az),
    .bx(bx), .by(by), .bz(bz),
    .clk(clk), .ena(3'b111), .clr(2'b00),
    .resulta(result)
);

defparam
    u_mac.operation_mode        = "m18x18_sumof2",
    u_mac.operand_source_may    = "preadder",
    u_mac.operand_source_mby    = "preadder",
    u_mac.ax_clken              = "0",
    u_mac.bx_clken              = "0",
    u_mac.ay_scan_in_clken      = "0",
    u_mac.by_clken              = "0",
    u_mac.az_clken              = "0",
    u_mac.bz_clken              = "0",
    u_mac.input_pipeline_clken  = "0",
    u_mac.second_pipeline_clken = "0",
    u_mac.output_clken          = "0",
    u_mac.input_systolic_clken  = "no_reg", // <-- ignored
    u_mac.clear_type            = "none";
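
For anyone who wants to check the one-cycle skew in simulation, here is a minimal sketch of an impulse test. It is not the vendor model: sumof2_ref, A_EXTRA, tb and measure are hypothetical names, and the stand-in assumes the datapath computes ax*(ay+az) + bx*(by+bz) through four register stages. Setting A_EXTRA to 1 adds one more register on the A operands only, which mimics the symptom described above. The testbench pulses the A operands and then the B operands and counts the cycles until the result becomes non-zero.

```verilog
// Behavioral stand-in (an assumption, not the vendor simulation model):
// result = ax*(ay+az) + bx*(by+bz) through four register stages.
// A_EXTRA = 1 adds one more register on the A operands only.
module sumof2_ref #(parameter A_EXTRA = 0) (
    input  wire               clk,
    input  wire signed [17:0] ax, ay, az, bx, by, bz,
    output reg  signed [37:0] result
);
    reg signed [17:0] ax_d, ay_d, az_d;                    // optional extra A stage
    reg signed [17:0] ax_r, ay_r, az_r, bx_r, by_r, bz_r;  // stage 1: input registers
    reg signed [36:0] pa, pb;                              // stage 2: 18x19 products
    reg signed [37:0] sum;                                 // stage 3: sum of products

    wire signed [17:0] ax_i = A_EXTRA ? ax_d : ax;
    wire signed [17:0] ay_i = A_EXTRA ? ay_d : ay;
    wire signed [17:0] az_i = A_EXTRA ? az_d : az;

    always @(posedge clk) begin
        ax_d <= ax;   ay_d <= ay;   az_d <= az;
        ax_r <= ax_i; ay_r <= ay_i; az_r <= az_i;
        bx_r <= bx;   by_r <= by;   bz_r <= bz;
        pa     <= ax_r * (ay_r + az_r);
        pb     <= bx_r * (by_r + bz_r);
        sum    <= pa + pb;
        result <= sum;                                     // stage 4: output register
    end
endmodule

module tb;
    reg clk = 0;
    always #5 clk = ~clk;

    reg signed [17:0] ax = 0, ay = 0, az = 0, bx = 0, by = 0, bz = 0;
    wire signed [37:0] result;
    integer lat_a, lat_b;

    // Swap in a wrapper around the real atom here to test the actual DSP block.
    sumof2_ref #(.A_EXTRA(1)) dut (
        .clk(clk), .ax(ax), .ay(ay), .az(az),
        .bx(bx), .by(by), .bz(bz), .result(result)
    );

    // Pulse one operand pair for a single cycle and count the clock edges
    // until the result becomes non-zero (gives up after 16 cycles).
    task measure(input sel_a, output integer lat);
        begin
            repeat (8) @(negedge clk);          // flush the pipeline with zeros
            if (sel_a) begin ax = 1; ay = 1; end
            else       begin bx = 1; by = 1; end
            @(negedge clk);
            ax = 0; ay = 0; bx = 0; by = 0;
            lat = 1;
            while (result === 38'sd0 && lat < 16) begin
                @(negedge clk);
                lat = lat + 1;
            end
        end
    endtask

    initial begin
        measure(1'b1, lat_a);
        measure(1'b0, lat_b);
        $display("A-path latency = %0d, B-path latency = %0d", lat_a, lat_b);
        if (lat_a != lat_b)
            $display("SKEW: A and B products differ by %0d cycle(s)", lat_a - lat_b);
        $finish;
    end
endmodule
```

With A_EXTRA set to 1 the A-path latency should come out one cycle longer than the B-path latency; with A_EXTRA set to 0 the two should match. Running the same test against the actual tennm_mac instance requires the vendor simulation library, which this sketch does not assume.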

Questions

  1. Is my assumption correct that the systolic registers are being inserted incorrectly?
  2. Is this a known issue in Quartus 25.3 or with Agilex 7 devices? A similar design compiled correctly on Quartus 17.1 for Arria 10.

Any guidance to resolve this issue would be appreciated.

3 Replies

  • CheepinC_altera (Regular Contributor)

    Hi,

    From your description, it appears that you are directly instantiating the atoms in your design. I’d like to check whether you have had the opportunity to try instantiating the DSP primitive IP, i.e. the Native Fixed Point DSP IP for Agilex FPGAs, which is generally the recommended approach, to see if it works?

    Could you also share if there is a particular reason you are using the low‑level atom instead of the DSP primitive IP? Understanding your reasoning will help us better support your use case.

    Thank you.

  • CheepinC_altera (Regular Contributor)

    Hi,

    Thank you for filing this case and sharing the details. I appreciate your patience. Please allow me some time to review the information, and I’ll get back to you as soon as possible.