Forum Discussion

Altera_Forum
Honored Contributor
11 years ago

ALTSHIFT_TAPS and timing issue

I am having some problems going from simulation to synthesis in my design. The simulation works fine while the synthesized and fitted design does not.

I have written a test bench to synthesize a triangle wave on an audio converter. This works fine.

I have written a decimating filter in Verilog that lowers the sample rate by a factor of 8. The filter has 96 taps, with 18-bit coefficients and 24-bit data feeding into the multipliers.

The coefficients are constant and the 24 bit data feeds in from an ALTSHIFT_TAPS IP block.

Because 96 taps would use up a lot of multipliers, I have a bank of 32 multipliers that I reuse.

When the data loads I calculate the first 32 taps, on the next clock I calculate taps 33 to 64, and on the clock after that I calculate taps 65 to 96. I then shift ALTSHIFT_TAPS three times to set up the data for the next load, while shifting in the current data.

I have created a monitor to detect the rising and falling edges of the ramp. On the output of the decimator this works well in simulation, but the synthesized and fitted design shows a lot of glitches in SignalTap.

The timing analysis shows some horrendous timing issues between the 98.304MHz clock that is generated by a PLL, and some of the decimator registers.

Is there any way of improving this?

I am using Cyclone IV GX and Quartus 14.0.

Are there any design examples for DSP and ALTSHIFT_TAPS?

9 Replies

  • Altera_Forum

    If you have 96 taps for a decimate-by-8 filter, then you only need 96/8 = 12 multipliers.

    If your system clock is faster than the data input rate by, say, a factor of 2, then you can opt to use only 6 multipliers.
  • Altera_Forum

    I am decimating the sampling rate by 8. Sorry.

    So I need to perform 96 calculations per sample but the output of the decimating filter is read only on one sample in eight.
  • Altera_Forum

    You only need 12 multipliers.

    Decimation by 8 means you only need to produce an output once every 8 input samples, so you have 7 more time slots (clock ticks) in which to reuse the same 12 multipliers and add up the results (accumulate), ready for the next output.
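
The time-sharing described in this reply can be checked with a small behavioural model. This is only a sketch in Python, not synthesizable code: the function names, the integer test coefficients and the input stream are all made up for illustration. It assumes the 12 multipliers handle taps `tick, tick+8, ..., tick+88` on each of the 8 ticks, which is one possible ordering among several.

```python
# Behavioural model only (not synthesizable): checks that 12 multipliers,
# reused over 8 clock ticks, reproduce a 96-tap FIR decimated by 8.
TAPS, DECIM = 96, 8
MULTS = TAPS // DECIM  # 12 multipliers

def sample(x, n):
    """x[n], with zeros before the start and after the end of the stream."""
    return x[n] if 0 <= n < len(x) else 0

def fir_then_decimate(x, h):
    """Reference: full-rate 96-tap FIR, then keep every 8th output."""
    full = [sum(h[k] * sample(x, n - k) for k in range(TAPS))
            for n in range(len(x))]
    return full[::DECIM]

def decimate_shared(x, h):
    """12 products per tick, accumulated over the 8 ticks of each frame."""
    out = []
    for m in range(0, len(x), DECIM):      # one output per 8 input samples
        acc = 0
        for tick in range(DECIM):          # the 8 available time slots
            # On this tick the 12 multipliers take taps tick, tick+8, ..., tick+88.
            acc += sum(h[DECIM * j + tick] * sample(x, m - (DECIM * j + tick))
                       for j in range(MULTS))
        out.append(acc)
    return out

# Hypothetical test data: small integer coefficients and a sawtooth-like input.
h = [(k % 7) - 3 for k in range(TAPS)]
x = [(n * 5) % 23 - 11 for n in range(256)]
assert decimate_shared(x, h) == fir_then_decimate(x, h)
```

The accumulator is cleared at the start of each frame and read after the eighth tick, so only 12 products exist at any one time.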
  • Altera_Forum

    Ah yes, I see. As long as I keep track of the values at the 8 time slots I can accumulate and subtract the relevant value from the previous set.

    That makes life easier. Thank you!!!!!!
  • Altera_Forum

    If you want an even easier life, you can use only 6 multipliers: with symmetric coefficients, pre-add the two samples that share a coefficient, then multiply once.
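
A sketch of why the pre-add halves the multiplier count, again as a Python behavioural model with made-up names and test data. It assumes a symmetric 96-tap filter (`h[k] == h[95 - k]`), so only 48 distinct coefficients remain and 48 / 8 = 6 multipliers per tick are enough.

```python
# Behavioural model only: symmetric 96-tap FIR, decimate by 8, using a
# pre-adder so that 6 multipliers per tick cover all 96 taps.
TAPS, DECIM = 96, 8
HALF = TAPS // 2        # 48 distinct coefficients
MULTS = HALF // DECIM   # 6 multipliers

def sample(x, n):
    """x[n], with zeros outside the stream."""
    return x[n] if 0 <= n < len(x) else 0

def direct_output(x, h, n):
    """Reference: all 96 products for the output at input index n."""
    return sum(h[k] * sample(x, n - k) for k in range(TAPS))

def preadd_output(x, h_half, n):
    """6 products per tick; each product covers two mirrored taps."""
    acc = 0
    for tick in range(DECIM):
        for j in range(MULTS):
            k = DECIM * j + tick  # k runs over 0..47 across the 8 ticks
            # Pre-add the two samples sharing coefficient k (result is one bit wider).
            pre = sample(x, n - k) + sample(x, n - (TAPS - 1 - k))
            acc += h_half[k] * pre
    return acc

# Hypothetical symmetric coefficients and input stream.
h_half = [(k * k) % 11 - 5 for k in range(HALF)]
h = h_half + h_half[::-1]
x = [(n * 5) % 23 - 11 for n in range(256)]
direct = [direct_output(x, h, n) for n in range(0, len(x), DECIM)]
shared = [preadd_output(x, h_half, n) for n in range(0, len(x), DECIM)]
assert shared == direct
```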
  • Altera_Forum

    Yes. The (software) DSP guy has just explained that to me. The 12-multiplier decimator is working great, thank you!!!!

    I see I can apply a similar but easier method to interpolation just by ordering the coefficients correctly.
  • Altera_Forum

    Pre-add increases the multiplier input width by one bit, and may cost extra resources if the multipliers don't support the increased bit width.

    With interpolation there are no spare clock ticks, but there are zeros between the input samples (physically inserted or assumed), so you arrange the coefficients as polyphases by taking regularly spaced coefficients from the prototype filter. Pre-add may fail here due to loss of symmetry, except for some values of the interpolation factor.
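
The polyphase arrangement for interpolation can be sketched the same way. This Python model is illustrative only: the interpolation factor `L = 4`, the 24-tap symmetric prototype and all names are assumptions chosen to keep the example small. It checks that the polyphase form matches zero-stuffing followed by the full filter, and that for this particular example the individual phases are not symmetric but come in mirror-image pairs.

```python
# Behavioural model only. L and the 24-tap prototype are hypothetical.
L = 4

def sample(x, n):
    """x[n], with zeros outside the stream."""
    return x[n] if 0 <= n < len(x) else 0

def zero_stuff_then_filter(x, h):
    """Reference: insert L-1 zeros after each sample, then run the full FIR."""
    up = []
    for v in x:
        up.extend([v] + [0] * (L - 1))
    return [sum(h[k] * sample(up, n - k) for k in range(len(h)))
            for n in range(len(up))]

def polyphase_interpolate(x, h):
    """Phase p uses every L-th coefficient starting at p; no zeros are multiplied."""
    phases = [h[p::L] for p in range(L)]
    out = []
    for m in range(len(x)):
        for p in range(L):  # L outputs per input sample
            out.append(sum(c * sample(x, m - j)
                           for j, c in enumerate(phases[p])))
    return out

half = list(range(1, 13))
h = half + half[::-1]  # symmetric prototype filter
x = [(n * 3) % 17 - 8 for n in range(40)]
assert polyphase_interpolate(x, h) == zero_stuff_then_filter(x, h)

phases = [h[p::L] for p in range(L)]
# Each phase is the reverse of its partner phase ...
assert all(phases[p] == phases[L - 1 - p][::-1] for p in range(L))
# ... but none is symmetric on its own, so a per-phase pre-add does not apply here.
assert not any(ph == ph[::-1] for ph in phases)
```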
  • Altera_Forum

    The same problem reared its head in the 12-multiplier version. A silly mistake on my part.

    I took the 12 multiplier outputs and wrote

    assign res = prod1 + prod2 + prod3 + prod4 + prod5 + prod6 + prod7 + prod8 + prod9 + prod10 + prod11 + prod12;

    This creates a silly combinational path: all twelve additions have to settle within a single clock period.

    I used PARALLEL_ADD instead and voilà!
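
One common way to break a long adder path like this is a registered adder tree, which trades latency for a shorter path per clock. The Python model below is a generic cycle-by-cycle sketch of that idea. It is not a description of how PARALLEL_ADD is built, and the thread does not say which options were used. The class and function names are hypothetical.

```python
# Cycle-level model of a registered (pipelined) adder tree for 12 products.
def reduce_pairs(values):
    """One adder level: add neighbours pairwise, pass an odd leftover through."""
    return [sum(values[i:i + 2]) for i in range(0, len(values), 2)]

class PipelinedAdderTree:
    def __init__(self, n_inputs):
        # One register bank per level: 12 inputs give banks of 6, 3, 2 and 1.
        self.regs = []
        n = n_inputs
        while n > 1:
            n = (n + 1) // 2
            self.regs.append([0] * n)

    def clock(self, products):
        """One rising edge: every bank loads from the bank before it."""
        sources = [products] + self.regs[:-1]  # old register values
        self.regs = [reduce_pairs(src) for src in sources]
        return self.regs[-1][0]

tree = PipelinedAdderTree(12)
LEVELS = len(tree.regs)  # 4 register stages for 12 inputs

# Hypothetical product values, one frame of 12 per clock.
frames = [[(t * 12 + i) % 9 - 4 for i in range(12)] for t in range(10)]
outputs = [tree.clock(f) for f in frames]

# The sum of a frame appears LEVELS - 1 calls after the call that loaded it.
assert all(outputs[t] == sum(frames[t - (LEVELS - 1)])
           for t in range(LEVELS - 1, len(frames)))
```

Each clock period now contains at most one addition, at the cost of a fixed latency that the rest of the design (for example, the point where the accumulator is read) has to account for.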
  • Altera_Forum

    I think the answer is 12. If this is wrong, please tell me the correct answer. :D :D