Altera_Forum
Honored Contributor
16 years ago

Shift Register (RAM-based) megafunction uses too many M4K blocks!?

I used the Shift Register (RAM-based) megafunction to generate a shift register with 128 taps, a tap distance of 4, and a 12-bit data bus. The MegaWizard reports that this shift register needs 43 M4K blocks.

According to the datasheet, every M4K block has 4608 bits and can be configured as 256x16, so 2 M4K blocks should be enough for the shift register described above (128 x 4 x 12 = 6144 bits). Why does it need 43 M4K blocks?

Is there any way to reduce the M4K usage of this shift register?

Thanks!

3 Replies

  • Altera_Forum
    Honored Contributor

    A RAM port can access only one address at a time. Altera FPGAs have dual-port RAM, but one port is needed as the input (write) port. So each shift register output requires its own RAM block data lines. Without considering all the details, I assume a 128 x 36 configuration split into 3, so one M4K block serves three 12-bit outputs. 128 outputs then require 43 blocks.

    To reduce the number of RAM blocks, the RAM must be time multiplexed, if that is applicable. I don't see that the RAM-based shift register MegaFunction supports this option, but you can design an application-specific solution.
  • Altera_Forum
    Honored Contributor

    The RAMs run into restrictions based on either the number of bits or the number of ports, and you're clearly running into the latter. As FvM stated, if one address port is used to write in new data, the other address port can read out, at most, 36 bits at a time. So each M4K is really only used 4 words deep and 36 bits wide (the way the taps work is that some of the outputs are fed back into the write side, just shifted over a word). Time multiplexing will help increase that, but you'll still use a lot of RAM. If you can 2x time multiplex it, then you'll need ~24 blocks. (And if running with a clock at 2x the rate, it might be simpler to say the taps are 8 spaces apart, since you have bits to spare and it works correctly in the main clock domain.) Another option is to use smaller RAMs, either M512s or MLABs, depending on the family, which have a higher "# of ports per bits" ratio.

    If you're feeding into a DSP block, it might also be worth checking whether there's something in that architecture (like a shift on the inputs) that will do what you want. (I know there's a shifter, but it won't store values between shifts; there might be a better way around this...)
  • Altera_Forum
    Honored Contributor

    Thank you for your explanation! But it is still a bit hard to understand thoroughly.
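
For anyone following the thread, the block counts discussed above can be reproduced with a short calculation. This is only a rough sketch in Python, not how the MegaWizard actually computes its estimate: the function names and the `multiplex` parameter are made up for illustration, and it assumes the 4608-bit M4K size and the 36-bit maximum read width mentioned in the replies.

```python
import math

M4K_BITS = 4608          # bits per M4K block, parity bits included (assumed)
M4K_MAX_READ_WIDTH = 36  # widest read port, per the 128 x 36 configuration in the replies


def blocks_by_capacity(taps, tap_distance, width):
    """Blocks needed if only the number of stored bits mattered."""
    total_bits = taps * tap_distance * width
    return math.ceil(total_bits / M4K_BITS)


def blocks_by_read_port(taps, width, multiplex=1):
    """Blocks needed when every tap must be read out on each data clock.

    multiplex is the (hypothetical) number of RAM reads that fit into
    one data clock; 1 means no time multiplexing.
    """
    taps_per_block = (M4K_MAX_READ_WIDTH // width) * multiplex
    return math.ceil(taps / taps_per_block)


print(blocks_by_capacity(128, 4, 12))   # 2  (the original poster's expectation)
print(blocks_by_read_port(128, 12))     # 43 (what the MegaWizard reports)
print(blocks_by_read_port(128, 12, 2))  # 22 (ideal figure for 2x time multiplexing)
```

The 2x result is an idealized lower bound under these assumptions; the ~24 quoted in the reply is that poster's own rough estimate, and a real design may need extra logic or blocks for the multiplexing itself.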