Forum Discussion

Altera_Forum
16 years ago

ALTSYNCRAM: 1 clock cycle read/write performance

Hi all,

I need a 128-byte RAM block, internal to the FPGA (Cyclone II). I would like to achieve 1-clock read and 1-clock write performance. In Xilinx this is done with a RAM that has a synchronous write and an asynchronous read. In Altera I need 2 clock cycles to read data: on the first clock the address is registered, and the data appears in the second clock cycle.

Is it possible in an Altera FPGA to read from the RAM in only 1 clock cycle?

Thanks for your help!

10 Replies

  • Altera_Forum

    I disabled the output port registers, but the address input port is still registered, so I need 2 clocks for a read cycle. On the first clock the address is registered, and the data appears on the second clock.
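To make the timing being described concrete, here is a minimal behavioral sketch in Python (not HDL, and not the actual ALTSYNCRAM implementation). It assumes an idealized RAM whose address is captured on the clock edge and whose output is a combinational function of that registered address. The names `RegisteredAddressRam` and `read_trace` are made up for illustration.

```python
class RegisteredAddressRam:
    """Idealized RAM: the address is captured on the clock edge and the
    output is a combinational function of the registered address."""

    def __init__(self, depth=128):
        self.mem = [0] * depth
        self.addr_reg = 0

    def clock(self, addr, we=False, wdata=0):
        """Apply one rising clock edge with the given inputs."""
        if we:
            self.mem[addr] = wdata & 0xFF  # byte-wide storage
        self.addr_reg = addr

    @property
    def q(self):
        """Data output as seen after the most recent clock edge."""
        return self.mem[self.addr_reg]


def read_trace(ram, addresses):
    """Present one address per cycle and record q as seen during each cycle."""
    seen = []
    for addr in addresses:
        seen.append(ram.q)  # still reflects the previously registered address
        ram.clock(addr)     # this edge registers the new address
    seen.append(ram.q)      # data for the last address, one cycle later
    return seen
```

In this model the data for an address only becomes visible in the cycle after the edge that registered it, which is the "second clock" described above. A new address can still be presented every cycle, so throughput stays at one read per clock.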

  • Altera_Forum

    I guess the available options on the RAM block depend on the FPGA family, and you may not be able to remove the input registers on some families.

    128 bytes isn't much. You should be able to build your memory from registers instead of ALTSYNCRAM blocks, and in that case you can have it without registered inputs or outputs.
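As a rough illustration of the register-based alternative, the Python sketch below models a 128-byte memory built from plain registers, with a synchronous write and a combinational (unclocked) read. It is a behavioral model only; the class and function names are hypothetical, and a real design would describe this in HDL and pay for it in logic elements.

```python
class RegisterFileRam:
    """Memory built from registers: synchronous write, combinational read."""

    def __init__(self, depth=128):
        self.regs = [0] * depth

    def read(self, addr):
        """Combinational read: the result is available with no clock edge."""
        return self.regs[addr]

    def clock(self, we, addr, wdata):
        """Rising clock edge: store wdata if write enable is set."""
        if we:
            self.regs[addr] = wdata & 0xFF  # byte-wide storage


def increment_in_place(ram, addr, cycles):
    """Each cycle: read a byte, add one, write it back on the same edge."""
    for _ in range(cycles):
        value = ram.read(addr)            # read completes within the cycle
        ram.clock(True, addr, value + 1)  # write happens at the end of it
    return ram.read(addr)
```

Because the read needs no clock edge of its own, a read-modify-write fits in a single cycle here, which is the behavior the original question asks for.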
  • Altera_Forum

    Something seems off. There are no asynchronous RAMs (MLABs are, but they're treated synchronously), but there is also nothing that requires two registers. You register the address, and the data comes out combinatorially later. There is no need for a second clock.

  • Altera_Forum

    Rysc, then with Altera RAM I need 2 clock cycles for a read.

    On the 1st clock the address is registered, and after that (during the 2nd clock) the data comes out (combinatorially or not :)).

    Am I wrong? Is it possible to read data from an Altera RAM in ONLY 1 clock?
  • Altera_Forum

    I guess it's semantics, but if you drew the memory access as a schematic, there is only one register along the path, so that is generally referred to as one clock cycle of latency. You're basically saying you need the access to complete in less than one clock cycle, so that the data is out before the next clock edge; in other words, an asynchronous read. What device is it, and how fast is the clock? If the clock is slow enough, you could read on the falling edge of the clock, so the data is available half a cycle earlier.
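Whether the falling-edge idea is feasible comes down to a timing budget. The Python sketch below is a back-of-the-envelope check, not a substitute for the timing analyzer. It assumes a 50% duty cycle and ignores clock skew, and every parameter name and delay value is a hypothetical placeholder rather than a Cyclone II datasheet figure.

```python
def falling_edge_read_fits(clock_mhz, t_addr_to_ram_ns,
                           t_ram_clk_to_q_ns, t_q_to_reg_ns):
    """Return True if a RAM clocked on the falling edge fits in one period.

    t_addr_to_ram_ns:  address launch (rising edge) to RAM input, incl. setup
    t_ram_clk_to_q_ns: RAM clock edge to data output
    t_q_to_reg_ns:     RAM output to the capturing register, incl. setup
    """
    half_period_ns = (1000.0 / clock_mhz) / 2
    # First half cycle: the address must be stable at the RAM before the falling edge.
    first_half_ok = t_addr_to_ram_ns <= half_period_ns
    # Second half cycle: the data must reach the next register before the rising edge.
    second_half_ok = t_ram_clk_to_q_ns + t_q_to_reg_ns <= half_period_ns
    return first_half_ok and second_half_ok
```

With the made-up delays of 5 ns, 8 ns and 4 ns, the check passes at 25 MHz (20 ns per half cycle) and fails at 100 MHz (5 ns per half cycle), which is why the clock speed matters for this trick.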

  • Altera_Forum

    Semantics aside (I never talked about "cycles of latency"; I talked about the number of clock cycles, i.e. performance), my application is a soft microprocessor, and the performance with the Altera RAM I use (Cyclone II) is half of that obtained with Xilinx RAM.

  • Altera_Forum

    I have always heard clock cycles referred to as the number of clocks in your datapath. (That way it always adds up, i.e. two components put together with clock cycles of 2 would have a total clock cycle delay of 4, but in your methodology it would only be 3, which is counterintuitive. That's why I was confused.) But yes, if the critical path in your design is solely dependent on an asynchronous read from memory, then Cyclone II's embedded memory is not good for your application.
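One way to read the two counting conventions in this exchange is sketched below in Python. This is an interpretation of the posts, not something either poster wrote: "latency" counts the registers along the path, while the other convention counts clock edges including the one that launches the access. Both function names are made up.

```python
def latency_in_cycles(registers_on_path):
    """Convention 1: latency equals the number of registers on the path."""
    return registers_on_path


def edges_counted(registers_on_path):
    """Convention 2: count the launching edge plus one edge per register."""
    return registers_on_path + 1


# Two stages in series, each with a single register on its path.
one_stage = 1
two_stages = 2

# Convention 1 composes by simple addition: 1 + 1 == 2.
additive = latency_in_cycles(one_stage) * 2 == latency_in_cycles(two_stages)

# Convention 2 does not: each stage alone counts as 2, but the pair counts as 3.
per_stage_sum = edges_counted(one_stage) * 2
combined = edges_counted(two_stages)
```

Under this reading, a single registered-address RAM is "1 cycle of latency" in the first convention and "2 clocks" in the second, so both posters are describing the same hardware.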

  • Altera_Forum

    Note that Stratix III/IV have logic that can be used as memory, similar to the CLB. In general it is still treated as synchronous, but I think it can do asynchronous reads too (i.e., test it out before going down that road). I've converted a number of Xilinx designs where the user complained about not having asynchronous reads, only to find they had a register right next to the RAM that could be absorbed, but it sounds like your architecture really depends on it. I also wonder if you could do some sort of "trick", like maybe having two copies of the RAM, where you ping-pong back and forth (I don't fully get what you're doing, so that may not make any sense).
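The "absorbed register" remark can be illustrated with a small Python model. It is a simplified sketch that covers reads only: it ignores writes, and read-during-write behavior to the same address can differ between the two structures. The function names are hypothetical.

```python
def async_ram_plus_output_register(mem, addresses):
    """Asynchronous-read RAM followed by a register on its data output."""
    trace = []
    for addr in addresses:
        q = mem[addr]    # the register captures the combinational read at this edge
        trace.append(q)  # value visible after the edge
    return trace


def registered_address_ram(mem, addresses):
    """RAM that registers the address and drives the data out combinationally."""
    trace = []
    for addr in addresses:
        addr_reg = addr              # the edge captures the address instead
        trace.append(mem[addr_reg])  # value visible after the edge
    return trace
```

For any read-only address sequence the two functions return the same trace, so a design that already registers the output of an asynchronous-read RAM loses nothing by moving that register to the address side. A design that consumes the read data combinationally in the same cycle has no such register to move.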