Forum Discussion

Altera_Forum
12 years ago

Classic DMA core speed?

Hi everybody,

Please allow me one silly question. Let me explain a small problem. I have one DMA core in Qsys, one 16-bit PIO, and an SDRAM controller (16-bit data bus width). All peripherals are running at 100 MHz. When I force a transfer of 100 DMA transactions, it takes 200 clock cycles. Is that correct? I thought that if I clock the DMA at 100 MHz, it would complete 100 DMA transactions in 100 clock cycles.

One solution could be to increase FCLK to 200 MHz, but my SDRAM chips are rated for 133 MHz... I am a little confused about the speeds and capabilities of this configuration.

Could someone please give me a short explanation?

Thank you,

Jan.
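
A quick back-of-the-envelope check of the numbers in the question, as a minimal sketch rather than a model of the actual DMA core. It assumes each DMA transaction moves one 16-bit word (2 bytes) and, for the last case, that the 2:1 cycle ratio would stay the same at a higher clock; neither is confirmed in the thread. The function name `dma_throughput` is made up for illustration.

```python
def dma_throughput(transfers, cycles, clock_hz, bytes_per_transfer=2):
    """Return (elapsed_seconds, bytes_per_second) for one burst of transfers.

    Assumes every transfer moves `bytes_per_transfer` bytes (2 for a 16-bit
    bus) and that `cycles` is the total clock count for the whole burst.
    """
    elapsed = cycles / clock_hz
    rate = transfers * bytes_per_transfer * clock_hz / cycles
    return elapsed, rate


# Measured case from the question: 100 transfers in 200 cycles at 100 MHz.
measured = dma_throughput(100, 200, 100_000_000)

# What the poster expected: one transfer per clock cycle.
ideal = dma_throughput(100, 100, 100_000_000)

# Hypothetical: same 2:1 ratio with the clock raised to the 133 MHz SDRAM limit.
faster = dma_throughput(100, 200, 133_000_000)

for label, (elapsed, rate) in [("measured", measured),
                               ("ideal", ideal),
                               ("133 MHz", faster)]:
    print(f"{label}: {elapsed * 1e6:.2f} us, {rate / 1e6:.0f} MB/s")
```

Under these assumptions the measured burst works out to 100 MB/s against an ideal 200 MB/s, and raising the clock to 133 MHz at the same ratio would give about 133 MB/s, still short of one transfer per cycle at 100 MHz.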

4 Replies

  • Altera_Forum

    Hi Jan:

    The DMA core basically just allows you to initiate a transaction of a specific size to a specific location with minimal CPU overhead. The latency required for the transaction depends on many factors, including arbitration cycles, RAM paging, etc. A 2 to 1 ratio sounds very reasonable.

    This is much faster than a CPU would be able to copy the bytes to the memory via instructions. If you truly need a 1 to 1 ratio for a limited number of cycles, put the data into a FIFO or TCM (Tightly Coupled Memory).

    Pete
  • Altera_Forum

    Thanks for the answer... When I was reading about TCM, I saw that it can only be used with an internal on-chip slave, so I think my external SDRAM cannot be used as TCM. Am I right?

    So the only thing I can do is increase Fclk to the limit of what the SDRAM supports, to get even more speed out of the DMA?

    Thanks for the advice.
  • Altera_Forum

    Hi Jan:

    TCM is on-chip memory, so the size will be limited.

    Is your SDRAM just 16 bit wide as well, or is it wider?

    If you are writing to, say, a 64-bit interface 16 bits at a time, it is going to be more efficient to pack the data into 64-bit words and then use the DMA to transfer 64 bits at a time.

    It's something to look into at least.

    One thing to consider when using a different clock for the SDRAM than for the rest of your logic: this will have a latency penalty as well, since it requires clock-crossing logic to prevent metastability.

    Pete
  • Altera_Forum

    Thank you for the reply. Now it is clear. I suppose that with my HW configuration I will just stay with the 2:1 ratio.

    Have a nice day.
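
For anyone landing here later: Pete's suggestion above about packing 16-bit data into 64-bit words before handing it to the DMA can be sketched roughly as below. This is only an illustration in Python; in a real design the packing would be done in logic or a FIFO in front of the DMA. The lane order (first sample in the lowest 16 bits) and the zero padding of a trailing partial word are assumptions, and `pack_samples` is a hypothetical helper, not part of any Altera API. It also only pays off when the memory interface is wider than 16 bits, which is not the case for the SDRAM in this thread.

```python
def pack_samples(samples, sample_bits=16, word_bits=64):
    """Pack narrow samples into wide words, first sample in the low bits.

    A trailing partial group is zero-padded in its upper bits.
    """
    if word_bits % sample_bits != 0:
        raise ValueError("word_bits must be a multiple of sample_bits")

    per_word = word_bits // sample_bits   # 4 samples per 64-bit word
    mask = (1 << sample_bits) - 1         # keep only the low 16 bits

    words = []
    for start in range(0, len(samples), per_word):
        word = 0
        group = samples[start:start + per_word]
        for lane, sample in enumerate(group):
            word |= (sample & mask) << (lane * sample_bits)
        words.append(word)
    return words


samples = [0x1111, 0x2222, 0x3333, 0x4444, 0x5555]
words = pack_samples(samples)
print([hex(w) for w in words])

# 100 16-bit samples become 25 wide transfers instead of 100 narrow ones.
print(len(pack_samples(list(range(100)))))
```

The point of the sketch is the transfer count: four 16-bit samples share one 64-bit word, so the DMA has a quarter as many transactions to arbitrate for the same amount of data.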