Forum Discussion

Altera_Forum
14 years ago

SGDMA data transfer question.

Hi,

I ran a memory transfer test from RAM to SDRAM using SGDMA and measured the speed with the performance counter core. It shows a transfer rate of up to 370 MB/s with all content copied correctly, which seems too good to be true!

When I used the same code for RAM to SRAM, the transfer rate only reached 50 MB/s!

Why? I thought SRAM should be faster than SDRAM. I was testing my code on a DE2-115. Maybe the SRAM is slower than the SDRAM on the DE2-115?

PS: Both cases were running at 100 MHz (the same clock speed as the CPU).
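
To compare the two figures, it helps to express them as bytes moved per clock cycle. The sketch below is a minimal calculation, assuming "MB/s" means 10^6 bytes per second; the function names are hypothetical and not part of any Altera API.

```c
#include <assert.h>

/* Average bytes moved per clock cycle for a measured throughput.
 * Assumption: 1 MB/s = 10^6 bytes/s and 1 MHz = 10^6 cycles/s,
 * so the 10^6 factors cancel. */
static double bytes_per_clock(double mbytes_per_s, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return mbytes_per_s / clock_mhz;
}

/* Average clock cycles spent per word of word_bytes bytes. */
static double clocks_per_word(double bytes_per_clk, double word_bytes)
{
    assert(bytes_per_clk > 0.0);
    return word_bytes / bytes_per_clk;
}
```

At 100 MHz, 370 MB/s works out to 3.7 bytes per clock, close to one 32-bit word every cycle, while 50 MB/s is 0.5 bytes per clock, or about 8 clocks per 32-bit word.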

Regards,

Michael

13 Replies

  • Altera_Forum

    On-chip memory ought to be able to read or write a location every clock, although the read data isn't available until the following clock cycle. That certainly happens for the Nios 'tightly coupled data memory'. It would require Avalon burst accesses.

    Were you measuring SDRAM reads or writes? They will behave differently.

    My experiments suggest that writes are 'posted' (i.e. acked immediately) unless the logic is busy. The first write is then actioned, and subsequent writes are held in a 'line buffer' (probably 32 bytes, maybe 64) provided they address adjacent locations. When the underlying write completes, the contents (if any) of the line buffer are written out. So writes only stall if they address a location that can't be buffered.

    Reads fetch an entire line buffer, then return the requested location. Further reads of nearby addresses return data from the buffer. Fully random reads take about 16 clocks each.
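
The read behaviour described above can be approximated with a toy cost model. This is only a sketch under stated assumptions: a single 32-byte line buffer, 16 clocks for any read that refills it, and 1 clock for a read served from the buffer. The 1-clock hit cost and all names here are illustrative guesses, not measured values or controller parameters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES  32u  /* assumed line-buffer size ("probably 32 bytes") */
#define MISS_CLOCKS 16u  /* assumed cost of a read that refills the buffer */
#define HIT_CLOCKS   1u  /* assumed cost of a read served from the buffer  */

/* Estimate the clocks needed for n reads at the given byte addresses,
 * modelling one line buffer that holds the most recently fetched line. */
static uint64_t estimate_read_clocks(const uint32_t *addrs, size_t n)
{
    uint64_t clocks = 0;
    uint32_t cur_line = 0;
    int valid = 0;  /* the buffer holds nothing before the first read */

    assert(addrs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        uint32_t line = addrs[i] / LINE_BYTES;
        if (valid && line == cur_line) {
            clocks += HIT_CLOCKS;      /* data already in the buffer */
        } else {
            clocks += MISS_CLOCKS;     /* fetch a whole new line */
            cur_line = line;
            valid = 1;
        }
    }
    return clocks;
}
```

Under this model, sixteen sequential 32-bit reads touch two lines and cost 2 × 16 + 14 × 1 = 46 clocks, while sixteen reads that each land in a different line cost 16 × 16 = 256 clocks.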
  • Altera_Forum

    Bursting can potentially have two negative effects:

    1) In SOPC Builder, burst adaptation was inefficient, especially with small burst sizes. The burst adapter would always insert a single dead cycle at the start of a burst, so with a burst count of 2 you are looking at 3 clock cycles per burst at best. Qsys doesn't have this limitation; it's just an SOPC Builder thing.

    2) With DMAs (not sure about the one you are using), the DMA engine typically waits for a full burst to be read and buffered before issuing a burst write. The bigger the burst length, the longer the write master has to wait before issuing the burst.
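
Both effects can be put into rough numbers. The helpers below are a back-of-the-envelope sketch with hypothetical names: the first two assume exactly one dead cycle per adapted burst and one clock per beat, and the third assumes the read side delivers one beat per clock after a fixed `read_latency`, which is an illustrative parameter rather than a documented figure.

```c
#include <assert.h>

/* Best-case clocks per burst through the SOPC Builder burst adapter:
 * one dead cycle plus one clock per beat. */
static unsigned adapted_burst_clocks(unsigned burst_count)
{
    assert(burst_count > 0);
    return burst_count + 1u;
}

/* Best-case bus efficiency in percent, rounded down. */
static unsigned adapted_burst_efficiency_pct(unsigned burst_count)
{
    return (100u * burst_count) / adapted_burst_clocks(burst_count);
}

/* Clocks before a store-and-forward DMA can start its burst write:
 * the whole burst must be read and buffered first. */
static unsigned write_start_delay_clocks(unsigned burst_count,
                                         unsigned read_latency)
{
    assert(burst_count > 0);
    return read_latency + burst_count;
}
```

A burst count of 2 gives 3 clocks per burst (66% efficiency at best), while a burst count of 16 gives 17 clocks (94%). The longer burst amortises the dead cycle better, but it also makes the write master wait longer before it can start.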

    Assuming you are using a modern SDRAM controller from Altera, you can change the local burst length of the slave port. I typically turn that down to 1 and let the controller combine sequential accesses into a single off-chip burst. Then you don't need to worry about enabling bursting in the master logic or about any burst adapters that might be created as a result.