Forum Discussion
Altera_Forum
11 years ago

--- Quote Start ---

I've successfully implemented the mSGDMA controller. It's quite simple once you know how to do it. If you need DMA to transfer data from FPGA to HPS SDRAM or from HPS SDRAM to FPGA, the mSGDMA module is much better than the built-in DMA controller.

System description:

1) The mSGDMA module is connected to the F2S (FPGA --> SDRAM) bridge, not to the F2H AXI.
2) mSGDMA setup:
   Mode: Memory -> Stream or Stream -> Memory
   Data width = 64 bits
   Data FIFO depth = 64
   Descriptor FIFO depth = 64
   Transfer length = 1 kB or 16 kB
   Burst enabled, max. burst count = 16

Measurement results:

mSGDMA, transfer length = 1 kB:
   1024 packets * 1 kB, Memory -> Stream: approx. 290 MBytes/s (good!)
   1024 packets * 1 kB, Stream -> Memory: approx. 290 MBytes/s (good!)

mSGDMA, transfer length = 16 kB:
   64 packets * 16 kB, Memory -> Stream: approx. 378 MBytes/s (good!)
   64 packets * 16 kB, Stream -> Memory: approx. 378 MBytes/s (good!)

Built-in DMA:
   1 packet of 1 MByte, Memory -> Memory: approx. 423 MBytes/s (very good!)
   1 packet of 1 MByte, Memory -> Register: approx. 38.4 MBytes/s (poor!)
   1024 packets of 1 kB, Memory -> Memory: approx. 16.1 MBytes/s (very poor!)
   1024 packets of 1 kB, Memory -> Register: approx. 11.7 MBytes/s (very poor!)

The last results are due to the large delay between consecutive DMA transactions.

Important notes about the mSGDMA and F2S implementation (poorly documented):

1) You must service the msgdma_valid and msgdma_ready signals on the FPGA side (i.e. connect these signals together).
2) You must define ALT_BRIDGE_PROVISION_F2S_SUPPORT=1 in your Makefile (see the HWLib help).
3) You must link the F2S assembly file (e.g. alt_bridge_f2s_gnu.s) in your Makefile (see the HWLib help).
4) You must initialize the F2S bridge in your program, i.e. alt_bridge_init(ALT_BRIDGE_F2S, NULL, NULL).

Regards
-jaro

--- Quote End ---

Thanks for your benchmarks.

One of the shortcomings of the alt_dma_*_to_*() APIs is that they reassemble the entire DMA program whenever they are called. If you are running the transfers on the same addresses, you can call alt_dma_*_to_*() for the first transfer, keep the ALT_DMA_PROGRAM_t program buffer, then call alt_dma_channel_exec() repeatedly with that buffer. This greatly reduces the time spent reassembling (essentially) the same transfer. Clearly, if your transfer addresses change, some adjustments need to be made.

Another trick to improve performance when you do need to adjust addresses is to update the DAR (destination address register) or SAR (source address register) using the alt_dma_program_update_reg() API; however, there can be many caveats. The new SAR and DAR values need to be congruent mod 8 to the original addresses. If you have caching enabled, that adds another level of complexity and the method may not work. Test your use case before fully relying on this method :eek:.

fdh
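As a rough illustration of the mod 8 caveat in the reply above, here is a minimal sketch of a pre-check you might run before patching SAR/DAR in a saved program buffer. It assumes "mod 8 congruent" means the new address leaves the same remainder modulo 8 as the address the program was originally assembled for. The helper names dma_addr_congruent and dma_program_can_retarget are hypothetical, not part of HWLib, and the sketch deliberately makes no HWLib calls.

```c
#include <stdbool.h>
#include <stdint.h>

/* Modulus quoted in the reply: SAR/DAR must stay congruent mod 8. */
#define DMA_ADDR_MODULUS 8u

/* Hypothetical helper: true when new_addr leaves the same remainder
 * modulo 8 as the address the DMA program was first assembled with. */
static inline bool dma_addr_congruent(uint32_t orig_addr, uint32_t new_addr)
{
    return (orig_addr % DMA_ADDR_MODULUS) == (new_addr % DMA_ADDR_MODULUS);
}

/* Hypothetical helper: true when both the source and the destination
 * pass the congruence check. If this returns false, the assumption is
 * that the program should be reassembled with alt_dma_*_to_*() rather
 * than patched in place. */
static inline bool dma_program_can_retarget(uint32_t orig_src, uint32_t orig_dst,
                                            uint32_t new_src, uint32_t new_dst)
{
    return dma_addr_congruent(orig_src, new_src) &&
           dma_addr_congruent(orig_dst, new_dst);
}
```

When the check passes, the idea would be to update SAR and DAR with alt_dma_program_update_reg() and rerun the saved buffer with alt_dma_channel_exec(); otherwise fall back to a full alt_dma_*_to_*() call. This only covers the alignment caveat. It says nothing about the cache issue mentioned above, so it still needs testing on the actual use case.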