Forum Discussion

Altera_Forum
Honored Contributor
13 years ago

mSGDMA and On-chip Memory

Hi all!

I'm creating a design that uses a Nios II processor with on-chip memory, and I want to be able to read from and write to that on-chip memory from custom Verilog.

I originally tried using the Altera MM templates. However, as discussed by other users (search for 'Avalon MM templates in Qsys' on this forum), I experienced problems, so I have switched to using the mSGDMA template.

My question is: which ports should I be exporting to use in my custom Verilog?

Thanks, probably a simple question for someone!

18 Replies

  • Altera_Forum
    Honored Contributor

    Even though this design is for Arria 10 SoC, I would take a look at it to get a ballpark estimate of the performance you can expect from Cyclone V SoC: https://www.altera.com/support/support-resources/design-examples/soc/fpga-to-hps-bridges-design-example.html The design uses a bare-metal program to control a number of mSGDMAs and pattern generator/checker cores in the FPGA to move data back and forth and measure the performance.

    In the documentation subdirectory you'll find an Excel spreadsheet with the numbers collected. The data shown is for the FPGA operating at 250 MHz with 128-bit ports into the HPS and HPS SDRAM ports. In Cyclone V SoC, if you have unidirectional data, you could gang all the F2S ports together into a single 256-bit data path and move bulk data through it.
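As a rough way to put the port widths above into numbers, the sketch below computes the theoretical peak bandwidth of a port, assuming one full-width data beat per clock cycle and no stalls. The function name is hypothetical, and the 250 MHz clock is taken from the Arria 10 example; a Cyclone V design may not close timing at that frequency, so treat the 256-bit figure as an upper bound for comparison only. Measured throughput will be lower.

```python
def peak_bandwidth_bytes_per_s(width_bits: int, clock_hz: float) -> float:
    """Theoretical peak: one full-width beat per clock cycle, no stalls."""
    if width_bits <= 0 or width_bits % 8 != 0:
        raise ValueError("port width must be a positive whole number of bytes")
    return (width_bits // 8) * clock_hz


if __name__ == "__main__":
    # 250 MHz is the clock quoted for the Arria 10 example (assumption for Cyclone V).
    for width in (128, 256):
        gb_s = peak_bandwidth_bytes_per_s(width, 250e6) / 1e9
        print(f"{width}-bit port at 250 MHz: {gb_s:.1f} GB/s peak")
```

Under these assumptions a 128-bit port at 250 MHz peaks at 4 GB/s, and a ganged 256-bit path would peak at 8 GB/s.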
  • Altera_Forum
    Honored Contributor

    Hello BadOmen,

    Many thanks for your reply. I am using bidirectional data, so could you please suggest another method?

    Your help would make my task easy :)
  • Altera_Forum
    Honored Contributor

    I think I need more information about what problem you are trying to solve, since all the buses between the FPGA and HPS are bidirectional. With Avalon-MM connectivity you can only issue a read or a write at any given time, whereas AXI allows simultaneous reads and writes.

  • Altera_Forum
    Honored Contributor

    Hi BadOmen,

    I am quite new to Quartus.

    Basically, I used Qsys to generate my HPS component and my system. I created two Avalon memory-mapped slave components (one for writes, one for reads). Through the lightweight HPS-to-FPGA bridge I can map the address of the first slave and write data to it, and I can read data from the second one from the HPS.

    The problem is that the lightweight HPS-to-FPGA bridge is only 32 bits wide, so I would now like to use the HPS-to-FPGA bridge, which is 128 bits wide.

    I did the same thing that I did for the lightweight bridge, but it doesn't work, and I don't know why.

    What is the difference between those two bridges? Is there something to add (configuration)?

    Could you please help me with this? It would be a great help!
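One difference that often matters when moving a slave from the lightweight bridge to the HPS-to-FPGA bridge is that the two bridges occupy different physical address windows as seen from the HPS. The sketch below assumes a Cyclone V SoC and its default address map (HPS-to-FPGA at 0xC0000000, lightweight HPS-to-FPGA at 0xFF200000); the helper name and dictionary keys are hypothetical, and the values should be checked against the HPS address map for the actual device.

```python
# Assumed Cyclone V SoC windows as seen from the HPS; verify against the device's
# HPS address map before relying on them.
BRIDGE_WINDOWS = {
    "h2f":   (0xC000_0000, 0x3C00_0000),  # HPS-to-FPGA bridge, 960 MB span
    "lwh2f": (0xFF20_0000, 0x0020_0000),  # lightweight HPS-to-FPGA bridge, 2 MB span
}


def slave_phys_addr(bridge: str, qsys_offset: int) -> int:
    """Physical address of a slave, given its Qsys base offset behind a bridge."""
    base, span = BRIDGE_WINDOWS[bridge]
    if not 0 <= qsys_offset < span:
        raise ValueError(f"offset 0x{qsys_offset:X} is outside the {bridge} window")
    return base + qsys_offset


if __name__ == "__main__":
    # The same Qsys offset lands at a different physical address on each bridge.
    for name in BRIDGE_WINDOWS:
        print(name, hex(slave_phys_addr(name, 0x10)))
```

If software still maps the lightweight bridge address after the slave has been reconnected to the HPS-to-FPGA bridge, accesses would not reach the slave, which is one possible (unconfirmed) explanation for the behaviour described above.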
  • Altera_Forum
    Honored Contributor

    The lightweight bridge is 32-bit because it's mostly meant for controlling IP (accessing control and status registers), whereas the H2F bridge is intended for higher-throughput memory transfer operations. That said, if you are looking to maximize performance, having the FPGA move data in and out of the HPS through the FPGA-to-SDRAM interface is going to be the fastest method for bulk data.

    In order for either of the H2F bridges to operate, they need to be mapped into the address space (registers in the system manager control this), they must be receiving an active clock, and they must be out of reset. The security of the bridge slave ports also needs to be set accordingly. By default the entire system is secure, so if you have been dividing the system into secure and non-secure regions, security could be getting in the way.

    That's about all I can tell you without knowing how it fails. Do you get a memory access error if you attempt to access the FPGA? Does the system crash when you attempt to access the FPGA? Does the MPU lock up when you attempt to access the FPGA, and if so, have you checked whether the transaction reaches the FPGA using SignalTap?
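To make the "out of reset" check above concrete, here is a minimal sketch that decodes a bridge reset register value into per-bridge flags. It assumes the Cyclone V reset manager layout (a `brgmodrst` register at 0xFFD0501C with bit 0 for hps2fpga, bit 1 for lwhps2fpga and bit 2 for fpga2hps, where a set bit means the bridge is held in reset); these addresses and bit positions are assumptions to confirm against the HPS Technical Reference Manual, and the function name is hypothetical. Reading the register itself (for example from U-Boot or a debugger) is left out.

```python
from typing import Dict

# Assumed Cyclone V reset manager register and bit layout; confirm in the HPS TRM.
RSTMGR_BRGMODRST_ADDR = 0xFFD0_501C
BRIDGE_RESET_BITS = {"hps2fpga": 0, "lwhps2fpga": 1, "fpga2hps": 2}


def bridges_in_reset(brgmodrst: int) -> Dict[str, bool]:
    """Return, per bridge, whether its reset bit is set in the register value."""
    return {name: bool((brgmodrst >> bit) & 1)
            for name, bit in BRIDGE_RESET_BITS.items()}


if __name__ == "__main__":
    # Example value only: 0x1 would mean the HPS-to-FPGA bridge is still in reset.
    for name, held in bridges_in_reset(0x1).items():
        print(f"{name}: {'in reset' if held else 'released'}")
```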
  • Altera_Forum
    Honored Contributor

    Hi BadOmen,

    Yes, I am using the H2F AXI bridge to maximize performance.

    I am attaching a screenshot of my Qsys component. I am not getting any error, and the system is not crashing, but I am not getting the required output, which is simply turning an LED on and off.

    Thanks in advance.
  • Altera_Forum
    Honored Contributor

    Hi BadOmen,

    Thanks for your reply. I am now able to use the H2F AXI bridge, which of course increases throughput. My second goal is to decrease latency, so could you please let me know which method I should use?

    To access my component I am using the /dev/mem method, which I think is increasing the latency. Could you please suggest another method?

    Many thanks in advance :)
  • Altera_Forum
    Honored Contributor

    Unfortunately, there is not much you can do to reduce the latency from the HPS into the FPGA from a hardware perspective.

    I suspect what you need is a kernel driver that talks to your hardware directly, because I think /dev/mem maintains a copy of the data and moves it to/from the destination, which adds an extra copy operation. Keep in mind I'm a hardware engineer, so I could be completely wrong. Your driver would mmap the region and provide APIs for accessing the hardware. I would search for online material about how to write a Linux device driver, because this information isn't Altera SoC specific and there is a lot of material on the web about it. You might find quite a bit of information on RocketBoards as well, for example: https://rocketboards.org/foswiki/view/documentation/ws3developingdriversforalterasoclinux
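The "mmap the region and provide APIs" shape mentioned above can be illustrated in userspace. The sketch below is not a kernel driver; it is a small wrapper, with a hypothetical class name, that maps a page-aligned window of a device file and exposes 32-bit read and write helpers. It assumes a Linux system where the file is `/dev/mem` or a UIO node, root privileges, and a little-endian register layout. Note that `struct` copies bytes through the buffer interface, so a helper call is not guaranteed to appear on the bus as a single 32-bit transaction.

```python
import mmap
import os
import struct


class RegisterWindow:
    """Maps a page-aligned region of a device file and exposes 32-bit accessors."""

    def __init__(self, path: str, base: int, span: int):
        if base % mmap.ALLOCATIONGRANULARITY != 0:
            raise ValueError("base must be aligned to the mmap granularity")
        self._fd = os.open(path, os.O_RDWR | os.O_SYNC)
        try:
            self._map = mmap.mmap(self._fd, span, flags=mmap.MAP_SHARED,
                                  prot=mmap.PROT_READ | mmap.PROT_WRITE,
                                  offset=base)
        except Exception:
            os.close(self._fd)
            raise

    def read32(self, offset: int) -> int:
        """Read a little-endian 32-bit word at a byte offset inside the window."""
        return struct.unpack_from("<I", self._map, offset)[0]

    def write32(self, offset: int, value: int) -> None:
        """Write a little-endian 32-bit word at a byte offset inside the window."""
        struct.pack_into("<I", self._map, offset, value & 0xFFFF_FFFF)

    def close(self) -> None:
        self._map.close()
        os.close(self._fd)
```

A call such as `RegisterWindow("/dev/mem", 0xC0000000, 0x1000)` would map the first 4 KB of the HPS-to-FPGA window, assuming the Cyclone V base address of 0xC0000000; the register offsets behind it depend entirely on the Qsys system.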