Altera_Forum
Honored Contributor
12 years ago
PCIe is kind of a bad example to work from for this topic, because there are multiple subsystems involved: everything within the FPGA vs. everything on the other side of the PCIe pins, i.e. the NIOS software and DMA controllers vs. the CPU and DMA controllers of (let's call it) a PC. All of those entities can read/write the "onchip_memory" component.
Why you would want to use one method vs. another depends on the specific task you are using that hardware design to perform. A typical use case would be having the FPGA transform or otherwise process some piece of data which originally sits in host CPU memory. You need to copy the data into the FPGA onchip_memory, command the operation to execute, and then retrieve the result.

As for why there are all the different methods, it depends on performance:

- You can tie up the host CPU / DMA controllers and perform all of the memory transfers via the BAR1_0 Avalon-MM Master port. This is "entry level" complexity and frowned upon for more demanding applications. You typically would rather leave the host CPU and DMA free for other tasks, if possible.
- You can have the host CPU command the Qsys "dma" DMA controller to process the memory transfers. The data would be transferred via the "txs" port shown in your diagram, and the FPGA would become the master of the data transfer in the PC. This arrangement is "better" than the previous one, since the host CPU is not tied up during the memory transfers and simply has to control the "dma" peripheral.
- Finally, the most complicated / highest performance arrangement would be for the external CPU to command the NIOS software to autonomously manage the entire task. The NIOS would then possibly set up a number of descriptors for the SGDMA to continuously stream inputs / outputs to and from the host CPU memory. The control activity on the PC might consist of issuing a "go" / "stop" command to the NIOS and maybe fielding interrupt notifications that the operations are completed.

Anyway, there is possibly an infinite number of arrangements / applications that you can think up. But getting back to my original reply, it all boils down to the performance / resources you want to dedicate to the task.
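To make the first, host-driven arrangement concrete, here is a minimal sketch of the "copy in, command, retrieve" sequence. It is a simulation only: the register offsets, the "done" bit, and the `fake_fpga` stand-in are all assumptions made up for illustration and do not come from the reference design. The BAR window is modelled as a `bytearray`; on a real system it would be whatever memory-mapped view your driver gives you.

```python
import struct

# Hypothetical register map inside the BAR window (illustrative offsets only,
# not taken from the reference design).
CTRL_OFFSET = 0x00     # host writes 1 here to start the operation
STATUS_OFFSET = 0x04   # bit 0 is set by the FPGA side when the result is ready
DATA_OFFSET = 0x100    # start of the shared data buffer in onchip_memory


def poke32(bar, offset, value):
    """Write a little-endian 32-bit word into the BAR window."""
    struct.pack_into("<I", bar, offset, value)


def peek32(bar, offset):
    """Read a little-endian 32-bit word from the BAR window."""
    return struct.unpack_from("<I", bar, offset)[0]


def run_job(bar, payload, device_step=None, max_polls=1000):
    """Copy payload in, command the operation, poll, and read the result back."""
    end = DATA_OFFSET + len(payload)
    bar[DATA_OFFSET:end] = payload          # 1. copy the data into the FPGA
    poke32(bar, STATUS_OFFSET, 0)           #    clear any stale "done" flag
    poke32(bar, CTRL_OFFSET, 1)             # 2. command the operation
    for _ in range(max_polls):              # 3. poll until the FPGA reports done
        if device_step is not None:
            device_step(bar, len(payload))  # simulation hook; real hardware runs on its own
        if peek32(bar, STATUS_OFFSET) & 1:
            return bytes(bar[DATA_OFFSET:end])  # 4. retrieve the result
    raise TimeoutError("FPGA did not report completion")


def fake_fpga(bar, length):
    """Stand-in for the FPGA logic: inverts every byte of the buffer once started."""
    if peek32(bar, CTRL_OFFSET) == 1:
        end = DATA_OFFSET + length
        bar[DATA_OFFSET:end] = bytes(b ^ 0xFF for b in bar[DATA_OFFSET:end])
        poke32(bar, CTRL_OFFSET, 0)
        poke32(bar, STATUS_OFFSET, 1)


bar = bytearray(0x200)  # simulated BAR window
result = run_job(bar, b"\x00\x01\x02\x03", device_step=fake_fpga)
print(result.hex())  # fffefdfc
```

Note that the host does every byte of the copying and all of the polling here, which is exactly why this approach ties up the host CPU.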
The reference design you are looking at is fairly flexible, and allows a single FPGA compilation to be used for an extended period of time while you learn the software development ins and outs of PCIe systems. You can start with x86 peek/poke to the onchip memory, buffer transfers, etc., and then end with mailbox IPC to the NIOS with otherwise autonomous DMA between memories.
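At the other end of that progression, the mailbox idea might look roughly like the sketch below. Everything in it is hypothetical: the command codes, the `Descriptor` fields, the 4096-byte chunk size, and the `NiosStandIn` class are invented to show the division of labour, not the actual SGDMA descriptor format or any real NIOS firmware. A real descriptor carries more fields than this.

```python
from dataclasses import dataclass

# Hypothetical mailbox command codes and chunk size (assumptions for illustration).
CMD_STOP = 0
CMD_GO = 1
CHUNK = 4096  # assumed maximum number of bytes moved per descriptor


@dataclass
class Descriptor:
    """Simplified transfer descriptor: source address, destination address, length."""
    src: int
    dst: int
    length: int


def build_descriptors(src, dst, total, chunk=CHUNK):
    """Split one large transfer into a list of chunk-sized descriptors."""
    descriptors = []
    offset = 0
    while offset < total:
        n = min(chunk, total - offset)
        descriptors.append(Descriptor(src + offset, dst + offset, n))
        offset += n
    return descriptors


class NiosStandIn:
    """Models the embedded side: reacts to mailbox commands from the host."""

    def __init__(self):
        self.running = False
        self.queue = []

    def on_mailbox(self, cmd, src=0, dst=0, total=0):
        if cmd == CMD_GO:
            # The embedded CPU, not the host, works out the individual transfers.
            self.queue = build_descriptors(src, dst, total)
            self.running = True
        elif cmd == CMD_STOP:
            self.queue = []
            self.running = False


nios = NiosStandIn()
nios.on_mailbox(CMD_GO, src=0x10000000, dst=0x0, total=10000)
print([d.length for d in nios.queue])  # [4096, 4096, 1808]
nios.on_mailbox(CMD_STOP)
```

The point of the split is that the host only ever sends one small command, while the bookkeeping for the individual transfers lives on the FPGA side.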