How to effectively implement PCIe transfer?
Hi,
I am using the PCIe IP core on a Stratix 10 device for the first time. While reading the "L-tile and H-tile Avalon Memory mapped Intel FPGA IP for PCI Express User Guide 21.1 (8.1. Read DMA Example)", I found that the following steps are needed to transfer data from the FPGA to software:
- Software allocates memory for Write Descriptor Status table and Write Descriptor Controller table in host memory.
- Program the Write Descriptor Controller table.
- Program the Write Descriptor Controller register "Write Status and Descriptor Base" with the starting address of the descriptor table.
- Program the Write Descriptor Controller "Write Descriptor FIFO Base" with the starting address of the on-chip write descriptor table FIFO.
- Program the Write Descriptor Controller register WR_DMA_LAST_PTR with the value.
- The host waits for the MSI interrupt. The Write Descriptor Controllers send MSI to the host after completing the last descriptor.
Based on my understanding, the software first allocates memory for the Write Descriptor Status table and the Write Descriptor Controller table. It then programs the descriptor table to define the transfer. The descriptor FIFO of the PCIe IP core in the FPGA fetches the descriptors from host memory and performs the transfer they describe. Finally, an MSI interrupt is sent to the software to indicate that the transfer is complete, and the lowest bit of the Descriptor Status is set to 1.
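The completion check at the end of that flow can be sketched in C. This is only an illustration: it assumes the status table is an array of 32-bit words in host memory, one word per descriptor, with 128 entries. The names `DESC_TABLE_ENTRIES` and `descriptor_done` are hypothetical, and the one-word-per-descriptor layout should be checked against the user guide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 128 status entries, one 32-bit word per descriptor. */
#define DESC_TABLE_ENTRIES 128u

/* Returns true once the lowest bit of the status word for descriptor `id`
 * is set. The table is volatile because the DMA engine, not the CPU,
 * writes to it. */
static bool descriptor_done(const volatile uint32_t *status_table, unsigned id)
{
    assert(id < DESC_TABLE_ENTRIES);
    return (status_table[id] & 1u) != 0;
}
```

After the MSI arrives, the host would call `descriptor_done(status_table, last_id)` for the last descriptor it queued.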
If I need to continuously transfer image or video data from DDR to the CPU through the FPGA, do I only need to use one or a few descriptors to complete the transfer? After the transfer determined by the descriptors is completed, the software rewrites the descriptors and repeats the previous operations. If so, I think the transfer will be intermittent, as the FPGA needs to wait for the software to program the descriptors before initiating the next transfer. Is there a better way to implement data transfer?
Hi Allen,
Thank you for reaching out.
From the steps given, I believe that you are looking into 8.2. Write DMA Example instead of 8.1. The two questions you've raised are:
1. Do I only need to use one or a few descriptors to complete the transfer?
2. Is there a better way to implement data transfer?
To address your questions, the answers below are based on version 21.1 of the user guide, as mentioned.
1. Whether one or a few descriptors are enough depends on the size of the transfer. In PCIe system memory, the read and write descriptors are stored in separate descriptor tables, and each table can store up to 128 descriptors. Each descriptor is 8 DW (32 bytes). Based on the descriptor format, the maximum transfer size per descriptor is (1 MB - 4 bytes). Hence, if you need to transfer more than (1 MB - 4 bytes) of data, you'll need more than one descriptor to complete the transfer. If the data fits within (1 MB - 4 bytes), one descriptor is enough. However, to avoid a possible overflow condition, allocate the memory needed for the number of descriptors supported by WR_TABLE_SIZE.
2. No. This is how the DMA data transfer works: the descriptor ID loops back to 0 after reaching WR_TABLE_SIZE.
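As a rough illustration of the limits in point 1, the C sketch below works out how many descriptors a transfer of a given size needs. The constants come from the figures quoted above (128 descriptors per table, 32 bytes per descriptor, 1 MB - 4 bytes per descriptor). The names `MAX_DESCRIPTORS`, `MAX_BYTES_PER_DESC`, `struct dma_descriptor` and `descriptors_needed` are hypothetical, and the descriptor is modelled as eight opaque DWORDs because the field layout is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DESCRIPTORS    128u              /* descriptors per table */
#define MAX_BYTES_PER_DESC ((1u << 20) - 4u) /* 1 MB - 4 bytes */

/* One descriptor is 8 DWORDs = 32 bytes. The fields are left opaque;
 * see the descriptor format in the user guide for the real layout. */
struct dma_descriptor {
    uint32_t dw[8];
};
_Static_assert(sizeof(struct dma_descriptor) == 32,
               "descriptor must be 32 bytes");

/* Number of descriptors needed to move total_bytes (rounds up). */
static size_t descriptors_needed(size_t total_bytes)
{
    return (total_bytes + MAX_BYTES_PER_DESC - 1u) / MAX_BYTES_PER_DESC;
}
```

Under these assumptions, a 1920 x 1080 frame at 4 bytes per pixel (8,294,400 bytes) needs 8 descriptors, and a full table of 128 descriptors occupies 4096 bytes of host memory.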
If you want to process more pointers than the WR_TABLE_SIZE, there are two steps that you must follow:
1. Process the pointers up to WR_TABLE_SIZE by writing the same value as in WR_TABLE_SIZE to WR_DMA_LAST_PTR.
2. Next, write the number of remaining descriptors to WR_DMA_LAST_PTR.
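The two steps above can be sketched as a small C helper that plans the values to write to WR_DMA_LAST_PTR. It follows the wording of the steps literally, so the second value is the number of remaining descriptors; whether the register expects a count or a zero-based ID is worth confirming in the user guide. The helper name `plan_last_ptr_writes`, its parameters, and the single-write case when no wrap is needed are assumptions for illustration, and the actual register writes are left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Plans the values to write to WR_DMA_LAST_PTR, in order.
 *   table_size : value held in WR_TABLE_SIZE
 *   last_ptr   : value most recently written to WR_DMA_LAST_PTR
 *   count      : number of further descriptors to process
 *   out        : receives one or two values to write
 * Returns the number of writes needed (1 or 2). */
static int plan_last_ptr_writes(uint32_t table_size, uint32_t last_ptr,
                                uint32_t count, uint32_t out[2])
{
    assert(last_ptr <= table_size);
    assert(count > 0);

    uint32_t room = table_size - last_ptr; /* descriptors left before the wrap */

    if (count <= room) {
        out[0] = last_ptr + count;         /* assumed: no wrap, one write */
        return 1;
    }
    out[0] = table_size;                   /* step 1: run up to WR_TABLE_SIZE */
    out[1] = count - room;                 /* step 2: remaining descriptors */
    return 2;
}
```

For example, with WR_TABLE_SIZE at 127 and the last pointer at 100, processing 40 more descriptors yields two writes: 127, then 13.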
User Guide: https://www.intel.com/content/www/us/en/docs/programmable/683667/21-1/introduction.html
I hope these address your questions well.
Thanks.
Best Regards,
VenTing_Intel