Forum Discussion
Altera_Forum
Honored Contributor
13 years ago

Okay, then …
A device-to-device transaction is in no way different from a device-to-main-memory transaction; only the preparation differs. Say you want to transfer data from FPGA1 to FPGA2.
- First you have to get hold of the PCI address assigned to FPGA2’s target BAR, say, BAR2. The BARs are typically visible only to the driver of FPGA2, so the address has to be communicated to the driver of FPGA1.
- This communication can be set up in user space or in kernel space, depending on the actual needs.
- As a result, driver1 knows the destination memory region BAR2 of FPGA2, and it communicates it to FPGA1, say, via MMIO register accesses. It could also set up a descriptor-like table in main memory or even on the FPGA2 device.
- A different approach would be to indicate the PCI ID of FPGA2 to FPGA1, and let the two devices exchange their BAR addresses via PCI Message TLPs with Device Routing. Once assigned by the driver, the BARs can be fetched from the Hard IP Configuration space.
- FPGA1 can now write data – even in big chunks – to FPGA2 by using the indicated addresses. This is done with standard Write TLPs, the same ones used when writing to main memory. You can, and probably should, use transfer-acceleration features like Relaxed Ordering for data transfers, and strict ordering for ‘commit’ messages, just as you do when writing data to main memory.
- BAR2 of FPGA2 might be cacheable, so be prepared for write-transaction combining and collapsing done by the IDT switch. Send your ‘commit’ messages to a non-cacheable BAR (like BAR0) of FPGA2 if this could harm the communication handshake.
- FPGA2 now receives large Write TLPs – something that doesn’t happen when only the CPU issues requests of at most 4 to 8 bytes.
- FPGA2 might want to respond in some way, semantically – remember, write TLPs are not followed by completions – so you need a custom method. If it is just some kind of simple acknowledgment, sending it to the driver and letting the user code handle this data direction might be sufficient. In the other case, where both directions should carry significant bandwidth, a BAR on FPGA1 might be indicated to FPGA2 as well. This BAR address exchange could be done in the same way as described above, using either driver-level or device-level setup.
- FPGA1 could even issue read requests to FPGA2 for large amounts of data, probably requiring multiple completions; compared to simple one-word CPU requests, the design might need changes for this to work reliably under all load conditions.
- Remember that PCI devices may come and go. Maintain good communication with the driver about the other FPGA’s presence state, and be prepared to integrate timeout mechanisms.