Yeah, with the streaming interface you have to work at the TLP level. We're currently using the Avalon-MM flavor of the hard IP block (Gen3 x4 endpoint) in an Arria 10 GX dev kit plugged into an Intel motherboard running Linux. The Avalon-MM interface relieves you of the TLP details, but we're finding that the latency of the MM interface is costing us throughput. We have the endpoint mastering the bus to stream images from a high-speed image sensor into system memory on the PC. We will likely end up going to the streaming interface to try to eliminate a couple of dead cycles per burst on the MM interface. Still have a ton to learn about PCIe. We have a PCIe bus analyzer coming in a week or two, and that should be very interesting to play with.
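
To put a rough number on what a couple of dead cycles per burst can cost, here's a back-of-the-envelope sketch. The figures are assumptions for illustration, not measurements from our design: a 128-bit (16-byte) application-side data path at 250 MHz, a 256-byte payload per burst, and 2 dead cycles between bursts. It only models the application-side interface and ignores TLP header overhead, flow control, and anything happening on the link itself.

```python
def burst_efficiency(data_cycles, dead_cycles):
    """Fraction of clock cycles that actually carry data."""
    return data_cycles / (data_cycles + dead_cycles)


def effective_throughput(bus_bytes, clock_hz, payload_bytes, dead_cycles):
    """Bytes per second moved when every burst is followed by dead cycles.

    bus_bytes     -- width of the data path in bytes (assumed value)
    clock_hz      -- interface clock in Hz (assumed value)
    payload_bytes -- bytes transferred per burst (assumed value)
    dead_cycles   -- idle cycles between consecutive bursts
    """
    # Cycles needed to move one payload, rounded up to whole cycles.
    data_cycles = -(-payload_bytes // bus_bytes)
    return payload_bytes * clock_hz / (data_cycles + dead_cycles)


if __name__ == "__main__":
    # Hypothetical numbers: 16-byte bus, 250 MHz, 256-byte bursts.
    ideal = effective_throughput(16, 250_000_000, 256, 0)
    actual = effective_throughput(16, 250_000_000, 256, 2)
    print(f"ideal:  {ideal / 1e9:.2f} GB/s")
    print(f"actual: {actual / 1e9:.2f} GB/s")
    print(f"efficiency: {burst_efficiency(16, 2):.1%}")
```

With those assumed numbers, each burst takes 16 data cycles, so 2 dead cycles means only 16 of every 18 cycles carry data. The shorter the burst, the more the same dead cycles hurt, which is why shaving them off looks worth the extra work of handling TLPs ourselves.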