Forum Discussion
Altera_Forum
Honored Contributor
13 years ago
About right - PCIe does allow multiple outstanding reads, but a processor is unlikely to generate them for normal memory accesses.
It is worth noting that PCIe models a 64-bit data bus; on the FPGA a 32-bit request ends up going through a bus width adapter and generating two 32-bit cycles, one of which has no asserted byte enables. So making the PCIe side of the FIFO 64-bit will remove a couple of clocks.

If your Linux host is an x86, you might find that SSE2 transfers generate a single TLP for 128 bits (and AVX ones for 256 bits, if supported by the CPU and the Linux version you are using). It is also possible that unrolling the loop might generate multiple concurrent read TLPs.

Neither Qsys nor SOPC Builder seems to let you alias addresses. I think it can be done by feeding all the Avalon signals through a 'conduit' and replacing the relevant address bits with zero. Being able to alias internal memory would be useful for software cyclic buffers; 9, 18, or 27 bit wide memory might also be useful for some things (e.g. lookup tables).

I (mostly) do software, but I know the PCIe stuff caused a certain amount of grief. There are some random, spurious constraints on the way PCIe slave windows get assigned to Avalon addresses.

I can do reads at about 21 ns/byte (using large TLPs and overlapped requests, with a system call overhead for each transfer). That still isn't 1 Gbit/s. That is from a small PPC; I have no idea how to generate long TLPs from any x86 CPU.
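
As a rough check on the 21 ns/byte figure above, here is a small C sketch of the unit conversion. The function names are made up for illustration, and it assumes decimal units (1 Gbit/s = 10^9 bit/s) and that the per-byte cost already includes the system call overhead.

```c
#include <assert.h>

/* Hypothetical helper: convert a measured cost in ns per byte
 * into throughput in Mbit/s (decimal megabits). */
static double ns_per_byte_to_mbit_per_s(double ns_per_byte)
{
    assert(ns_per_byte > 0.0);
    /* 8 bits per byte; bits per ns equals Gbit/s; x1000 gives Mbit/s. */
    return 8.0 / ns_per_byte * 1000.0;
}

/* Hypothetical helper: ns-per-byte budget needed to sustain
 * a given rate in Gbit/s. */
static double gbit_per_s_to_ns_per_byte(double gbit_per_s)
{
    assert(gbit_per_s > 0.0);
    return 8.0 / gbit_per_s;
}
```

Calling `ns_per_byte_to_mbit_per_s(21.0)` gives roughly 381 Mbit/s, and `gbit_per_s_to_ns_per_byte(1.0)` gives 8 ns/byte, so the reads would need to be a bit over 2.5 times faster to reach 1 Gbit/s.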