Hi Mulligan.
1. I have no experience with Quartus 14.1. It may be better to load 13.1 if that works better with the examples you have; otherwise contact Altera or post in the forums section for tools.
2. I checked the S5-PCIe-HQ and it looks like quite a nice card, since it supports a variety of Stratix devices and runs at up to Gen3 speed.
3. On the basics of how to communicate host -> endpoint and endpoint -> host memory, let me point you to a .pdf that I believe does a fairly good job of describing both directions. I have attached images of the figures that I found useful.
Search for this .pdf: "IP Compiler for PCI Express User Guide" by Altera. The attached images are from that document.
Ideally, start with an example design and expand from there.
In summary, to transfer data from your host system (RC) to the endpoint, the host must first discover the endpoint and assign system memory address regions to its BARs; this is in place by the time your device driver is installed and claims the device. From that point on, memory writes / reads to the assigned address regions will be claimed by the endpoint BAR comparators and delivered to some slave device behind the endpoint PCIe core.
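To make the BAR comparator idea concrete, here is a minimal Python model of the decode step. It is only a sketch: the Bar type, the bar_decode function, and the base addresses and sizes are made up for illustration, and the real comparison is done in hardware inside the PCIe core.

```python
from typing import List, NamedTuple, Optional, Tuple

class Bar(NamedTuple):
    index: int  # BAR number in the endpoint's config space
    base: int   # address the host assigned to this BAR (hypothetical value)
    size: int   # size of the region in bytes (a power of two)

def bar_decode(bars: List[Bar], addr: int) -> Optional[Tuple[int, int]]:
    """Return (BAR index, offset into that BAR) if addr hits a BAR, else None."""
    for bar in bars:
        if bar.base <= addr < bar.base + bar.size:
            return bar.index, addr - bar.base
    return None  # not claimed by this endpoint

# Hypothetical assignment: BAR0 = 64 KiB of registers, BAR2 = 1 MiB of memory.
bars = [Bar(0, 0xF000_0000, 0x1_0000), Bar(2, 0xF800_0000, 0x10_0000)]
hit = bar_decode(bars, 0xF000_0010)   # lands in BAR0 at offset 0x10
miss = bar_decode(bars, 0xF810_0000)  # one byte past the end of BAR2
```

The offset that comes out of the decode is what the slave behind the core sees as its local address; the host-side address itself is only meaningful on the PCIe side.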
Conversely, for the endpoint card to DMA data to the host system, there is a translation mechanism that translates an internal address to a 32 or 64 bit PCIe address, and the write or read transaction ends up at the host system as a memory write or read. Normally the device driver would manage the DMA buffer allocation in the host system (Linux), where the physical buffer address is known by the endpoint card for the DMA and the corresponding virtual address is known by the process running under Linux.
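As a rough illustration of that translation step, the Python sketch below models a simple page-table style translator: the upper bits of the internal address pick a table entry and the lower bits pass through unchanged. The 1 MiB page size, the table contents, and the function names are my assumptions for the example, so check the user guide for how your core is actually configured.

```python
from typing import List

PAGE_BITS = 20  # assumed page size of 1 MiB; the real value is a core setting

def translate(table: List[int], local_addr: int, page_bits: int = PAGE_BITS) -> int:
    """Map an internal (endpoint-side) address to a PCIe address."""
    page = local_addr >> page_bits                  # upper bits select the entry
    offset = local_addr & ((1 << page_bits) - 1)    # lower bits pass through
    return table[page] | offset                     # entries are page aligned

def needs_64bit(pcie_addr: int) -> bool:
    """True if the address lies above 4 GiB and so needs 64 bit addressing."""
    return (pcie_addr >> 32) != 0

# Hypothetical table: the driver writes the physical addresses of its
# DMA buffers here after allocating them on the host.
table = [0x1_2340_0000, 0x7FF0_0000]

high = translate(table, 0x0000_0040)  # page 0, offset 0x40
low = translate(table, 0x0010_0008)   # page 1, offset 0x08
```

In this picture, the device driver's job is to allocate the host buffer and load its physical address into the table; the endpoint's DMA engine then only ever deals with its own internal addresses.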
On the subject of an endpoint communicating with another PCIe endpoint, that would involve a PCIe switch, and I am not familiar with systems (with PCIe slots) that support endpoint to endpoint connections via a PCIe switch on the host system board ... however, they may exist.
Best Regards, Bob.