--- Quote Start ---
1) It is Cyclone IV GX speed grade 7, I modified the design in
http://www.alterawiki.com/wiki/pci_express_in_qsys_example_designs changed the DDR2 to our 128 M 16 bit DDR2 SDRAM.
--- Quote End ---
Ok.
--- Quote Start ---
2) The memory is to be used for storing some parameters/commands for signal processing
--- Quote End ---
Why use the DDR memory for this? It would make more sense to me to use on-chip memory for parameters that the DSP logic will be using. Of course, that assumes there are only a few parameters.
--- Quote Start ---
3) There is a lot of RF data from ADCs for the FPGA to process
--- Quote End ---
That data should be going directly into the DSP logic. What actually needs to be stored in the DDR? A power spectrum? A cross-correlation?
--- Quote Start ---
CPU needs to send commands/parameters
--- Quote End ---
Again, this likely should go to on-chip RAM and registers.
--- Quote Start ---
also obtains processed data storage in DDR2 SDRAM from PCIe. I don't know if CPU needs to see that memory or not, if it does not need to see that memory, how to realize ? to get the data from DDR2 SDRAM?
--- Quote End ---
You will never want to use the host CPU to transfer anything but a few simple parameters. A CPU issuing individual read or write commands to a PCIe device is slow. It's fine for setting up a few registers or initializing a DMA controller, but it is too slow for moving bulk data.
You need to talk to a device driver developer. They will explain that devices that transfer data, e.g., network cards, video cards, and data processing cards, do not use the CPU to move the data; they use a DMA controller on each of those cards.
You should design the hardware to match the requirements of the device driver developer.
--- Quote Start ---
4) FPGA processes the ADC data,CPU sends commands/parameters, access data in on-chip memory or in DDR2 SDRAM. displays processed data for real-time (30frams/s) images on screen.
--- Quote End ---
30 frames per second of what? An HD image, or a 1024-point power spectrum? This data defines your sustained data rate from your board to the CPU. Calculate it. This is your design target!!!
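As a rough sketch of that calculation, here are the two cases mentioned above. The sample sizes (4 bytes per spectrum point, 2 bytes per pixel, a 1920x1080 frame) are my assumptions for illustration only; substitute your real numbers.

```python
def sustained_rate_bytes_per_s(elements_per_frame, bytes_per_element, frames_per_s):
    """Sustained payload rate from the board to the host, in bytes per second."""
    return elements_per_frame * bytes_per_element * frames_per_s


FPS = 30  # frame rate from the original question

# Assumed case 1: 1024-point power spectrum, 4 bytes per point.
spectrum_rate = sustained_rate_bytes_per_s(1024, 4, FPS)

# Assumed case 2: 1920x1080 image, 2 bytes per pixel.
hd_rate = sustained_rate_bytes_per_s(1920 * 1080, 2, FPS)

print(f"spectrum: {spectrum_rate / 1e3:.1f} kB/s")
print(f"HD image: {hd_rate / 1e6:.1f} MB/s")
```

With these assumed sizes the spectrum case is about 123 kB/s and the HD image case is about 124 MB/s, roughly three orders of magnitude apart, which is why the answer to "30 frames of what?" matters so much.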
If your data rate is low, perhaps a simple CPU-based read of that data will be sufficient. However, in most applications it would not be, or it would be a waste of CPU time, and DMA will be your only option. If you do need DMA, then you can discuss with your device driver developer whether or not the Altera DMA controller has sufficient functionality for your requirements.
For example, in the Qsys PCIe example I sent a link to, the PCIe bridge can be configured with a 1MB outgoing translation window. If your device driver developer can guarantee that the host data used for DMA is located in a single 1MB region, then you can just use that DMA controller directly.
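To illustrate the "single 1MB region" condition, here is a minimal sketch of the check a driver developer might do on a host buffer. It assumes the translation window is naturally aligned to its 1MB size, which you should confirm against the PCIe core documentation; the function name and the example addresses are hypothetical.

```python
WINDOW_SIZE = 1 << 20  # 1MB translation window (assumed naturally aligned)


def fits_in_window(bus_addr, length, window_size=WINDOW_SIZE):
    """Return True if [bus_addr, bus_addr + length) lies inside one aligned window."""
    if length <= 0 or length > window_size:
        return False
    first_window = bus_addr // window_size
    last_window = (bus_addr + length - 1) // window_size
    return first_window == last_window


# Hypothetical host bus addresses, for illustration only.
print(fits_in_window(0x10000000, 0x100000))  # exactly one aligned 1MB window
print(fits_in_window(0x100FFFF0, 0x20))      # straddles a 1MB boundary
```

A buffer that fails this check would need either more than one translation entry or a DMA controller that can work from a list of segments.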
Cheers,
Dave