Forum Discussion

Altera_Forum
Honored Contributor
12 years ago

Need Advice - First time PCIe device development + QSYS

I appreciate any advice/reading materials you can suggest to clarify my issues.

I am developing a design for a Stratix V-based PCIe card. The card itself was designed by another company, but they have provided all of the specs for the external DDR and PCIe interfaces. The card has a PCIe 3.0-compatible interface and two external DDR3 banks.

I am trying to use QSYS to instantiate the PCIe hard IP core for Stratix V, and have some questions.

q1:

For the PCIe core instance I am using the Avalon-MM interface. I configured two BARs, 0 and 1, as 32-bit non-prefetchable memory. BAR0 I am reserving for the control and status register (CSR) hookups. BAR1 I am using for the scatter-gather DMA controller descriptor table memory. Is this a good design choice?

Is there a typical design pattern for choosing how to assign each BAR?

q2:

I originally started my design using the Avalon-ST Stratix V HIP PCIe core, but ran into some issues with the Stream-Memory SGDMA controller IP. The symbol size of the PCIe core's streaming output was 128 bits, while the SGDMA's was a single byte. I am assuming that in streaming mode the ST interface delivers each entire TLP on the streaming interface. In that case, I was thinking of chaining a DC FIFO -> packet-to-byte IP -> SGDMA controller. Was I thinking along the right lines?
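To make the width mismatch concrete, here is a minimal sketch of what a 128-bit-to-byte adapter does to one streaming beat. The byte order within the 128-bit symbol is an assumption (most significant byte first); the function name `word_to_bytes` is made up for illustration and is not part of any Altera IP.

```python
def word_to_bytes(word: int, width_bits: int = 128, msb_first: bool = True) -> bytes:
    """Split one wide streaming symbol into byte-sized symbols.

    msb_first=True is an assumed ordering: the first byte emitted is the
    most significant byte of the wide word.
    """
    return word.to_bytes(width_bits // 8, "big" if msb_first else "little")


beat = 0x00112233445566778899AABBCCDDEEFF
print(len(word_to_bytes(beat)))   # 16
print(word_to_bytes(beat).hex())  # 00112233445566778899aabbccddeeff
```

One 128-bit beat becomes 16 byte-wide beats, so the byte side has to run 16 times as many transfers to keep up with the wide side.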

q3:

I actually think this is more of a bug in QSYS 13.0sp1. I generated my synthesis files as VHDL, and during compilation there was an error from the altera_pcie_sv_hip_avmm.vhd file complaining about the direction of tlbfm_out. The VHDL component declaration declared it as an output, while the underlying Verilog module used that signal as an input. I am assuming this is either a bug, or I somehow connected a signal incorrectly.

Thank you for your help.

9 Replies

  • Altera_Forum
    Honored Contributor

    Sorry for the double post. One more question:

    For the scatter-gather DMA controller in QSYS, it appears that the address buses for the m_read/m_write ports are limited to 32 bits, although the documentation says it is capable of 64-bit addressing in the descriptors.

    I would like to map the DMA controller to two 8GB memories and the PCIe core TXS port, but I am running into an issue with the address decoding (since the decoding requires more than 32 bits). What am I doing wrong? Do I need to add an Avalon-MM multiplexer and select between banks?
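A quick sketch of the arithmetic behind the decoding problem, assuming "8GB" means 8 GiB and that the two banks are mapped back to back (the helper name `addr_bits_needed` is illustrative only):

```python
GIB = 1 << 30


def addr_bits_needed(span_bytes: int) -> int:
    """Smallest byte-address width (in bits) that can reach span_bytes bytes."""
    return (span_bytes - 1).bit_length()


one_bank = 8 * GIB
two_banks = 2 * one_bank

print(addr_bits_needed(one_bank))   # 33
print(addr_bits_needed(two_banks))  # 34
```

Under these assumptions even a single bank needs 33 address bits, so a 32-bit master port cannot reach all of it with a flat mapping, before the TXS window is added on top.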

    Thank you.
  • Altera_Forum
    Honored Contributor

    I wrote up my impressions of the Qsys PCIe interface in this thread:

    http://www.alteraforum.com/forum/showthread.php?t=35678

    The Qsys component is not very useful. A 32-bit Avalon-MM target is used to generate PCIe transactions. A 64-bit PCIe address is constructed using the 32-bit Avalon-MM address as an offset, and the upper 32 bits are filled in by a register setting. I feel this is a very limited solution, given that host memory pages could be located anywhere in 64-bit PCIe space.

    If you want a "real" PCIe initiator implementation in the FPGA, you will either need to implement your own Qsys interface, or look around and see if there is a third-party IP core that supports what you want. If you do look at third-party IP, make sure they have a decent PCIe BFM so that you can create testbenches.

    This is just my impression ... others can comment if they've had a better experience with Altera's PCIe offerings.

    Cheers,

    Dave
  • Altera_Forum
    Honored Contributor

    Hi Evgeni,

    Just to confirm my understanding, since you've been using this core: in your address translation settings, you've got a 24-bit address (16MB), so you can DMA from your Qsys 32-bit address space to a 16MB window in PCIe space. The MSBs of the 64-bit address are set by another register in the Qsys PCIe component.

    How do you deal with the "real world" of host memory addresses that have arbitrary 64-bit addresses?

    In the University Program PCIe examples I looked at, they "cheated" by having a 2GB window, and then using a memory allocation scheme on the host PC, where memory pages were restricted to lie below 2GB on the host (or at least within a fixed 2GB window).
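The fixed-window restriction described above can be sketched as a simple check on a list of host page addresses. This is only an illustration: the 31-bit (2 GiB) window size comes from the example in the post, while the function name and the sample addresses are made up, and the check assumes page-aligned pages so that no page straddles a window boundary.

```python
WINDOW_BITS = 31  # 2 GiB window, as in the University Program example


def pages_fit_one_window(page_addrs, window_bits=WINDOW_BITS):
    """True if every page address has the same upper bits, i.e. all pages
    lie inside one aligned window of 2**window_bits bytes."""
    upper = {addr >> window_bits for addr in page_addrs}
    return len(upper) <= 1


low_pages = [0x0010_0000, 0x7FF0_0000]    # both below 2 GiB
scattered = [0x0010_0000, 0x1_2345_6000]  # second page is above 4 GiB

print(pages_fit_one_window(low_pages))  # True
print(pages_fit_one_window(scattered))  # False
```

A host allocator that cannot guarantee the first case is exactly the "real world" situation the question above is about.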

    A "real" PCIe master/initiator scatter-gather DMA controller should be able to generate an arbitrary 64-bit address for each transaction in the scatter-gather DMA list. You can create such an interface with Altera's IP cores; however, you have to use the lower-level streaming PCIe core rather than the Qsys core.

    Cheers,

    Dave
  • Altera_Forum
    Honored Contributor

    Hi Dave,

    Just to clarify that we're talking about Endpoint, not PCIe Root device.

    In this particular design the PCIe core is an Avalon-MM master. The 24-bit address is calculated automatically by the Qsys system builder; it probably looks at the highest address of all connected Avalon-MM slaves.

    I haven't had a chance to deal with the case you described. The "worst case" design is a 64-bit PCIe Endpoint connected to an embedded processor (PCIe Root) + DMA, so we can control the memory allocation scheme on that processor.

    Thanks,

    Evgeni
  • Altera_Forum
    Honored Contributor

    Hi Evgeni,

    --- Quote Start ---

    Just to clarify that we're talking about Endpoint, not PCIe Root device.

    --- Quote End ---

    Yes, the use case I was considering was an "intelligent" PCIe peripheral device that needs to perform high-performance transactions between itself and a host (root complex). Since high performance is required, the peripheral device needs to support being a bus master/initiator and perform burst transactions over the PCIe bus. Typically such an interface is implemented using a DMA controller that has one address in whatever is native for the peripheral logic, e.g., 32-bit Avalon-MM addresses, another address for the 64-bit PCIe bus, and a direction bit indicating whether the burst is to or from the PCIe bus.
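A descriptor of the kind described above could be modelled roughly as follows. The field order, the little-endian packing, the 32-bit length field, and the use of flag bit 0 for direction are all assumptions made for this sketch; they do not describe the layout of any particular Altera or third-party DMA controller.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout, little-endian, no padding:
#   32-bit local (Avalon-MM) address, 64-bit PCIe address,
#   32-bit length in bytes, 32-bit flags (bit 0 = direction).
_FMT = "<IQII"
DIR_TO_PCIE = 1
DIR_FROM_PCIE = 0


@dataclass
class DmaDescriptor:
    local_addr: int  # address on the peripheral's native bus
    pcie_addr: int   # arbitrary 64-bit PCIe (host) address
    length: int      # burst length in bytes (assumed field)
    to_pcie: bool    # True: local -> PCIe, False: PCIe -> local

    def pack(self) -> bytes:
        flags = DIR_TO_PCIE if self.to_pcie else DIR_FROM_PCIE
        return struct.pack(_FMT, self.local_addr, self.pcie_addr, self.length, flags)

    @classmethod
    def unpack(cls, raw: bytes) -> "DmaDescriptor":
        local_addr, pcie_addr, length, flags = struct.unpack(_FMT, raw)
        return cls(local_addr, pcie_addr, length, bool(flags & 1))


desc = DmaDescriptor(0x0000_1000, 0x1_2345_6000, 4096, True)
raw = desc.pack()
print(len(raw))                            # 20
print(DmaDescriptor.unpack(raw) == desc)   # True
```

The point of the sketch is that the full 64-bit PCIe address travels with each descriptor, so consecutive entries in a scatter-gather list can target unrelated host pages.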

    --- Quote Start ---

    In this particular design the PCIe core is an Avalon-MM master. The 24-bit address is calculated automatically by the Qsys system builder; it probably looks at the highest address of all connected Avalon-MM slaves.

    --- Quote End ---

    Sorry, this comment is not clear. "PCIe is Avalon-MM master" implies that the peripheral board is a PCIe target, where the PCIe transactions are converted by the Qsys PCIe bridge into Avalon-MM transactions. In this case the host accesses the board via the BAR registers, and the 24-bit address translation window discussed above has nothing to do with the transaction.

    The 24-bit address above configures the width of the TXS port (Avalon-MM slave, PCIe master). Avalon-MM accesses to that 16MB range are translated into 64-bit PCIe accesses, where the 24 LSBs are the same as the Avalon-MM address bits, but the (64 - 24) = 40 MSBs are determined by a pseudo-static register setting in the Qsys PCIe bridge control registers. The 40 MSBs are pseudo-static in that an SGDMA controller on the Avalon-MM bus cannot easily change the 64-bit PCIe addresses (the scatter-gather list would require 'extra' entries to write a new 40-bit value to change the PCIe MSBs).
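The translation described above amounts to concatenating a register value with the Avalon-MM offset. A minimal model, with made-up function and parameter names (`txs_to_pcie`, `upper_bits_reg`) and no claim about how the bridge registers are actually laid out:

```python
def txs_to_pcie(avmm_addr: int, upper_bits_reg: int, window_bits: int = 24) -> int:
    """Model of the window translation: PCIe address = {upper bits, offset}.

    avmm_addr      -- offset inside the TXS window (must fit in window_bits)
    upper_bits_reg -- value of the pseudo-static register holding the MSBs
    """
    if avmm_addr >> window_bits:
        raise ValueError("Avalon-MM address is outside the TXS window")
    if upper_bits_reg >> (64 - window_bits):
        raise ValueError("upper bits do not fit in a 64-bit PCIe address")
    return (upper_bits_reg << window_bits) | avmm_addr


def split_pcie_addr(pcie_addr: int, window_bits: int = 24):
    """Inverse: the register value and window offset needed to reach pcie_addr."""
    return pcie_addr >> window_bits, pcie_addr & ((1 << window_bits) - 1)


addr = txs_to_pcie(0xBCDEF0, 0x12_3456_789A)
print(hex(addr))  # 0x123456789abcdef0
```

Calling `split_pcie_addr` on each entry of a scatter-gather list shows how often the upper-bits value changes, which is how many extra register writes the list would need.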

    --- Quote Start ---

    I haven't had a chance to deal with the case you described. The "worst case" design is a 64-bit PCIe Endpoint connected to an embedded processor (PCIe Root) + DMA, so we can control the memory allocation scheme on that processor.

    --- Quote End ---

    In what way is this a worst-case? (just so that I understand)

    Cheers,

    Dave
  • Altera_Forum
    Honored Contributor

    Hi Dave,

    There is a PCIe core that serves as an Avalon-MM master. The user specifies, for example, a 64 MByte BAR, i.e. 26 bits. But I found that Qsys recalculates this to a lower value based on the actual addresses of the connected Avalon slaves, for example 24 bits.
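The two numbers in this post can be reproduced with a short sketch. The first helper is plain arithmetic; the second encodes only a guess at the rule Qsys applies (cover the highest end address of the connected slaves), and the slave map used in the example is invented.

```python
def bar_size_to_bits(size_bytes: int) -> int:
    """Address bits for a power-of-two BAR size, e.g. 64 MiB -> 26."""
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("BAR size must be a power of two")
    return size_bytes.bit_length() - 1


def bits_for_slaves(slaves) -> int:
    """Assumed rule: enough bits to reach the highest end address.

    slaves -- iterable of (base_address, span_in_bytes) pairs
    """
    highest_end = max(base + span for base, span in slaves)
    return (highest_end - 1).bit_length()


MIB = 1 << 20
print(bar_size_to_bits(64 * MIB))  # 26

# Hypothetical slave map: a 4 KiB CSR block at 0 and an 8 MiB memory at 8 MiB.
print(bits_for_slaves([(0x0, 0x1000), (0x80_0000, 0x80_0000)]))  # 24
```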

    By "worst case" I meant the most complex PCIe configuration of designs I work with.

    Thanks,

    Evgeni
  • Altera_Forum
    Honored Contributor

    Hi Evgeni,

    --- Quote Start ---

    There is a PCIe core that serves as an Avalon-MM master. The user specifies, for example, a 64 MByte BAR, i.e. 26 bits. But I found that Qsys recalculates this to a lower value based on the actual addresses of the connected Avalon slaves, for example 24 bits.

    --- Quote End ---

    OK, so your application uses the Qsys PCIe core as a target device. My "complaints" relate to its shortcomings as a PCIe master :)

    --- Quote Start ---

    By "worst case" I meant the most complex PCIe configuration of designs I work with.

    --- Quote End ---

    That's clearer, thanks!

    Cheers,

    Dave
  • Altera_Forum
    Honored Contributor

    Hi Evgeni,

    I saw your JPGs, but I have some questions. Why don't you use DMA? And how do you set up the TLPs for a host PC running a 64-bit OS (Linux)? I'd like to base my development on the reference design under Qsys, but modifying its TLPs is still a mystery to me.

    Thanks.
  • Altera_Forum
    Honored Contributor

    Hi,

    I didn't need to use DMA - my designs just do a lot of register accesses.

    Thanks,

    Evgeni