Forum Discussion

MichaelB (Occasional Contributor)
4 years ago

Avalon to AXI implementation

Hi,

currently I'm considering implementing my own Avalon <-> AXI4 (MM) adapter instead of using the QSYS autogenerated adapter.

We are using an AXI4 DMA that streams data into the DDR4.

Because the DDR4 uses an Avalon interface, I have already tried the autogenerated converter, but it is too slow to support our data rates.

Even with pending transactions set to 64 and a burst size of 16 we are not able to achieve a data rate > 3 Gbps (the AXI side uses burst size 16 as well).

The DDR4 (1600 MHz) Avalon interface runs at 200 MHz @ 512 bit and the DMA at 160 MHz @ 256 bit.

Since the DMA in the 160 MHz domain gets overflows, I can be sure the transaction handling is the issue (I did not expect this, because the receiving side has a higher frequency and double the data width).
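
Just to put rough numbers on this (raw bus bandwidth, ignoring any protocol overhead):

\[
\begin{aligned}
160\ \text{MHz} \times 256\ \text{bit} &\approx 41\ \text{Gbps (DMA / AXI side)}\\
200\ \text{MHz} \times 512\ \text{bit} &\approx 102\ \text{Gbps (DDR4 Avalon side)}\\
3\ \text{Gbps} \,/\, 41\ \text{Gbps} &\approx 7\%\ \text{utilisation of the AXI side}
\end{aligned}
\]

So the converter sustains only a small fraction of what either bus could carry.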

We had the same issues for AXI DMA <-> AXI HBM2 as well.

There we implemented our own AXI <-> AXI connection, which supports our required data rates much better than the autogenerated AXI converter (verified in simulation and on the FPGA).

We already had several debug sessions with Premier Support about this; the final result was to NOT use the autogenerated adapter and to use our own...

Now with DDR4 we are facing the same throughput limitations again (the AXI <-> Avalon conversion is the bottleneck).

Could you give me advice on implementing this conversion?
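
To make the question more concrete, this is roughly the direction I had in mind for the write path. It is a heavily simplified sketch (same data width on both sides, INCR bursts only, a single outstanding burst, no wstrb/byteenable handling, write response generated locally) and all names are placeholders:

    // Illustrative sketch only - not a complete adapter.
    module axi2avmm_wr_bridge #(
      parameter DATA_W = 256,
      parameter ADDR_W = 32
    )(
      input  logic                clk,
      input  logic                rst_n,

      // AXI4 slave, write channels only (minimal subset)
      input  logic [ADDR_W-1:0]   s_axi_awaddr,
      input  logic [7:0]          s_axi_awlen,    // beats - 1
      input  logic                s_axi_awvalid,
      output logic                s_axi_awready,
      input  logic [DATA_W-1:0]   s_axi_wdata,
      input  logic                s_axi_wvalid,
      input  logic                s_axi_wlast,
      output logic                s_axi_wready,
      output logic [1:0]          s_axi_bresp,
      output logic                s_axi_bvalid,
      input  logic                s_axi_bready,

      // Avalon-MM bursting write master
      output logic [ADDR_W-1:0]   avm_address,
      output logic [8:0]          avm_burstcount,
      output logic [DATA_W-1:0]   avm_writedata,
      output logic                avm_write,
      input  logic                avm_waitrequest
    );

      logic              busy;     // a write burst is in flight
      logic [ADDR_W-1:0] addr_q;
      logic [8:0]        burst_q;

      // Accept a new AW only when idle and the previous response was taken.
      assign s_axi_awready = !busy && !s_axi_bvalid;

      always_ff @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
          busy         <= 1'b0;
          s_axi_bvalid <= 1'b0;
        end else begin
          if (s_axi_bvalid && s_axi_bready)
            s_axi_bvalid <= 1'b0;
          if (s_axi_awvalid && s_axi_awready) begin
            addr_q  <= s_axi_awaddr;
            burst_q <= {1'b0, s_axi_awlen} + 9'd1;  // awlen is beats-1
            busy    <= 1'b1;
          end
          // Last beat accepted on the Avalon side -> burst done, issue OKAY.
          if (busy && s_axi_wvalid && s_axi_wlast && !avm_waitrequest) begin
            busy         <= 1'b0;
            s_axi_bvalid <= 1'b1;
          end
        end
      end

      // Each accepted AXI W beat becomes one Avalon write beat.
      // Address/burstcount stay constant for the whole Avalon burst.
      assign avm_address    = addr_q;
      assign avm_burstcount = burst_q;
      assign avm_writedata  = s_axi_wdata;
      assign avm_write      = busy && s_axi_wvalid;
      assign s_axi_wready   = busy && !avm_waitrequest;

      assign s_axi_bresp = 2'b00;  // always OKAY in this sketch

    endmodule

For the real design I would of course still need the data width conversion (256b -> 512b), outstanding transaction handling and the read path.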

Are there any data sheets which already describe the adapter autogenerated by QSYS?

Furthermore, I saw that I can edit the maximum pending read transactions on the DDR4 EMIF core and on the Avalon Clock Crossing Bridges, but not the maximum pending write transactions. Is there a reason why I cannot edit these parameters in those IP cores?

Kind regards,

Michael

14 Replies

  • Hi @MichaelB

    Sorry for the delay in response.

    You may check out the User Guide below for more information on the adapter autogenerated by Platform Designer.

    https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/ug/ug-qpp-platform-designer.pdf

    Could you share a screenshot of the DDR4 EMIF core & Avalon Clock Crossing Bridges that shows you can edit the maximum pending read transactions but not the maximum pending write transactions?

    Best Regards,
    Richard Tan

    p/s: If any answer from the community or Intel support is helpful, please feel free to give Kudos.

    • MichaelB (Occasional Contributor)

      Hi Richard,

      thanks for your reply!

      In the EMIF DDR4 & Avalon CCB settings I'd like to increase pending writes (CCB Avalon_M will write data to Avalon_S of DDR4):

      AXI_M (write) -> Avalon CCB -> DDR4

      AXI burst length = 16 (data width = 256) and Avalon burst length = 8 (data width = 512), based on my calculation:

      16 * 256 = 8 * 512 = 4096 bit per burst

      Will the interconnect resolve data width conversion and align the bursts?

      Screenshots of CCB & EMIF:

      (Screenshots: EMIF DDR4 parameters, EMIF DDR4 AVM settings, Avalon CCB AVM settings, Avalon CCB parameters)

      Let me know if you need further information!

      Best regards,

      Michael

  • Hi @MichaelB

    From what I found, the DDR4 EMIF core and the Avalon-MM Clock Crossing Bridge do not seem to support maximum pending write transactions. The interface must have both the response and writeresponsevalid signals, which these IPs do not have. You may create a custom component, though, by adding the respective signals. FYI, maximum pending read transactions requires the readdatavalid signal.
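
    Roughly, a wrapper component's Avalon-MM slave would need a port list along these lines before those parameters become available (an illustrative SystemVerilog sketch only; the module name and parameter values are placeholders, not taken from the EMIF IP):

      module avmm_slave_with_response #(
        parameter DATA_W = 512,
        parameter ADDR_W = 27
      )(
        input  logic              clk,
        input  logic              reset,
        // standard Avalon-MM slave signals
        input  logic [ADDR_W-1:0] address,
        input  logic [8:0]        burstcount,
        input  logic [DATA_W-1:0] writedata,
        input  logic              write,
        input  logic              read,
        output logic              waitrequest,
        output logic [DATA_W-1:0] readdata,
        // needed for "maximum pending read transactions"
        output logic              readdatavalid,
        // needed for "maximum pending write transactions"
        output logic [1:0]        response,
        output logic              writeresponsevalid
      );
        // Body omitted in this sketch: forward the transactions to the EMIF
        // Avalon-MM port and pulse writeresponsevalid (response = OKAY) once
        // the final beat of each write burst has been accepted.
      endmodule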

    "Will the interconnect resolve data width conversion and align the bursts?"

    Make sure there are no errors in the system messages; Platform Designer should take care of most of the interconnect between the interfaces.

    You may check chapter 5.1, Memory-Mapped Interfaces, for further details:

    https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/ug/ug-qpp-platform-designer.pdf#page=208

    If you have further questions on EMIF, I would recommend opening a new forum case for EMIF-related questions, as I am unfortunately not an expert in EMIF.

    Best Regards,
    Richard Tan

    p/s: If any answer from the community or Intel support is helpful, please feel free to give Kudos.

    • MichaelB (Occasional Contributor)

      Hi Richard,

      yes, I noticed this too - some signals are missing on the EMIF core and the Avalon CCB.

      • Is there an option in the Avalon CCB & EMIF core to enable those?
      • How can I create a custom component for a standard IP? Would this be a custom component instantiating the EMIF core?

      I tried to edit the interface of those IP cores in the component section of QSYS, but I cannot add further signals.

      I already read through the EMIF user guide (https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/hb/stratix-10/ug-s10-emi.pdf) but I could not find any setting to enable/disable pending write transactions.

      Currently I am using the autogenerated interconnect from QSYS to resolve 256b AXI (BL = 16) to 512b Avalon (BL = 8).

      Here I did not see any errors in QSYS, only the hint that an Avalon adapter will be inserted between AXI <-> Avalon.

      Are there any special settings of the mm_interconnect I have to configure to do the bus conversion?

      From the Platform Designer documentation I assumed this would be done by the mm_interconnect automatically.

      Do you have any reference design (QSYS) where an AXI <-> Avalon connection with bit-width conversion + burst conversion is done?

      That would be helpful to understand the settings on both sides to align them for the best throughput performance.

      Kind regards,

      Michael

  • MichaelB (Occasional Contributor)

    Hi Richard,

    yes, I won't use the outstanding write transactions, since they are not supported by the EMIF core anyway.

    Furthermore, I don't think it is very beneficial to replace the standard component with a custom one, since outstanding transactions are not supported by the EMIF core itself anyway.

    Would you recommend using an explicit CCB, or the CCB autogenerated by the mm_interconnect between two Pipeline Bridges?

    Here I would then configure it without any outstanding transactions and only define the burst size.

    Would this be a valid design for high throughput from AXI master to DDR?

    Again, this is the main reason why I opened this thread.

    With > 3 Gbps I get overflows on the AXI master side - here I'm running at 160 MHz @ 256b and don't know why I have overflows.
    The DDR is running at 200 MHz @ 512b, so it doesn't make sense to me that I get overflows - we faced such issues previously, and we verified in simulation that the Avalon <-> AXI mm_interconnect does not respond fast enough with a valid indication.

    Furthermore, we figured out that doing the protocol conversion and the CDC in one step (AXI 160 MHz @ 256b <-> Avalon 200 MHz @ 512b) is even worse in throughput than doing the protocol conversion first and then only an Avalon <-> Avalon CDC.

    Here I really want to be sure to support a data rate > 10 Gbps, which should be possible with a burst size of 32 in a 160 MHz @ 256b domain.
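
    My rough numbers for the 10 Gbps target (again ignoring protocol overhead):

    \[
    \begin{aligned}
    10\ \text{Gbps} \,/\, 256\ \text{bit} &\approx 39\ \text{M beats/s}\\
    39\ \text{M beats/s} \,/\, 160\ \text{MHz} &\approx 24\%\ \text{bus utilisation}\\
    39\ \text{M beats/s} \,/\, 32\ \text{beats per burst} &\approx 1.2\ \text{M bursts/s}
    \end{aligned}
    \]

    So on average only about one beat every four clock cycles has to be accepted, which is why I expect this to be feasible.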

    Would you recommend just setting max. read/write outstanding to 0 and doing the connection with Avalon <-> AXI burst/bit-width conversion only?

    Kind regards,

    Michael

  • Hi Michael,

    I am building a system with PCIe endpoint <-> AXI <-> HBM2.

    I had a query regarding whether the AXI interface is able to support a larger burstcount. I have enabled a burstcount greater than 32 in the HBM controller.

    Still, when I increase the burstcount in software to greater than 2, the HBM controller doesn't respond with data.

    Did you face such issues interfacing AXI with HBM?

    Best Regards,

    Pramod

    • MichaelB (Occasional Contributor)

      Hi Pramod,

      we built a similar architecture, with multiple masters connected to a single HBM2 channel plus PCIe DMA access.
      Here the HBM2 burst controller seems to have a bug in some specific Quartus versions (20.4 and lower).

      We succeeded with a solution provided by Intel to patch the IP files after IP generation (yes, sadly you won't be able to use the common tool flow anymore). With Quartus 21.1 this fix is included in the IP again.


      https://www.intel.com/content/www/us/en/support/programmable/articles/000086781.html

      For us it was a long way to debug this. Hopefully this will help you.

      FYI:
      We haven't switched to 21.1 yet, because with 20.4 we have no timing violations in our design, whereas with 21.1 the retiming process does not work properly anymore. With it we got tremendous timing violations, and we did not get clear information from the Intel support team about why this happens after a version upgrade.
      Let's see if this is fixed in future versions...

      Kind regards,

      Michael

      • Pramod_atintel (New Contributor)

        Hi Michael,

        Thanks for the reply.

        I am using Quartus version 21.3. That should have solved the burst count issue, but I am still seeing the same problem.

        I will try replacing the auto-generated file as described in the link you sent and check.

        In 21.3 I am getting some timing violations, but most of them are false paths (SignalTap related).

        Did you use AVMM or AVST for the PCIe interface?

        Best Regards,

        Pramod

  • Hi Michael,

    In the DMA transfer between PCIe <-> HBM, I am seeing a very high data rate for the receive port (Rx, from FPGA to host) and much lower bandwidth for the transmit port (Tx, from host PC to FPGA).

    Did you face such issues?

    Regards,

    Pramod