Forum Discussion

Ashoo
New Contributor
19 days ago

MCDMA IP D2H Queue Reset Failure during Channel Re-allocation

After successfully loading the binary and launching D2H (Device-to-Host) operations on my FPGA, the first channel is allocated successfully and transfers data without issue. However, when I try to allocate another channel, it fails with a "queue reset failed" error.

During initialization, all 512 channels were checked and appeared available.

Environment:

Device Family: Agilex 7

Quartus Version: 24.1

Driver/Software: VFIO-based

IP Config: MCDMA configured with 512 channels

Why is this happening, and how can I fix it?

8 Replies

  • Wincent_Altera
    Regular Contributor

    Hi,

    First and foremost, which design and target are you verifying for this issue?
    Is this a custom design, or one of our design examples?

    • If this is a custom design, I suggest you try our design example, as it is well validated.
    • If you are already using the example design, can you please provide any of your failing logs? For example, the dmesg log.

    Regards,

    Wincent_Altera 

  • Ashoo
    New Contributor

    Hi Wincent,

     

    Thanks for your reply.

     

    I am using the Altera MCDMA IP example design. When I test this example design with the software custom driver provided by Altera, everything works perfectly fine.

     

    The issue only happens when I use our custom software driver. Here is exactly what is happening:

     

    The first channel allocates perfectly and performs its read work without any issues.

    However, when I try to allocate a second channel while the first one is still working, it fails.

    It reports "queue reset failed" and "channel allocation failed" errors.

     

    Could you please let me know:

     

    Why does this happen when adding a second channel in our custom software?

    What could I have done wrong in my custom driver logic?

    What are the exact register steps or conditions I need to check in my code to rectify this?

     

    Thanks for your help, Ashoo

    • Wincent_Altera
      Regular Contributor

      Hi Ashoo ,

      Your custom driver most likely has a flaw in the channel initialization, reset, or resource management sequence. 
      Carefully compare your register access flow to the reference driver and our design example,
      paying special attention to channel reset, status polling, and queue setup.

      But to confirm that, please check your dmesg log.
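One way to make that comparison concrete is to log every register access in both drivers and diff the two traces. The following is only a minimal sketch of the idea: `reg_write32`, `reg_read32`, `trace_buf`, and the offsets used are hypothetical names for illustration and are not taken from the MCDMA driver or its register map.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical trace record: one register access (op is 'R' or 'W'). */
struct reg_access {
    char     op;
    uint32_t offset;
    uint32_t value;
};

#define TRACE_MAX 64
static struct reg_access trace_buf[TRACE_MAX];
static unsigned trace_len;

/* Append one access to the in-memory trace; extra accesses are dropped. */
static void trace_record(char op, uint32_t offset, uint32_t value)
{
    if (trace_len < TRACE_MAX)
        trace_buf[trace_len++] = (struct reg_access){ op, offset, value };
}

/* Write a 32-bit register at a byte offset from the BAR base and log it. */
static void reg_write32(volatile uint32_t *bar, uint32_t offset, uint32_t value)
{
    bar[offset / 4] = value;
    trace_record('W', offset, value);
}

/* Read a 32-bit register at a byte offset from the BAR base and log it. */
static uint32_t reg_read32(volatile uint32_t *bar, uint32_t offset)
{
    uint32_t value = bar[offset / 4];
    trace_record('R', offset, value);
    return value;
}

/* Print the trace, one access per line, in a form that is easy to diff. */
static void trace_dump(FILE *out)
{
    for (unsigned i = 0; i < trace_len; i++)
        fprintf(out, "%c 0x%04x 0x%08x\n", trace_buf[i].op,
                (unsigned)trace_buf[i].offset, (unsigned)trace_buf[i].value);
}
```

Routing all MMIO accesses through wrappers like these in the custom driver, and through equivalent ones in the reference driver, produces two text traces for the same two-channel scenario. The first line where they differ is a good place to start looking.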

      Regards,

      Wincent

      • Ashoo
        New Contributor

        Hi Wincent

        We checked the sequence of channel initialization, queue reset, and resource management and made sure it follows the official guide, but the problem persists. Nothing unusual was found in the dmesg log, so we used gdb to find the exact line of code where the problem occurs.

        According to our analysis, the problem occurs in the queue_reset function of ifc, most likely because the timeout is exceeded. We therefore commented out the timeout block and recompiled the libmqdmasoc.so library. That resolved the problem, and we are now able to create the desired number of channels.
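For readers without the source at hand, a bounded reset poll generally has the shape sketched below. This is not the code from ifc_mcdma.c: the register, the `Q_RESET_BUSY` bit, the function name `queue_reset_bounded`, and the `fake_*` simulation helpers are all hypothetical, and the hardware is assumed to clear the busy bit when the reset completes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit: assumed to be cleared by hardware when reset completes. */
#define Q_RESET_BUSY (1u << 0)

/* Delay hook, so the sketch can run against a simulated register. */
typedef void (*delay_us_fn)(uint32_t usec);

/*
 * Request a queue reset, then poll until the busy bit clears or the
 * time budget is spent. Returns true on completion, false on timeout.
 */
static bool queue_reset_bounded(volatile uint32_t *reset_reg,
                                uint32_t timeout_us,
                                delay_us_fn delay_us)
{
    assert(reset_reg != NULL);
    *reset_reg = Q_RESET_BUSY;                 /* request the reset */
    for (uint32_t waited = 0; waited < timeout_us; waited++) {
        if ((*reset_reg & Q_RESET_BUSY) == 0)
            return true;                       /* hardware finished */
        delay_us(1);
    }
    /* One last check after the budget is spent. */
    return (*reset_reg & Q_RESET_BUSY) == 0;
}

/* Simulation only: a fake register that completes after N delay calls. */
static uint32_t fake_reg;
static uint32_t fake_polls_left;

static void fake_delay_us(uint32_t usec)
{
    (void)usec;
    if (fake_polls_left > 0 && --fake_polls_left == 0)
        fake_reg &= ~Q_RESET_BUSY;             /* "hardware" clears busy */
}
```

With `fake_polls_left = 5`, a call such as `queue_reset_bounded(&fake_reg, 2048, fake_delay_us)` returns true. If the bit never clears, it returns false once the budget is spent. Removing the bound from a loop like this turns a reported failure into a potential hang whenever the hardware never signals completion.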

        This naturally raised some questions: why is the timeout present in the code in the first place, and why is its value defined as 2048 microseconds?
        If we comment it out, what drawbacks should we expect?

        Please give us concrete reasoning.
        I am attaching a screenshot of the queue reset function call in the ifc_mcdma.c file.
        Regards,

        Ashoo