Forum Discussion
Hi Silvan,
I think I was able to reproduce the issue consistently on our setup, simply by running an iperf3 server on the board and sending 64-byte UDP packets from the host machine at line rate.
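Something like the following iperf3 UDP client invocation on the host should generate that load (adjust the server address to your setup; -u selects UDP, -l 64 gives 64-byte payloads, and -b 0 removes the bandwidth cap so the client transmits as fast as it can):

host$ iperf3 -c 192.168.2.40 -u -l 64 -b 0 -t 10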
root@arria10:~# iperf3 -s
-----------------------------------------------------------
Server listening on 5201 (test #1)
-----------------------------------------------------------
Accepted connection from 192.168.2.100, port 37872
[  5] local 192.168.2.40 port 5201 connected to 192.168.2.100 port 57154
[ ID] Interval           Transfer     Bitrate         Jitter     Lost/Total Datagrams
[  5]   0.00-1.00   sec  6.25 MBytes  52.4 Mbits/sec  0.006 ms   154296/256751 (60%)
[  5]   1.00-2.00   sec  6.27 MBytes  52.6 Mbits/sec  0.006 ms   156917/259573 (60%)
[  5]   2.00-3.00   sec  6.26 MBytes  52.5 Mbits/sec  0.007 ms   157229/259797 (61%)
[  5]   3.00-4.00   sec  6.26 MBytes  52.5 Mbits/sec  0.006 ms   157367/259983 (61%)
[  5]   4.00-5.00   sec  6.26 MBytes  52.5 Mbits/sec  0.006 ms   157039/259623 (60%)
[  5]   5.00-6.00   sec  6.25 MBytes  52.4 Mbits/sec  0.009 ms   158735/261174 (61%)
[  5]   6.00-7.00   sec  6.25 MBytes  52.5 Mbits/sec  0.012 ms   157964/260420 (61%)
[  5]   7.00-8.00   sec  6.26 MBytes  52.5 Mbits/sec  0.006 ms   162000/264503 (61%)
[  5]   8.00-9.00   sec  6.26 MBytes  52.5 Mbits/sec  0.011 ms   162404/265020 (61%)
[  5]   9.00-10.00  sec  6.27 MBytes  52.6 Mbits/sec  0.006 ms   162334/265030 (61%)
[  5]  10.00-10.42  sec  22.8 KBytes   445 Kbits/sec  12.748 ms  611/976 (63%)
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter     Lost/Total Datagrams
[  5]   0.00-10.42  sec  62.6 MBytes  50.4 Mbits/sec  12.748 ms  1586896/2612850 (61%)  receiver
-----------------------------------------------------------
Server listening on 5201 (test #2)
-----------------------------------------------------------
^Ciperf3: interrupt - the server has terminated
root@arria10:~# ping 192.168.2.100 -c 10
PING 192.168.2.100 (192.168.2.100): 56 data bytes
64 bytes from 192.168.2.100: seq=0 ttl=64 time=1000.462 ms
64 bytes from 192.168.2.100: seq=1 ttl=64 time=1000.344 ms
64 bytes from 192.168.2.100: seq=2 ttl=64 time=1000.377 ms
64 bytes from 192.168.2.100: seq=3 ttl=64 time=1000.363 ms
64 bytes from 192.168.2.100: seq=4 ttl=64 time=1000.367 ms
64 bytes from 192.168.2.100: seq=5 ttl=64 time=1000.374 ms
64 bytes from 192.168.2.100: seq=6 ttl=64 time=1000.347 ms
64 bytes from 192.168.2.100: seq=7 ttl=64 time=1000.349 ms
64 bytes from 192.168.2.100: seq=8 ttl=64 time=1000.363 ms

--- 192.168.2.100 ping statistics ---
10 packets transmitted, 9 packets received, 10% packet loss
round-trip min/avg/max = 1000.344/1000.371/1000.462 ms
As you pointed out earlier in this thread, the gmacgrp_debug register also reads 0x00000120 on my setup. Based on this, I think there is an issue with the EMAC IP on Arria 10, and it seems to be triggered when we hit the Rx buffer unavailable condition repeatedly. We need to follow up with Synopsys on this issue.
I tried a different configuration for the MTL Rx FIFO and Rx DMA, and the issue is not reproducible in that case. What I've done is set the dff bit of the dmagrp_operation_mode register. Setting this bit disables flushing of received frames when the Rx buffer unavailable condition is hit. In this configuration, however, we need to write to the dmagrp_receive_poll_demand register after refilling the descriptor ring. A patch with these changes is attached. With them, flow control works effectively, there is almost 0% packet loss for the same iperf3 test, and the original issue is no longer reproducible.
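To make the intent of the change clearer, here is a minimal register-level sketch of the two pieces, not the patch itself. The offsets and bit position follow the Synopsys GMAC DMA register map as seen from the Arria 10 EMAC base (operation mode at base + 0x1018, receive poll demand at base + 0x1008, dff = bit 24); the macro and function names are only illustrative, not the driver's.

/*
 * Register-level sketch only -- offsets/bit per the Synopsys GMAC 3.x DMA
 * register map (EMAC0 on Arria 10 sits at 0xFF800000).  Names are
 * illustrative, not taken from the stmmac driver.
 */
#include <linux/bits.h>
#include <linux/io.h>
#include <linux/types.h>

#define GMAC_DMA_RX_POLL_DEMAND	0x1008	/* dmagrp_receive_poll_demand */
#define GMAC_DMA_OP_MODE	0x1018	/* dmagrp_operation_mode */
#define DMA_OP_MODE_DFF		BIT(24)	/* disable flushing of received frames */

/* Set dff once while configuring the DMA. */
static void emac_dma_disable_rx_flush(void __iomem *ioaddr)
{
	u32 op_mode = readl(ioaddr + GMAC_DMA_OP_MODE);

	op_mode |= DMA_OP_MODE_DFF;
	writel(op_mode, ioaddr + GMAC_DMA_OP_MODE);
}

/*
 * With dff set, the Rx DMA stays suspended after a buffer unavailable
 * condition instead of flushing frames, so it has to be resumed once the
 * driver has refilled the Rx descriptor ring.
 */
static void emac_dma_rx_poll_demand(void __iomem *ioaddr)
{
	/* Any write value restarts the Rx descriptor fetch. */
	writel(1, ioaddr + GMAC_DMA_RX_POLL_DEMAND);
}

The attached patch does the equivalent from within the driver: the dff bit is set when the DMA is configured, and the receive poll demand is issued after the Rx descriptors have been refilled.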
root@arria10:~# iperf3 -s
-----------------------------------------------------------
Server listening on 5201 (test #1)
-----------------------------------------------------------
Accepted connection from 192.168.2.100, port 43988
[  5] local 192.168.2.25 port 5201 connected to 192.168.2.100 port 57459
[ ID] Interval           Transfer     Bitrate         Jitter     Lost/Total Datagrams
[  5]   0.00-1.00   sec  6.52 MBytes  54.6 Mbits/sec  0.007 ms   858/107663 (0.8%)
[  5]   1.00-2.00   sec  6.58 MBytes  55.2 Mbits/sec  0.007 ms   0/107776 (0%)
[  5]   2.00-3.00   sec  6.50 MBytes  54.5 Mbits/sec  0.012 ms   694/107232 (0.65%)
[  5]   3.00-4.00   sec  6.57 MBytes  55.1 Mbits/sec  0.007 ms   0/107688 (0%)
[  5]   4.00-5.00   sec  6.55 MBytes  55.0 Mbits/sec  0.007 ms   0/107392 (0%)
[  5]   5.00-6.00   sec  6.55 MBytes  55.0 Mbits/sec  0.018 ms   0/107343 (0%)
[  5]   6.00-7.00   sec  6.56 MBytes  55.0 Mbits/sec  0.006 ms   0/107448 (0%)
[  5]   7.00-8.00   sec  6.56 MBytes  55.1 Mbits/sec  0.005 ms   0/107536 (0%)
[  5]   8.00-9.00   sec  6.55 MBytes  55.0 Mbits/sec  0.008 ms   0/107344 (0%)
[  5]   9.00-10.00  sec  6.56 MBytes  55.0 Mbits/sec  0.027 ms   0/107488 (0%)
[  5]  10.00-10.01  sec  86.6 KBytes  54.1 Mbits/sec  0.008 ms   0/1386 (0%)
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter     Lost/Total Datagrams
[  5]   0.00-10.01  sec  65.6 MBytes  54.9 Mbits/sec  0.008 ms   1552/1076296 (0.14%)  receiver
-----------------------------------------------------------
Server listening on 5201 (test #2)
-----------------------------------------------------------
^Ciperf3: interrupt - the server has terminated
root@arria10:~# ping 192.168.2.100 -c 10
PING 192.168.2.100 (192.168.2.100): 56 data bytes
64 bytes from 192.168.2.100: seq=0 ttl=64 time=0.336 ms
64 bytes from 192.168.2.100: seq=1 ttl=64 time=0.315 ms
64 bytes from 192.168.2.100: seq=2 ttl=64 time=0.294 ms
64 bytes from 192.168.2.100: seq=3 ttl=64 time=0.286 ms
64 bytes from 192.168.2.100: seq=4 ttl=64 time=0.243 ms
64 bytes from 192.168.2.100: seq=5 ttl=64 time=0.237 ms
64 bytes from 192.168.2.100: seq=6 ttl=64 time=0.317 ms
64 bytes from 192.168.2.100: seq=7 ttl=64 time=0.276 ms
64 bytes from 192.168.2.100: seq=8 ttl=64 time=0.249 ms
64 bytes from 192.168.2.100: seq=9 ttl=64 time=0.254 ms

--- 192.168.2.100 ping statistics ---
10 packets transmitted, 10 packets received, 0% packet loss
round-trip min/avg/max = 0.237/0.280/0.336 ms
The attached patch implements these changes. It is meant for checking the behavior on your side, is based on the socfpga-6.12.11-lts branch, and is not fully tested. Once you have booted an image with the patch applied, you can use the devmem2 tool to make sure the dff bit is set in the dmagrp_operation_mode register, then repeat the test above and confirm whether the issue is still reproducible. At least on my side it is not.
root@arria10:~# devmem2 0xFF801018
/dev/mem opened.
Memory mapped at address 0xb6fad000.
Read at address  0xFF801018 (0xb6fad018): 0x03202906
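For reference, dff is bit 24 of dmagrp_operation_mode, i.e. the 0x01000000 bit, so the 0x03202906 readback above shows it set. If you prefer to script the check rather than decode the hex by hand, a small /dev/mem reader along these lines works too (just a sketch assuming the default Arria 10 memory map with emac0 at 0xFF800000):

/*
 * Minimal /dev/mem check (a stripped-down devmem2) that dff -- bit 24 of
 * the DMA operation mode register -- is set on Arria 10 emac0.
 * Build with: ${CROSS_COMPILE}gcc -D_FILE_OFFSET_BITS=64 -o check_dff check_dff.c
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define DMA_OP_MODE_PHYS 0xFF801018UL   /* emac0 dmagrp_operation_mode */
#define DFF_BIT          (1u << 24)

int main(void)
{
    long page = sysconf(_SC_PAGESIZE);
    int fd = open("/dev/mem", O_RDONLY | O_SYNC);
    if (fd < 0) { perror("open /dev/mem"); return 1; }

    /* Map the page containing the register, then index into it. */
    volatile uint32_t *base = mmap(NULL, page, PROT_READ, MAP_SHARED, fd,
                                   DMA_OP_MODE_PHYS & ~(page - 1));
    if (base == MAP_FAILED) { perror("mmap"); return 1; }

    uint32_t val = base[(DMA_OP_MODE_PHYS & (page - 1)) / 4];
    printf("dmagrp_operation_mode = 0x%08x, dff %s\n", val,
           (val & DFF_BIT) ? "set" : "NOT set");

    munmap((void *)base, page);
    close(fd);
    return 0;
}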
Please try and confirm the behavior on your setup.
Best Regards,
Rohan
Hi Rohan,
Thank you so much for the description and the provided patch. I applied the patch on the dev board and it works as expected.
Additionally, I ported the same patch to kernel 6.6, and a first short test on our hardware was also successful. I will run more detailed tests on our hardware this week, but I am optimistic that it will work.
For us it would be beneficial if the patch became available in the mainline Linux kernel. Do you think this could be possible? If required, I will support you with that.
For the discussion with Synopsys, I have some additional information:
We also have an old Cyclone V dev board. As I saw in the device documentation, it uses a similar MAC: version 3.70a on Cyclone V versus 3.72a on Arria 10.
I ran the same test on the Cyclone V and observed the same issue there as well.