https://bugzilla.suse.com/show_bug.cgi?id=1203603

            Bug ID: 1203603
           Summary: UDP throughput on 100 Gbit NIC low compared to TCP
    Classification: openSUSE
           Product: openSUSE Tumbleweed
           Version: Current
          Hardware: x86-64
                OS: openSUSE Tumbleweed
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Kernel
          Assignee: jwiesner@suse.com
          Reporter: jwiesner@suse.com
        QA Contact: qa-bugs@suse.de
                CC: yousaf.kaukab@suse.com
          Found By: ---
           Blocker: ---

Iperf3 was used for benchmarking on simba2.arch.suse.cz and
simba3.arch.suse.cz. Both machines have 64 logical CPUs (Intel Xeon Gold 6326
@ 2.90GHz) assigned to 2 NUMA nodes (NUMA node1 CPUs: 16-31,48-63) and a
100 Gbit NIC, eth0, that is local to NUMA node 1 (see below).

TCP throughput results depend on whether both the transmitting and the
receiving iperf3 process run on NUMA node 1. If they do, the resulting
throughput approaches the maximum allowed by the physical layer (95 Gbit/s):
simba3:~/:[1]# taskset 0xffff0000 /root/jwiesner/iperf/src/iperf3 -c 10.100.128.66 -P1 -f m -b 0 -t 180 -i 5 -O 2
Connecting to host 10.100.128.66, port 5201
[  5] local 10.100.128.68 port 54800 connected to 10.100.128.66 port 5201
[ ID] Interval           Transfer     Bitrate          Retr  Cwnd
[  5]   0.00-5.00   sec  54.2 GBytes  93056 Mbits/sec    0   3.01 MBytes
[  5]   5.00-10.00  sec  54.2 GBytes  93079 Mbits/sec    0   3.01 MBytes
[  5]  10.00-15.00  sec  54.7 GBytes  93892 Mbits/sec    0   3.01 MBytes
[  5]  15.00-20.00  sec  54.2 GBytes  93083 Mbits/sec    0   3.01 MBytes

Mpstat output on simba3, which ran the client process:

CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
 18    0.00    0.00    0.10    0.10    0.00    0.00    0.00    0.00    0.00   99.80
 19    0.00    0.00    0.00    0.00    0.00   36.98    0.00    0.00    0.00   63.02
 20    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
 29    0.10    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00   99.90
 30    1.01    0.00   82.41    0.00    0.00    0.00    0.00    0.00    0.00   16.58
 31    0.10    0.00    0.20    0.00    0.00    0.00    0.00    0.00    0.00   99.70
 59    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
 60    0.00    0.00    0.10    0.00    0.00   26.59    0.00    0.00    0.00   73.31
 61    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

The iperf3 client ran on CPU 30; softirq processing (mostly receiving TCP
ACKs) ran on CPUs 19 and 60.

Mpstat output on simba2, which ran the server process:

CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
 17    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
 18    2.01    0.00   96.18    0.00    0.00    0.00    0.00    0.00    0.00    1.81
 19    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
 26    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
 27    0.00    0.00    0.00    0.00    0.00   99.12    0.00    0.00    0.00    0.88
 28    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

The iperf3 server ran on CPU 18; softirq processing ran on CPU 27. The CPU
running softirq processing on the server is the bottleneck.
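For reference, both runs can be reproduced along these lines (host names, the
iperf3 path, and the CPU mask are taken from this setup; the mpstat interval
is an arbitrary choice, and the UDP run below just adds -u):

# On simba2 (receiver): pin the iperf3 server to CPUs 16-31 of NUMA node 1.
taskset 0xffff0000 /root/jwiesner/iperf/src/iperf3 -s

# On simba3 (sender): pin the client the same way. -b 0 removes the rate
# limit, -t 180 runs for 180 s, -i 5 reports every 5 s, -O 2 omits the
# first 2 s from the totals.
taskset 0xffff0000 /root/jwiesner/iperf/src/iperf3 -c 10.100.128.66 -P1 -f m -b 0 -t 180 -i 5 -O 2

# On both hosts, watch per-CPU utilization while the test runs; %soft is
# softirq time, %usr/%sys is the iperf3 process.
mpstat -P ALL 5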
In contrast to the TCP results, UDP throughput is more than 8 times lower
under default settings:

simba3:~/:[0]# taskset 0xffff0000 /root/jwiesner/iperf/src/iperf3 -u -c 10.100.128.66 -P1 -f m -b 0 -t 180 -i 5 -O 2
Connecting to host 10.100.128.66, port 5201
[  5] local 10.100.128.68 port 44747 connected to 10.100.128.66 port 5201
[ ID] Interval           Transfer     Bitrate          Total Datagrams
[  5]   0.00-5.00   sec  6.21 GBytes  10664 Mbits/sec  5804050
[  5]   5.00-10.00  sec  6.25 GBytes  10733 Mbits/sec  4632580
[  5]  10.00-15.00  sec  6.24 GBytes  10721 Mbits/sec  4627310
[  5]  15.00-20.00  sec  6.25 GBytes  10730 Mbits/sec  4631300

Mpstat output on simba3, which ran the client process:

CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
 16    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
 17   11.60    0.00   88.40    0.00    0.00    0.00    0.00    0.00    0.00    0.00
 18    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
 23    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
 24    0.00    0.00    0.00    0.00    0.00   27.67    0.00    0.00    0.00   72.33
 25    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

The iperf3 client ran on CPU 17; softirq processing (freeing transmitted
buffers) ran on CPU 24.

Mpstat output on simba2, which ran the server process:

CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
 19    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
 20   12.44    0.00   86.96    0.00    0.00    0.00    0.00    0.00    0.00    0.60
 21    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
 49    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
 50    0.00    0.00    0.22    0.00    0.00   96.12    0.00    0.00    0.00    3.66
 51    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

The iperf3 server ran on CPU 20; softirq processing ran on CPU 50. The CPUs
running iperf3 were the bottleneck, with the CPU running softirq processing
on simba2 being a close second.
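A back-of-the-envelope check of the datagram counters (using the 5.00-10.00
sec interval above, and taking iperf3's GBytes as 2^30 bytes) shows where the
time goes:

echo '4632580 / 5' | bc                 # ~926516 datagrams/s
echo '6.25 * 2^30 / 4632580' | bc -l    # ~1448 bytes per datagram

So the receiver has to handle roughly 0.9 million ~1448-byte datagrams per
second, which fits the near-saturated softirq CPU on simba2 (96.12 %soft on
CPU 50).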
The NUMA node locality and SMP affinity of the interrupts of the 100 Gbit
NIC:
simba3:~/:[0]# for i in $(awk '/eth0/{sub(":", "", $1); print $1}' /proc/interrupts); do grep -rH . /proc/irq/$i/{node,smp_affinity}; done
/proc/irq/284/node:1
/proc/irq/284/smp_affinity:00040000,00000000
/proc/irq/285/node:1
/proc/irq/285/smp_affinity:00000000,00020000
/proc/irq/286/node:1
/proc/irq/286/smp_affinity:00200000,00000000
/proc/irq/287/node:1
/proc/irq/287/smp_affinity:00080000,00000000
/proc/irq/288/node:1
/proc/irq/288/smp_affinity:04000000,00000000
/proc/irq/289/node:1
/proc/irq/289/smp_affinity:00800000,00000000
/proc/irq/290/node:1
/proc/irq/290/smp_affinity:00200000,00000000
/proc/irq/291/node:1
/proc/irq/291/smp_affinity:00000000,40000000
/proc/irq/292/node:1
/proc/irq/292/smp_affinity:10000000,00000000
/proc/irq/293/node:1
/proc/irq/293/smp_affinity:00000000,00100000
/proc/irq/294/node:1
/proc/irq/294/smp_affinity:00000000,00200000
/proc/irq/295/node:1
/proc/irq/295/smp_affinity:20000000,00000000
/proc/irq/296/node:1
/proc/irq/296/smp_affinity:00000000,04000000
/proc/irq/297/node:1
/proc/irq/297/smp_affinity:00000000,00800000
/proc/irq/298/node:1
/proc/irq/298/smp_affinity:00000000,01000000
/proc/irq/299/node:1
/proc/irq/299/smp_affinity:80000000,00000000
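The masks are easier to read as CPU lists; the kernel exposes the same
information in /proc/irq/*/smp_affinity_list (same awk expression for eth0 as
above):

for i in $(awk '/eth0/{sub(":", "", $1); print $1}' /proc/interrupts); do
    # List form of the affinity mask, e.g. "50" instead of 00040000,00000000.
    grep -H . /proc/irq/$i/smp_affinity_list
done

Each mask above selects a single CPU from the NUMA node 1 set (16-31,48-63);
e.g. 00040000,00000000 sets bit 18 of the upper 32-bit word, i.e. CPU 50.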