[opensuse-kernel] large data transfer rate slowdowns over NFSv4 local lan with kernel 3.19.3x ?
I was just pinged on this by a client; I can reproduce it here.

I have two opensuse 13.2 machines. NFS xfers -- cp & rsync -- between them slow to a crawl: < 1 MB/sec in the worst case, over a 1Gb local lan.

Chats @ #networking/#nfs suggest this is a kernel+NFS issue. So checking here 1st.

Both machines run kernel

    uname -rm
        3.19.3-1.gf10e7fc-default x86_64

Both have NFS installed. Packages include

    nfs-client-1.3.0-4.2.1.x86_64
    nfs-kernel-server-1.3.0-4.2.1.x86_64

The server's store is at

    /NAS/NAS1

it's on a LV on a software (mdadm v3.3.1) RAID-10 array.

The client's mounted it at

    /mnt/NFS4/NAS1

    mount | grep NAS1
        xen01.loc:/ on /mnt/NFS4/NAS1 type nfs4 (rw,nosuid,nodev,relatime,sync,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.101,fsc,local_lock=none,addr=10.0.0.1)

The diagnostics I can think to do follow.

creating test files @ both machines

@ server

    dd if=/dev/zero of=/NAS/NAS1/dump-server-file bs=1024 count=1000000
        1000000+0 records in
        1000000+0 records out
        1024000000 bytes (1.0 GB) copied, 1.37631 s, 744 MB/s

@ client

    dd if=/dev/zero of=~/dump-client-file bs=1024 count=1000000
        1000000+0 records in
        1000000+0 records out
        1024000000 bytes (1.0 GB) copied, 1.988 s, 515 MB/s

TESTS

(1) server -> server, local cp

    rm -f /tmp/dump-server-file
    time /bin/cp /NAS/NAS1/dump-server-file /tmp/
        real 0m0.486s
        user 0m0.004s
        sys 0m0.480s

        ~= 2100MB/s (real)
        ~= 2100MB/s (sys)

(2) server -> server, local rsync

    rm -f /tmp/dump-server-file
    time /usr/bin/rsync /NAS/NAS1/dump-server-file /tmp/
        real 0m2.491s
        user 0m3.344s
        sys 0m1.264s

        ~= 411 MB/s (real)
        ~= 810 MB/s (sys)

(3) client -> server, PING

    ping -c 10 10.0.0.1
        PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
        64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.307 ms
        64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.280 ms
        64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.303 ms
        64 bytes from 10.0.0.1: icmp_seq=4 ttl=64 time=0.280 ms
        64 bytes from 10.0.0.1: icmp_seq=5 ttl=64 time=0.262 ms
        64 bytes from 10.0.0.1: icmp_seq=6 ttl=64 time=0.290 ms
        64 bytes from 10.0.0.1: icmp_seq=7 ttl=64 time=0.281 ms
        64 bytes from 10.0.0.1: icmp_seq=8 ttl=64 time=0.286 ms
        64 bytes from 10.0.0.1: icmp_seq=9 ttl=64 time=0.287 ms
        64 bytes from 10.0.0.1: icmp_seq=10 ttl=64 time=0.291 ms

        --- 10.0.0.1 ping statistics ---
        10 packets transmitted, 10 received, 0% packet loss, time 8999ms
        rtt min/avg/max/mdev = 0.262/0.286/0.307/0.023 ms

(4) client -> `iperf3 -s`@server: TCP, one thread

    iperf3 -c 10.0.0.1 -t 60 -i 15 -F ~/dump-client-file -f M
        [ ID] Interval           Transfer     Bandwidth       Retr
        [  5] 0.00-3.08   sec    337 MBytes   918 Mbits/sec   10    sender
        [  5] 0.00-3.08   sec    336 MBytes   915 Mbits/sec         receiver

(5) client -> `iperf3 -s`@server: TCP, 100 threads

    iperf3 -c 10.0.0.1 -t 60 -i 15 -F ~/dump-client-file -f M -P100
        [ ID] Interval           Transfer     Bandwidth       Retr
        ...
        [SUM] 0.00-31.08  sec    3.40 GBytes  940 Mbits/sec   121   sender
        [SUM] 0.00-31.08  sec    3.39 GBytes  937 Mbits/sec         receiver

(6) client -> `iperf3 -s`@server: UDP, one thread

    iperf3 -c 10.0.0.1 -t 60 -i 15 -F ~/dump-client-file -f M -b 1G -P 1
        [ ID] Interval           Transfer     Bandwidth       Retr
        ...
        [  5] 0.00-8.97   sec    977 MBytes   913 Mbits/sec   25    sender
        [  5] 0.00-8.97   sec    976 MBytes   912 Mbits/sec         receiver

(7) client -> `iperf3 -s`@server: UDP, 100 threads

    iperf3 -c 10.0.0.1 -t 60 -i 15 -F ~/dump-client-file -f M -b 1G -P 100
        [ ID] Interval           Transfer     Bandwidth       Retr
        ...
        [SUM] 0.00-60.01  sec    6.56 GBytes  939 Mbits/sec   180   sender
        [SUM] 0.00-60.01  sec    6.55 GBytes  937 Mbits/sec         receiver

(8) client -> server, cp over NFS

    rm -f /mnt/NFS4/NAS1/dump-client-file
    time /bin/cp ~/dump-client-file /mnt/NFS4/NAS1/
        real 0m54.589s
        user 0m0.005s
        sys 0m1.225s

        ~= 18.75 MB/s (real)
        ~= 810 MB/s (sys)

(9) client -> server, rsync over NFS

    rm -f /mnt/NFS4/NAS1/dump-client-file
    time /usr/bin/rsync ~/dump-client-file /mnt/NFS4/NAS1/
        real 18m13.408s
        user 0m4.642s
        sys 0m2.627s

        ~= 0.937 MB/s (real)
        ~= 390 MB/s (sys)

If there's additional diagnostic info, I can provide it.

What's causing the disproportionate slow down and what's needed for a fix?

LT

--
To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-kernel+owner@opensuse.org
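The "~= MB/s (real)" figures in the tests above are file size divided by wall-clock time. A minimal sketch of that arithmetic, assuming 1 MB = 10^6 bytes (the unit dd reports); the helper names `to_seconds` and `rate_mb_s` are made up for this illustration and are not from the thread:

```shell
#!/bin/sh
# Convert a `time` wall-clock value such as "18m13.408s" into seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Throughput in MB/s, taking 1 MB = 10^6 bytes.
rate_mb_s() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.3f\n", bytes / secs / 1000000 }'
}

size=1024000000                                # bytes in the dd test file
rate_mb_s "$size" "$(to_seconds 0m54.589s)"    # test (8), cp over NFS
rate_mb_s "$size" "$(to_seconds 18m13.408s)"   # test (9), rsync over NFS
```

The two results should land close to the roughly 18.75 MB/s and 0.937 MB/s quoted for tests (8) and (9), against a link that iperf3 shows carrying over 900 Mbits/sec.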
On 03/30/2015 01:57 PM, lyndat3@your-mail.com wrote:
> I was just pinged on this by a client; I can reproduce it here.
> I have two opensuse 13.2 machines. NFS xfers -- cp & rsync -- between them slow to a crawl: < 1 MB/sec in the worst case, over a 1Gb local lan.
> Chats @ #networking/#nfs suggest this is a kernel+NFS issue. So checking here 1st.
> Both machines run kernel
> uname -rm 3.19.3-1.gf10e7fc-default x86_64
Have you or your client tested in this way with earlier kernels to see if this is a regression? If it is, then you can clone the mainline repository and use 'git bisect' to see what commit causes the failure.

Larry
> Have you or your client tested in this way with earlier kernels to see if this is a regression? If it is, then you can clone the mainline repository and use 'git bisect' to see what commit causes the failure.
Was already working on that; apparently it isn't.

Downgrading kernel-default to

    uname -rm
        3.16.7-7-default x86_64

and retesting the slow cases, there's no significant change

(8) client -> server, cp over NFS

    rm -f /mnt/NFS4/XNAS/dump-client-file
    time /bin/cp ~/dump-client-file /mnt/NFS4/XNAS/
        real 0m56.064s
        user 0m0.003s
        sys 0m1.266s

        ~= 18.26 MB/s (real)
        ~= 809 MB/s (sys)

(9) client -> server, rsync over NFS

    rm -f /mnt/NFS4/XNAS/dump-client-file
    time /usr/bin/rsync ~/dump-client-file /mnt/NFS4/XNAS/
        real 17m59.312s
        user 0m4.116s
        sys 0m2.226s

        ~= 0.949 MB/s (real)
        ~= 460 MB/s (sys)

LT
one more test, ruling out rsync alone

as above, for remote dir

    /NAS/NAS1

locally NFS-mounted as

    /mnt/NFS4/NAS1/

where

    mount | egrep "NFS|NAS1" | grep -v LV
        /etc/auto.nfs4 on /mnt/NFS4 type autofs (rw,relatime,fd=6,pgrp=2619,timeout=10,minproto=5,maxproto=5,indirect)
        xen01.loc:/ on /mnt/NFS4/NAS1 type nfs4 (rw,nosuid,nodev,relatime,sync,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,nosharecache,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.101,fsc,local_lock=none,addr=10.0.0.1)

(1) rsync, no NFS

    time /usr/bin/rsync ~/dump-client-file root@xen01.loc:/NAS/NAS1
        real 0m19.179s
        user 0m16.505s
        sys 0m4.135s

(2) rsync, over NFS

    time /usr/bin/rsync ~/dump-client-file /mnt/NFS4/NAS1/
        real 18m25.726s
        user 0m4.647s
        sys 0m2.912s
There's a strong block-size dependency. Smaller bs are even slower -- speed maxes out at ~ 1MB/sec for dd with bs >= 512K

    dd if=/dev/zero of=/mnt/NFS4/NAS1/dump-client-file bs=16k count=2k
        2048+0 records in
        2048+0 records out
        33554432 bytes (34 MB) copied, 443.592 s, 75.6 kB/s

    dd if=/dev/zero of=/mnt/NFS4/NAS1/dump-client-file bs=256k count=128
        128+0 records in
        128+0 records out
        33554432 bytes (34 MB) copied, 72.2837 s, 464 kB/s

    dd if=/dev/zero of=/mnt/NFS4/NAS1/dump-client-file bs=512k count=64
        64+0 records in
        64+0 records out
        33554432 bytes (34 MB) copied, 33.7238 s, 995 kB/s

    dd if=/dev/zero of=/mnt/NFS4/NAS1/dump-client-file bs=1M count=32
        32+0 records in
        32+0 records out
        33554432 bytes (34 MB) copied, 31.7636 s, 1.1 MB/s

    dd if=/dev/zero of=~/dump-client-file bs=1M count=32
    time rsync -P ~/dump-client-file /mnt/NFS4/NAS1/dump-client-file
        dump-client-file2
        33,554,432 100% 484.97kB/s 0:01:07 (xfr#1, to-chk=0/1)

        real 1m8.140s
        user 0m0.172s
        sys 0m0.083s

LT
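One way to read the dd runs above is as elapsed time per write rather than as throughput. A rough sketch of that calculation, using only the record counts and elapsed times dd printed; the helper name `per_write_ms` is made up for this illustration, and the reading of the result is an interpretation, not something measured in the thread:

```shell
#!/bin/sh
# Average wall-clock cost of one dd record in milliseconds:
# elapsed seconds / number of records * 1000.
per_write_ms() {
    awk -v secs="$1" -v count="$2" 'BEGIN { printf "%.1f\n", secs / count * 1000 }'
}

# Block size, record count and elapsed seconds copied from the dd output above.
for run in "16k 2048 443.592" "256k 128 72.2837" "512k 64 33.7238" "1M 32 31.7636"; do
    set -- $run
    printf 'bs=%-5s %8s ms per write\n' "$1" "$(per_write_ms "$3" "$2")"
done
```

Every block size pays hundreds of milliseconds per record, far above the ~0.3 ms ping round trip, which would be consistent with each write waiting for the server before the next one starts.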
It looks like 'sync' is the problem

    grep NAS1 /etc/auto.nfs4
        NAS1 -fstype=nfs4,_netdev,rw,proto=tcp,sync,... xen01.loc:/

    rm -f /mnt/NFS4/NAS1/file.out && \
    time dd if=/dev/zero of=/mnt/NFS4/NAS1/file.out bs=32K count=3K
        3072+0 records in
        3072+0 records out
        100663296 bytes (101 MB) copied, 485.721 s, 207 kB/s

        real 8m5.861s
        user 0m0.012s
        sys 0m0.250s

Change mount 'sync' -> 'async'

    vi /etc/auto.nfs4
        - NAS1 -fstype=nfs4,_netdev,rw,proto=tcp,sync,... xen01.loc:/
        + NAS1 -fstype=nfs4,_netdev,rw,proto=tcp,async,... xen01.loc:/

    systemctl restart autofs

Repeat

    rm -f /mnt/NFS4/NAS1/file.out && \
    time dd if=/dev/zero of=/mnt/NFS4/NAS1/file.out bs=32K count=3K
        3072+0 records in
        3072+0 records out
        100663296 bytes (101 MB) copied, 1.65577 s, 60.8 MB/s

        real 0m1.658s
        user 0m0.000s
        sys 0m0.089s

I'd expect 'sync' to be slower than 'async', but 300X ?

The NFS default is 'sync' for a reason -- integrity of the data. At this performance hit, it's not a realistic option.

What can be optimized to get 'sync' back in the game? Or is there possibly another dependency - kernel? network stack? other? - that's involved?

LT
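When checking which mounts carry the option, a plain `grep sync` also matches `async`. A small sketch of a whole-word check on a comma-separated option string; `has_mount_opt` is a hypothetical helper, and the option string is shortened from the mount output quoted earlier in the thread:

```shell
#!/bin/sh
# Succeeds if option $2 appears as a whole comma-separated entry in list $1,
# so that "sync" does not match inside "async".
has_mount_opt() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Shortened copy of the client's mount options from this thread.
opts="rw,nosuid,nodev,relatime,sync,vers=4.0,proto=tcp"

if has_mount_opt "$opts" sync; then
    echo "client mount is sync"
else
    echo "client mount is not sync"
fi
```

On a live client the option string could be taken from `findmnt -n -o OPTIONS /mnt/NFS4/NAS1` instead of being hard-coded, assuming findmnt is installed.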
> Have you or your client tested in this way with earlier kernels to see if this is a regression? If it is, then you can clone the mainline repository and use 'git bisect' to see what commit causes the failure.
Even though I didn't see a kernel dependency from 3.19 -> 3.16, I can say that digging online re: "nfs slow sync" finds LOTS of references to kernel problems.

Problem is, most of them are really old. For example

    2.4.18-4,5: very slow 'sync' nfs writes (~50k/sec)
    https://bugzilla.redhat.com/show_bug.cgi?id=67199

Makes me wonder if the problem crept into the kernel even earlier.

I can't find anything yet that's a definite solution to this -- but 'slow nfs', particularly with 'sync', is a really popular problem.

I just don't know yet.

LT
After comment from one of the nfs client maintainers, the issue is simply one of config.

There are two *separate* syncs to consider -- at the server, and at the client.

'sync' on the EXPORT, and 'async' on the MOUNT is the sane approach;

That config also appears to return the performance.

The many recommendations online to use 'sync' for data integrity are IIUC for sync on the server.

LT
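As a config fragment, the split described above might look like the following. The autofs line is the one posted earlier in the thread, with its "..." left as posted; the /etc/exports entry was never shown, so its path, client range and `no_subtree_check` are placeholders rather than the poster's actual settings:

```
# Server, /etc/exports: 'sync' on the export.
# Hypothetical entry; only the 'sync' option reflects the thread.
/NAS/NAS1   10.0.0.0/24(rw,sync,no_subtree_check)

# Client, /etc/auto.nfs4: 'async' on the mount.
NAS1 -fstype=nfs4,_netdev,rw,proto=tcp,async,... xen01.loc:/
```

After editing, the server would typically re-read its exports with `exportfs -ra`, and the client map is picked up by `systemctl restart autofs` as used earlier in the thread.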
On Tue, 31 Mar 2015 20:30:25 -0700 lyndat3@your-mail.com wrote:
> After comment from one of the nfs client maintainers, the issue is simply one of config.
> There are two *separate* syncs to consider -- at the server, and at the client.
> 'sync' on the EXPORT, and 'async' on the MOUNT is the sane approach;
sync on export should not be necessary - server must persist data when client explicitly requests it and as long as client does not request it, it probably does not care anyway. But I do not know if Linux NFS server does implement it correctly.
> That config also appears to return the performance.
> The many recommendations online to use 'sync' for data integrity are IIUC for sync on the server.
> LT
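The point above about the client explicitly requesting persistence can be tried from the shell: dd's `conv=fsync` buffers its writes and then calls fsync() once before exiting, instead of forcing every write to disk. A minimal sketch against a local temp file; pointing `out` at a file under the async NFS mount is left as an assumption, and nothing here was run in the thread:

```shell
#!/bin/sh
# Write 4 MiB in 32 KiB records, buffered, with a single flush at the end.
# conv=fsync makes dd call fsync() on the output file once before it exits.
out=$(mktemp)
dd if=/dev/zero of="$out" bs=32k count=128 conv=fsync 2>/dev/null

# Confirm everything landed: 128 records * 32768 bytes.
size=$(wc -c < "$out" | tr -d ' ')
echo "wrote $size bytes"
rm -f "$out"
```

An application that needs durability on an async mount can do the same thing with its own fsync() call at the points where it matters, rather than paying for a flush on every write.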
> sync on export should not be necessary - server must persist data when client explicitly requests it and as long as client does not request it, it probably does not care anyway.
IIUC sync is, apparently, the server default.

This was helpful:

    http://serverfault.com/questions/499174/etc-exports-mount-option/500553#5005...

LT
participants (3)
- Andrei Borzenkov
- Larry Finger
- lyndat3@your-mail.com