[opensuse] NFS Client freezes at large file transfers
Hi,
my NFS client machines (openSUSE 12.1) freeze if I try to copy large files to the NFS server (openSUSE 11.4). It seems that the /proc filesystem blocks somewhere, and user processes freeze as a result. The files are ~10GB, which is bigger than RAM+swap (=8GB). It also happens if I generate a 10GB file with dd on the NFS-mounted filesystem. Can someone confirm this? Or does anyone have a solution?
Thanks!
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
On 2012-06-13 14:03, Florian Gleixner wrote:
> It also happens if I generate a 10GB file with dd on the NFS-mounted filesystem.
Depending on how you use dd, it will allocate a lot of RAM up front:
dd if=/dev/zero of=somefile bs=2G
will use 2 GiB of RAM before writing a single byte to disk.
--
Cheers / Saludos,
Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar)
On 06/13/2012 02:11 PM, Carlos E. R. wrote:
> Depending on how you use dd, it will allocate a lot of RAM up front:
> dd if=/dev/zero of=somefile bs=2G
> will use 2 GiB of RAM before writing a single byte to disk.
The dd bs was set to 1M, and using cp or mv on a large enough file triggers the bug too. Thanks anyway!
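For readers who want to repeat the dd variant of the test, here is a minimal sketch. It assumes GNU dd; the function name write_test and the mount point /mnt/nfs are made up for illustration and do not come from the thread. With bs=1M, dd's own buffer stays at 1 MiB no matter how large the file gets, so dd itself cannot be what exhausts memory.

```shell
#!/bin/sh
# Write SIZE_MB mebibytes of zeros to TARGET in 1 MiB blocks.
# Reading from /dev/zero keeps the local disk out of the picture.
# write_test is a hypothetical helper name, not a standard tool.
write_test() {
    target=$1
    size_mb=$2
    dd if=/dev/zero of="$target" bs=1M count="$size_mb" 2>/dev/null
}

# Example call (hypothetical NFS mount point), 10240 MiB = 10 GiB:
#   write_test /mnt/nfs/testfile 10240
```

Pointing the same call at a local directory first gives a baseline for comparison.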
On Wed, Jun 13, 2012 at 8:03 AM, Florian Gleixner wrote:
> my NFS client machines (openSUSE 12.1) freeze if I try to copy large files to the NFS server (openSUSE 11.4).
It is hard to say why things like this happen. Here's one experience I've had.
If the network is weak and can't handle a massive influx of packets, sending a few large packets is sometimes a better idea. Mount your NFS partition using large rsize/wsize settings (say, 1MB) and see if that helps.
Boris.
On 06/13/2012 04:03 PM, Boris Epstein wrote:
> Mount your NFS partition using large rsize/wsize settings (say, 1MB) and see if that helps.
The network is switched Gbit, with no signs of errors in the interface counters. It also happens if I write the file from my laptop over a WLAN connection. rsize and wsize are at the default: 262144 bytes. Before, I had them at 32k, and then I realized that the default is already higher. The freezes are still there, so I think setting these to 1M will not really help. It also happens with both NFSv3 and NFSv4.
Thanks anyway
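The rsize/wsize values actually in effect can be read from the mount options. A minimal sketch, assuming a Linux client where /proc/mounts lists NFS mounts with rsize= and wsize= among their options; the function name nfs_io_sizes is invented for illustration:

```shell
#!/bin/sh
# Print "mountpoint rsize wsize" for every NFS mount found in a
# /proc/mounts-style file (defaults to /proc/mounts itself).
# Fields there are: device, mountpoint, fstype, options, dump, pass.
nfs_io_sizes() {
    awk '$3 ~ /^nfs4?$/ {
        rs = "?"; ws = "?"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            if (opts[i] ~ /^rsize=/) rs = substr(opts[i], 7)
            if (opts[i] ~ /^wsize=/) ws = substr(opts[i], 7)
        }
        print $2, rs, ws
    }' "${1:-/proc/mounts}"
}
```

To try a different size, the options go on the mount command, for example `mount -t nfs -o rsize=1048576,wsize=1048576 server:/export /mnt/nfs` (server and paths are placeholders). The server may negotiate a smaller value, so it is worth checking the result afterwards.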
On 06/13/2012 08:21 AM, Florian Gleixner wrote:
> The network is switched Gbit, with no signs of errors in the interface counters. [...] The freezes are still there, so I think setting these to 1M will not really help. It also happens with both NFSv3 and NFSv4.
It doesn't much matter what the network speed is. What matters most, for file transfers at least, is block size. If the block sizes aren't right, then block transfers like the ones dd uses will slow down to the speed of deep-frozen molasses. IIRC, NFS also transfers blocks.
Someone else had a problem similar to yours some years ago. I don't recall the details, but the gist was that he found that the blocks being transferred had to be made smaller. If I recall correctly, in his case he was using dd to transfer a multi-GB disk image to a remote disk, and the default block size turned out to be much larger than the MTU of the network connection. Initially everything looked fine, but the transfer rate quickly slowed, almost exponentially. By reducing the block size to fit within the MTU, allowing for overhead, he was able to get transfer speeds up to at least a reasonable level. In his case, the optimum block size was a bit smaller than the MTU size.
IIRC, further analysis indicated that this was due to a flaw in early network hardware design that everyone else simply imitated. It apparently can't handle too many relatively huge blocks efficiently, or some such. It was assumed that network hardware designers would realize this and fix it. Perhaps it hasn't been fixed after all.
So you might try using smaller blocks for huge files. Increasing the MTU apparently makes things worse.
jd
On 06/13/2012 07:11 PM, j debert wrote:
> So you might try using smaller blocks for huge files. Increasing the MTU apparently makes things worse.
To make the problem clearer:
- The initial problem occurred when using cp or mv, not dd. I use dd only to prove that the local disk is not part of the problem, by reading from /dev/zero.
- The system seems to freeze totally for the duration of the transfer. The transfer does not slow down, but I cannot use Firefox, for example. In extreme cases the mouse movements freeze! But I can log in via ssh, so the system still works, except for processes that use /proc (as seen with strace).
Block size changes the transfer speed, but it does not change the freeze.
On 06/13/2012 05:11 PM, Florian Gleixner wrote:
> The system seems to freeze totally for the duration of the transfer. The transfer does not slow down, but I cannot use Firefox, for example.
> Block size changes the transfer speed, but it does not change the freeze.
Ah! Got it now! And your problem seems a little like the one I have with disk I/O.
Come to think of it, there seems to be a trend of block transfers over NFS slowing down since the early 2.4 kernels. At one time it was suddenly incredibly slow, especially on old systems, and that was fixed rather quickly. But large transfers are still very slow. I've avoided huge transfers since 2.6 was released.
The kernel folks would want extensive and absolute proof, and not being a kernel person, I wouldn't have any idea how to provide that, let alone the time and resources. But perhaps this will sufficiently annoy someone who is able to pursue this.
jd
On 06/14/2012 02:11 AM, Florian Gleixner wrote:
> The system seems to freeze totally for the duration of the transfer. [...] But I can log in via ssh, so the system still works, except for processes that use /proc (as seen with strace).
I tried to reproduce the problem, but instead of using eth0, I used the lo interface, which avoids potential NIC driver problems.
$ mkdir /tmp/test
$ truncate -s10G /tmp/test/file
$ du -h /tmp/test/file ; du -h --apparent-size /tmp/test/file
0 /tmp/test/file
10G /tmp/test/file
$ echo '/tmp/test 127.0.0.1(ro,no_subtree_check)' >> /etc/exports
$ exportfs -av
exporting 127.0.0.1:/tmp/test
$ mount -t nfs 127.0.0.1:/tmp/test /mnt
$ ifconfig lo | grep RX\ bytes
RX bytes:21623669265 (20621.9 Mb) TX bytes:21623669265 (20621.9 Mb)
$ dd if=/mnt/file of=/dev/null
20971520+0 records in
20971520+0 records out
10737418240 bytes (11 GB) copied, 24.6555 s, 435 MB/s
$ ifconfig lo | grep RX\ bytes
RX bytes:32446807849 (30943.6 Mb) TX bytes:32446807849 (30943.6 Mb)
You see the 10G of data has really been transferred, but no freeze happened. My system seems to use NFS version 3.
Can you reproduce this?
Have a nice day,
Berny
On 06/14/2012 12:08 PM, Bernhard Voelker wrote:
> I tried to reproduce the problem, but instead of using eth0, I used the lo interface, which avoids potential NIC driver problems.
> [...]
> Can you reproduce this?
Nice idea to avoid network-related issues, but the problem is not reading from NFS but writing to it. I did a write test to a localhost NFS server yesterday, and it froze after writing ~150MB. I could not recover from the freeze, and after a reboot I could not reproduce the issue. I'll do further tests if the weather lets me :-)
Thanks!
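The exact commands of that write test are not given in the thread. As a rough sketch, a write variant of the loopback test could look like the following. The rw and no_root_squash export options, the default paths, and the function name loopback_write_plan are all assumptions made for illustration. The function only prints the steps, so nothing is changed until its output is deliberately run as root.

```shell
#!/bin/sh
# Print the steps of a loopback NFS write test instead of running them.
# Differences from a read-only test: the export is rw, no_root_squash
# lets root on the client write to a root-owned directory, and dd
# writes zeros to the mounted share instead of reading from it.
# loopback_write_plan is a hypothetical helper name.
loopback_write_plan() {
    export_dir=${1:-/tmp/test}
    mount_point=${2:-/mnt}
    size_mb=${3:-10240}
    cat <<EOF
mkdir -p $export_dir
echo '$export_dir 127.0.0.1(rw,no_root_squash,no_subtree_check)' >> /etc/exports
exportfs -av
mount -t nfs 127.0.0.1:$export_dir $mount_point
dd if=/dev/zero of=$mount_point/file bs=1M count=$size_mb
EOF
}
```

After reviewing the printed steps, they can be piped to a root shell on a scratch machine. The line appended to /etc/exports has to be removed by hand afterwards.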
On 06/15/2012 10:01 AM, Florian Gleixner wrote:
> I did a write test to a localhost NFS server yesterday, and it froze after writing ~150MB. I could not recover from the freeze, and after a reboot I could not reproduce the issue.
That means it's related to NFS on your host, not to networking. On my PC, I don't have problems writing 3GB to a local NFS share.
Nothing in the logs?
Have a nice day,
Berny
On 15.06.2012 10:01, Florian Gleixner wrote:
> I did a write test to a localhost NFS server yesterday, and it froze after writing ~150MB. I could not recover from the freeze, and after a reboot I could not reproduce the issue.
Just an idle question: the server doesn't use the aacraid module, does it? Some years ago I had to redesign the storage of a server because the driver failed reliably during large file transfers. In the end I had to replace the RAID controller.
Sandy
j debert wrote:
> Someone else had a problem similar to yours some years ago. [...] he was using dd to transfer a multi-GB disk image to a remote disk, and the default block size turned out to be much larger than the MTU of the network connection. Initially everything looked fine, but the transfer rate quickly slowed, almost exponentially. By reducing the block size to fit within the MTU, allowing for overhead, he was able to get transfer speeds up to at least a reasonable level. In his case, the optimum block size was a bit smaller than the MTU size.
This is nonsense. The network stack is very efficient at fragmenting and reassembling packets.
> IIRC, further analysis indicated that this was due to a flaw in early network hardware design that everyone else simply imitated.
At what date? This certainly wasn't true by the late 1900s.
> It apparently can't handle too many relatively huge blocks efficiently, or some such. It was assumed that network hardware designers would realize this and fix it. Perhaps it hasn't been fixed after all.
> So you might try using smaller blocks for huge files. Increasing the MTU apparently makes things worse.
IMHO, just ignore these suggestions.
Florian Gleixner wrote:
> my NFS client machines (openSUSE 12.1) freeze if I try to copy large files to the NFS server (openSUSE 11.4).
Use split, copy the parts to the NFS server, and then put them back together with cat:
cat filepart.1 >> filepart.0
rm filepart.1
cat filepart.2 >> filepart.0
rm filepart.2
. . .
cat filepart.n >> filepart.0
rm filepart.n
mv filepart.0 file
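The split-and-reassemble steps above can be wrapped in a small function. This is only a sketch under a few assumptions: GNU split, a non-empty source file, and no leftover files matching the part prefix in the destination. The name copy_in_parts and the 1G default chunk size are made up here. Note that split names its parts with letter suffixes (aa, ab, ...) rather than the numeric filepart.N names used above.

```shell
#!/bin/sh
# Copy SRC into DEST_DIR in CHUNK-sized pieces, then join the pieces
# in place and compare checksums. copy_in_parts is a hypothetical name.
copy_in_parts() {
    src=$1
    dest_dir=$2
    chunk=${3:-1G}
    name=$(basename "$src")
    prefix="$dest_dir/$name.part."
    # split writes the pieces straight into the destination directory.
    split -b "$chunk" "$src" "$prefix" || return 1
    first=
    # The glob expands in sorted order, which matches split's suffixes.
    for part in "$prefix"*; do
        if [ -z "$first" ]; then
            first=$part          # the first piece becomes the base file
        else
            cat "$part" >> "$first" || return 1
            rm "$part"
        fi
    done
    mv "$first" "$dest_dir/$name" || return 1
    # Succeed only if source and reassembled copy have the same checksum.
    [ "$(cksum < "$src")" = "$(cksum < "$dest_dir/$name")" ]
}
```

A call such as `copy_in_parts bigfile.img /mnt/nfs 1G` (paths are placeholders) returns non-zero if any step fails or the checksums differ. Whether chunking actually avoids the freeze is not confirmed anywhere in the thread.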
On 13.06.2012 14:03, Florian Gleixner wrote:
> my NFS client machines (openSUSE 12.1) freeze if I try to copy large files to the NFS server (openSUSE 11.4).
Yeah, this problem has been around for some time. I first saw it in 11.3, and it has been there ever since. Even worse, the NFS server became completely unresponsive as well (which might be the reason for the client behavior).
Back then, I read about a race condition involving swap, and that this is actually a bug that appeared in an early 2.6 kernel version.
Well, my solution is to use rsync for files larger than the NFS server's memory.
@opensuse: This should really be looked at!
Cheers!
participants (9)
- Bernhard Voelker
- Boris Epstein
- Carlos E. R.
- Dave Howorth
- Dirk Gently
- fishdude
- Florian Gleixner
- j debert
- Sandy Drobic