[opensuse] Data corruption with 3.4 Kernel from Tumbleweed repo
Just a word of caution in case anyone else is boldly tumbling through Tumbleweed. I updated to kernel-default-3.4.0-x from the Tumbleweed repository yesterday. The machines are primarily used as NFS servers, and the NFS exports are mounted for use by VMware vSphere hosts. Today there was massive disk corruption on all files being written to the NFS shares. Not sure if anyone else is seeing this, but it is something to keep in mind before using the Tumbleweed repo.

--
--Moby
They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety. -- Benjamin Franklin

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
* Moby
Just a word of caution in case anyone else is boldly tumbling through tumbleweed. I updated to kernel-default-3.4.0-x from the tumbleweed repository yesterday. The machines are primarily used as "nfs servers" and the nfs exports are mounted for use by vmware vSphere hosts. Today, there was massive disk corruption on all files being written to the nfs shares. Not sure if anyone else is seeing this but just something to keep in mind before using the Tumbleweed repo.
I have been running 3.4.x for two days now w/o a recognized problem, and I just checked my nfs links. ???

--
(paka)Patrick Shanahan Plainfield, Indiana, USA HOG # US1244711
http://wahoo.no-ip.org Photo Album: http://wahoo.no-ip.org/gallery2
http://en.opensuse.org openSUSE Community Member
Registered Linux User #207535 @ http://linuxcounter.net
On 05/26/2012 07:33 AM, Patrick Shanahan wrote:

I am running 3.4.x for two days now w/o a recognized problem and just checked my nfs links. ???

The file systems on the machines exporting the NFS shares did not become corrupted; only data written to files by NFS client machines got corrupted. You are not seeing any issues?
--
--Moby
* Moby
The file systems on the machines exporting the NFS shares did not become corrupted, only data written to files by NFS client machines got corrupted. You are not seeing any issues?
That is correct. I run an rdiff backup of a genealogy system installed on my wife's Win7 box wirelessly over NFS every night, to an external USB2 disk attached to an openSUSE Tumbleweed box, and see no problem/corruption with the newer/changed files.

And I saw a post from Gregg with questions on opensuse-factory, where Tumbleweed is discussed; I suggest moving the discussion there. You may have other problems, or I may have just been "lucky". I am Irish :^)

gud luk,
--
(paka)Patrick Shanahan
Patrick Shanahan wrote:
* Moby
[05-26-12 00:04]: The machines are primarily used as "nfs servers" and the nfs exports are mounted for use by vmware vSphere hosts. Today, there was massive disk corruption on all files being written to the nfs shares.

That is correct. I run an rdiff backup of a genealogy system installed on my wife's win7 box wirelessly over nfs every night to an external usb2 disk attached to an openSUSE Tumbleweed box and see no problem/corruption with the newer/changed files.
And I saw a post from Gregg with questions on opensuse-factory where Tumbleweed is discussed. I suggest moving discussion there.

====

BTW, how is the read/write speed on NFS these days?
I haven't benchmarked NFSv4, but several years back I benchmarked Samba against NFS over TCP, and Samba was easily 2-3x faster. With Win7, I have benchmarked Samba at 119MB/s reads and 125MB/s writes (sustained LARGE linear writes). It easily beats out most file-transfer methods for max speed (I was looking for a fast way to transfer large files; other applications may have different priorities). I didn't benchmark small linear (or random) writes, so maybe NFS is better in those areas, dunno. But with NFSv4, I was wondering if it is any better, or if anyone has done some "objective", optimized tests of both (I tried UDP and TCP with NFS, as well as varying the read/write size, to find the fastest numbers).

With Samba, on my current test setup, a 16MB transfer size is optimal -- which makes a bit of sense, considering the ethernet cards are running 9000-byte packets (9014 with header) and have a local max cache of 2000 buffers -> 18,000,000 bytes -- so 16MB would be the largest power of 2 I could dump into the card's buffers at one time and let its TCP offload take care of the buffer while I read another. I.e. if I only had a 1MB buffer on the cards, or ran 1500-byte packets and the limit was still 2000 buffers, then my optimal size might be ~2MB/buffer.

It is a bit annoying, the ignorance shown by various engineers -- thinking that 4k writes are still optimal. When I asked for an increased buffer size on Tbird (and FFox), I was told that the network only transfers a max of 1500 bytes/packet, so anything more than that was a waste. Sigh... What can you say in the face of such awesome intellect?! *urk*
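The buffer arithmetic above can be sketched in a few lines. This is only a back-of-the-envelope illustration using the figures quoted in the post (2000 buffers, 9000-byte jumbo frames); nothing here is measured.

```python
# Back-of-the-envelope estimate of the largest power-of-two transfer size
# that fits in a NIC's onboard buffer pool, using the illustrative figures
# from the post above (2000 buffers x 9000-byte jumbo frames).

def max_pow2_transfer(num_buffers: int, frame_bytes: int) -> int:
    """Largest power of two <= num_buffers * frame_bytes."""
    capacity = num_buffers * frame_bytes
    return 1 << (capacity.bit_length() - 1)

jumbo = max_pow2_transfer(2000, 9000)      # pool = 18,000,000 bytes
standard = max_pow2_transfer(2000, 1500)   # pool = 3,000,000 bytes

print(jumbo // 2**20, "MiB")     # 16 MiB, matching the 16MB figure above
print(standard // 2**20, "MiB")  # 2 MiB, matching the ~2MB estimate
```

With jumbo frames the pool holds 18MB, so 16MiB is the largest power of two that fits; with standard 1500-byte frames the same buffer count only holds 3MB, giving the ~2MB figure from the post.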
On 2012-06-06 01:14, Linda Walsh wrote:
I.e. if I only had a 1M buffer on the cards, or ran 1500-byte packets and the limit was still 2000 buffers, then my optimal size might be ~2MB/buffer.
It is a bit annoying, the ignorance shown by various engineers -- thinking that 4k writes are still optimal. When I asked for an increased buffer size on Tbird (and FFox), I was told that the network only transfers a max of 1500 bytes/packet, so anything more than that was a waste. Sigh... What can you say in the face of such awesome intellect?! *urk*
I don't think you can use those large transfers over the internet. A local ethernet, I dunno.

--
Cheers / Saludos,
Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar)
On 6/5/2012 4:14 PM, Linda Walsh wrote:
It is a bit annoying, the ignorance shown by various engineers -- thinking that 4k writes are still optimal. When I asked for an increased buffer size on Tbird (and FFox), I was told that the network only transfers a max of 1500 bytes/packet, so anything more than that was a waste.
Once it gets off your network, those assumptions are probably true. There was a series of articles on bufferbloat that brought this issue to light with regard to connections over the internet. http://gettys.wordpress.com/2010/12/03/introducing-the-criminal-mastermind-b...

You might be able to claim that a 16MB transfer size is optimal in your environment, but all it takes is replacing one switch in your network to have that idea shattered. Getting data to your Ethernet card's buffer is but the first step, and it is a step handled by a processor with a lot of horsepower, backed by a lot of RAM. From then on, stuff gets handled by devices with vastly smaller buffers and vastly weaker processors.

--
_____________________________________
---This space for rent---
Carlos E. R. wrote:
I don't think you can use those large transfers over the internet. A local ethernet, I dunno.
***Usually***, one uses NFS/SMB on a local network. For the internet, one uses HTTP/FTP/SSH, etc. While it is possible to configure NFS/SMB to work over the internet, they were designed for *local network file sharing*. How many internet sites can you point at that allow the general public to connect/download with NFS or SMB?
John Andersen wrote:
On 6/5/2012 4:14 PM, Linda Walsh wrote:
It is a bit annoying, the ignorance shown by various engineers -- thinking that 4k writes are still optimal. When I asked for an increased buffer size on Tbird (and FFox), I was told that the network only transfers a max of 1500 bytes/packet, so anything more than that was a waste.
Once it gets off your network, those assumptions are probably true. There was a series of articles on bufferbloat that brought this issue to light with regard to connections over the internet. http://gettys.wordpress.com/2010/12/03/introducing-the-criminal-mastermind-b...
----
I read those articles. While they made some sense, I'm not sure the problem is as bad as they portray it. While you can't do 9K packets, and you might even be constrained to <1500 bytes if you are running IPv6 or in a VPN, it's still the case that a LARGE buffer of 256K-1M, allowing a TCP window of similar size, is absolutely necessary for most network applications to get any speed.

Look at your ping times to, say, google or youtube. I get relatively fast times, I think -- I've seen a lot worse: >50ms on ISDN, 30-40ms on DSL; I'm seeing low-to-mid 20s on my cable. 20ms/packet means that if you only send 1 packet out at a time (1500 bytes max), you get 50 packets in 1 second: 50*1500 = 75,000 bytes (/1024) = 73.2KB/s. Most people would say that 73KB/s sucks as a download speed. 7-8 years ago, commercial speeds in Europe to the home were up to 10-15Mb, or 1.25-1.8MB/s (25 times the 1-packet-per-ping rate). Right now I am in a slowish area and can't pay premium rates, but I get up to 22.5Mb down, which, NOT counting overhead, would get me about 2.5-2.8MB/s -- about 35x what a relatively low ping time allows at one packet per round trip.

Now, I know people out there who get much faster connections: 60-120MB/s... call it 73.2MB/s for round numbers. That's 1000 times faster than what you would get if you only sent 1 packet at a time and waited for its acknowledgment. To get those speeds, you have to be willing to send out 1000 packets, or 1,500,000 bytes -- let's round: 1.5MB buffers, allowing for a 1.5MB sliding window. So even over the internet, if you have a FAST connection, 2MB buffers wouldn't be unreasonable. At my piddly rate, a 0.5MB buffer would be enough. This is the main reason why ISPs and others HAVE buffers -- because of lame app writers, as mentioned earlier. If the ISPs waited for the other end to acknowledge receipt on 4K writes, I'd get a 30th of my possible performance. (This is a major reason why NFS and SMB are NOT used for wide-area networks.)
Lame app writers are more of a problem than bufferbloat. Check out your network activity with wireshark, and see how many apps use baby-writes. It was only a few years ago that the linux 'cp' program still limited itself to 4k writes; I think it has improved. Now, if you are writing to a LOCAL hard disk -- just one new hard disk can exceed 100MB/s -- fragmentation and small writes can REALLY kill performance. If you are running a RAID or an SSD, speeds of 400-1024MB/s are not uncommon. You can't afford inter-packet latency -- at all!

Does that explain why buffers and read/write sizes NEED to be 1MB or more for most applications? NOT that most apps can USE that (I mean, how much space is consumed by an email sig?)... but for optimal speeds, larger buffers and I/O sizes are vital.
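The throughput arithmetic above is the classic bandwidth-delay-product argument: a sender that keeps only one window of data in flight can never exceed window/RTT. A minimal sketch, using the illustrative 20ms RTT and window sizes from the post:

```python
# A windowed sender's throughput is bounded by window_bytes / RTT
# (the bandwidth-delay product argument made in the post above).
# The RTT and window figures are the illustrative ones from the post.

def max_throughput_bytes_per_s(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput for a given in-flight window and RTT."""
    return window_bytes / rtt_seconds

RTT = 0.020  # 20 ms ping time

# One 1500-byte packet in flight at a time: ~73 KB/s.
one_packet = max_throughput_bytes_per_s(1500, RTT)
print(round(one_packet / 1024, 1), "KB/s")  # -> 73.2 KB/s

# A 1.5MB sliding window (1000 packets in flight): 1000x faster.
big_window = max_throughput_bytes_per_s(1000 * 1500, RTT)
print(round(big_window / 1e6, 1), "MB/s")   # -> 75.0 MB/s (decimal MB)
```

The same formula run in reverse gives the window a link needs: a 100Mb/s connection at 20ms RTT needs roughly 12.5MB/s x 0.02s = 250KB in flight just to stay full.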
On 2012-06-08 03:31, Linda Walsh wrote:
Carlos E. R. wrote:
I don't think you can use those large transfers over the internet. A local ethernet, I dunno.

***Usually***, one uses NFS/SMB on a local network. For the internet, one uses HTTP/FTP/SSH, etc.
While it is possible to configure NFS/SMB to work over the internet, they were designed for *local network file sharing*.
How many Internet sites can you point at that allow the general public to connect/download with NFS or SMB?
Does the ethernet card's internal cache consider that it is transmitting data for NFS? That's another layer; it doesn't know. The cache would be the same regardless of the application.

--
Cheers / Saludos,
Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar)
Linda Walsh wrote:
Carlos E. R. wrote:
I don't think you can use those large transfers over the internet. A local ethernet, I dunno.
***Usually***, one uses NFS/SMB on a local network.
For the internet, one uses HTTP/FTP/SSH, etc.
While it is possible to configure NFS/SMB to work over the internet, they were designed for *local network file sharing*.
Just not true. See http://tools.ietf.org/html/rfc1094 -- complex failures in routed networks, configuration parameters for low-speed links, etc. It is specifically designed to run over just about any kind of network. (BTW, the RFC also discusses buffer sizes, and in particular that it is expected that the IP layer will do packet fragmentation and reassembly.)

The problem with use on a public network was security and uid/gid management. For use over wide-area links, another problem for most was lack of bandwidth. But it was certainly used from the early days over wide-area private networks in banks and other financial institutions. I remember setting up systems with servers in London and clients in New York. We 'stole' bandwidth from the private telephone connection, since there was so much more phone bandwidth than data bandwidth available. Kind of ironic that I have an Ethernet phone on my desk now.
How many Internet sites can you point at that allow the general public to connect/download with NFS or SMB?
A complete non sequitur. Current-day implementation statistics say nothing at all about the original design. How many cars use wheels carved from stone?
On 6/7/2012 6:31 PM, Linda Walsh wrote:
Carlos E. R. wrote:
I don't think you can use those large transfers over the internet. A local ethernet, I dunno.
***Usually***, one uses NFS/SMB on a local network.
For the internet, one uses HTTP/FTP/SSH, etc.
While it is possible to configure NFS/SMB to work over the internet, they were designed for *local network file sharing*.
How many Internet sites can you point at that allow the general public to connect/download with NFS or SMB?
But you've missed my main point! The buffer size in your local NIC has nothing whatsoever to do with the buffer size in every switch, hub, and router between your server and the workstation. Even if you successfully force your NIC to use very large packets (which, by the way, is a totally separate issue from the size of the buffer memory on the card), you typically have NO control over the packet size on the switches, routers, and hubs between server and workstation. Unless you can force those network components to large packet sizes, you gain nothing. In fact, you force a vastly weaker processor (in the switch) to re-packetize these huge packets which you postulate you can induce your NIC to transmit.
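The point above can be illustrated with a toy calculation: the on-the-wire packet count for a transfer is set by the smallest MTU along the path, not by whatever frame size the sending NIC is configured for. The MTU values below are the usual 9000 (jumbo) and 1500 (standard ethernet) figures, and the 16MB transfer size is the one discussed earlier in the thread; this is an idealized sketch that ignores per-fragment header overhead.

```python
# Toy illustration: a transfer is fragmented down to the smallest MTU
# along the path, so one standard-MTU hop erases the benefit of jumbo
# frames on the sender. Idealized (ignores per-fragment headers).

from math import ceil

def packets_on_path(transfer_bytes: int, path_mtus: list[int]) -> int:
    """Packets crossing the narrowest hop after fragmentation."""
    return ceil(transfer_bytes / min(path_mtus))

transfer = 16 * 2**20  # the 16MB transfer size discussed above

# Jumbo frames end-to-end:
print(packets_on_path(transfer, [9000, 9000, 9000]))  # -> 1865

# One standard-MTU switch in the middle: ~6x more packets on the wire.
print(packets_on_path(transfer, [9000, 1500, 9000]))  # -> 11185
```

In practice, modern TCP stacks avoid mid-path fragmentation via path-MTU discovery, which clamps the sender to the path minimum; either way, the narrowest hop wins.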
On 26/05/12 00:02, Moby wrote:
Just a word of caution in case anyone else is boldly tumbling through tumbleweed. I updated to kernel-default-3.4.0-x from the tumbleweed repository yesterday. The machines are primarily used as "nfs servers" and the nfs exports are mounted for use by vmware vSphere hosts. Today, there was massive disk corruption on all files being written to the nfs shares. Not sure if anyone else is seeing this but just something to keep in mind before using the Tumbleweed repo.
Please file a critical bug report against the kernel; that should not happen. Data corruption is a major-ass bug.
participants (7)
- Carlos E. R.
- Cristian Rodríguez
- Dave Howorth
- John Andersen
- Linda Walsh
- Moby
- Patrick Shanahan