https://bugzilla.novell.com/show_bug.cgi?id=881912

https://bugzilla.novell.com/show_bug.cgi?id=881912#c0

Summary: Soft lockup in utimes_common on ext3 with quota
Classification: openSUSE
Product: openSUSE 13.1
Version: Final
Platform: i686
OS/Version: openSUSE 13.1
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: jimc@math.ucla.edu
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Kernel version: kernel-default-3.14.1-24.3.geafcebd.i586 (Tumbleweed)
Rest of OS: openSUSE 13.1
Affected filesystem: ext3 on RAID-5 (Megaraid) accessed locally, with user (but not group) quotas.

Some of our users have bloated system mailboxes, 500 Mbyte and up, and configure their mailers to expunge (via IMAP) on every deletion. Each expunge copies a ridiculous amount of data, and this happens frequently, whereupon they complain about sluggish response. The IMAP daemon (WU-imap) accesses the mailbox file locally. Nothing ever accesses the mailbox over NFS, although unrelated files in the same filesystem (non-SuSE software) may be read by clients via NFS; this is not much traffic. Home directories, in other filesystems, are served read-write to clients. Also, user sessions on this machine may access files on other hosts by NFS, usually reading, sometimes writing. All mounting is done by autofs.

Formerly the clients mounted using Nfsvers=3 in /etc/nfsmount.conf, with hard mounts (and similarly for this server when mounting other hosts as a client). We switched to Nfsvers=4 and soft mounts. Very soon thereafter, we started getting soft lockups. The lockup was always in utime(2) called from imapd, in code where a spinlock is plausible. But it must be noted that almost all of the I/O activity and almost all of the calls to utime(2) were done by imapd.
After five of these lockups over 2 days we reverted to the status quo ante (Nfsvers=3, hard mounts), and no further soft lockups were seen over the next 2 days; however, it was the weekend and the imapd activity was quite a bit less. If another soft lockup is seen, I'll post it (and most likely we will go back to Nfsvers=4 and soft mounts).

I really doubt that the hard vs. soft mount issue is relevant here, but the NFS versions use completely different code, which could be very relevant. On the other hand, the affected files are never accessed by NFS; the bulk of NFS activity is on filesystems other than this one.

Why are we running the Tumbleweed kernel? See bug 880599; a fix for that problem is in the newer kernel. But the distro's standard kernel receives fixes that Tumbleweed doesn't (or it looks that way), so if bug 880599 were resolved we would revert to kernel 3.11.10.

Since I can't make the soft lockup happen on command, and since it will be hard to test patched kernels on the production system, I don't have much hope of making progress on suppressing the soft lockups. I'm mainly posting this bug report to have it on record that something is happening, in case someone else (or we) discovers a way to test fixes efficiently, or spots a coding error in utimes(2). Also, this issue is off topic for bug 880599, but I plan to insert a cross-reference there, in case there actually is some connection.

(Note to readers: a soft lockup is reported when kernel code keeps a CPU busy for too long without yielding, typically because of an infinite loop. A very common culprit is spinning on a spinlock that the holder fails to release.)
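
For anyone trying to provoke the lockup on a test machine, the workload described in this report (a large file rewritten, then its timestamps set) can be roughly approximated with a short script. This is only a minimal sketch and not a known reproducer: the function name stress_utime, its parameters, and the default sizes are hypothetical, it does not imitate what imapd actually does in detail, and it has not been shown to trigger the soft lockup.

```python
import os
import time


def stress_utime(path, iterations=1000, chunk_bytes=1 << 20):
    """Rewrite `path` and set its timestamps, `iterations` times.

    Hypothetical helper for illustration only. Returns the number of
    completed iterations.
    """
    payload = b"x" * chunk_bytes
    done = 0
    for _ in range(iterations):
        # Rewrite the whole file, loosely imitating an expunge that
        # copies the mailbox contents.
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        # Set atime and mtime explicitly, so each pass makes one
        # timestamp-setting system call on the file.
        now = time.time()
        os.utime(path, (now, now))
        done += 1
    return done
```

To use it, call stress_utime() on a scratch file that lives on an ext3 filesystem with user quotas enabled, preferably on a non-production host, and watch the kernel log for soft lockup messages. Running several copies in parallel as the same user may come closer to the contention seen with imapd, but that is an assumption.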