[Bug 475998] New: smbd hangs at 100% CPU and is unkillable
https://bugzilla.novell.com/show_bug.cgi?id=475998

Summary: smbd hangs at 100% CPU and is unkillable
Classification: openSUSE
Product: openSUSE 11.1
Version: Final
Platform: i686
OS/Version: openSUSE 11.1
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Samba
AssignedTo: samba-maintainers@SuSE.de
ReportedBy: markus@gaugusch.at
QAContact: samba-maintainers@SuSE.de
Found By: ---

User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.0.5) Gecko/2008121300 SUSE/3.0.5-1.1 Firefox/3.0.5

I have installed a new Server with openSUSE 11.1. It is also using Samba (member of an AD domain). Every 1-2 weeks, I experience an unkillable smbd process, utilizing 100% CPU (only one core, the other is still idle).

I found the following post on LKML which may be the cause: http://lkml.org/lkml/2008/12/15/109 (bug in 2.6.27.x where x<10, causes unkillable processes when using inotify)

The machine cannot be rebooted cleanly sometimes (hangs during shutdown). This is a major bug, but hopefully I have already provided the solution :)

Reproducible: Sometimes

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
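The version range named in the report (2.6.27.x with x < 10) can be checked against a `uname -r` string. The following is only a minimal sketch: the function names are hypothetical, and the boundary is the reporter's reading of the LKML post, not a verified list of affected kernels.

```python
import re
from typing import Tuple

# Assumption: the affected range (2.6.27.x with x < 10) is the reporter's
# reading of the LKML post; it is not a verified boundary.
AFFECTED_SERIES = (2, 6, 27)
FIRST_UNAFFECTED_PATCH = 10


def parse_release(release: str) -> Tuple[int, ...]:
    """Return the leading dotted numeric part of a `uname -r` string."""
    match = re.match(r"\d+(?:\.\d+)*", release)
    if match is None:
        raise ValueError("no version number in %r" % release)
    return tuple(int(part) for part in match.group(0).split("."))


def possibly_affected(release: str) -> bool:
    """True if the release falls in the range named in the report."""
    version = parse_release(release)
    if version[:3] != AFFECTED_SERIES:
        return False
    # A bare "2.6.27" has no fourth component; treat it as patch level 0.
    patch = version[3] if len(version) > 3 else 0
    return patch < FIRST_UNAFFECTED_PATCH
```

On a live system the argument could come from `platform.release()`, which returns the same string as `uname -r` on Linux.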
https://bugzilla.novell.com/show_bug.cgi?id=475998

User jmcdonough@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c1

James McDonough <jmcdonough@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |samba-maintainers@SuSE.de
         AssignedTo|samba-maintainers@SuSE.de   |gregkh@novell.com

--- Comment #1 from James McDonough <jmcdonough@novell.com> 2009-02-15 06:10:07 MST ---
Greg, does this make sense that it could be this same bug?
https://bugzilla.novell.com/show_bug.cgi?id=475998

User gregkh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c2

Greg Kroah-Hartman <gregkh@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |markus@gaugusch.at

--- Comment #2 from Greg Kroah-Hartman <gregkh@novell.com> 2009-02-15 11:09:19 MST ---
Yes, we have had this reported before. Can you please try the latest kernel update for opensuse 11.1 and tell us if you still have the problem there? It should already be resolved.
https://bugzilla.novell.com/show_bug.cgi?id=475998

User markus@gaugusch.at added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c3

Markus Gaugusch <markus@gaugusch.at> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
      Info Provider|markus@gaugusch.at          |

--- Comment #3 from Markus Gaugusch <markus@gaugusch.at> 2009-02-15 13:46:58 MST ---
Hi, I'm using kernel-pae-2.6.27.7-9.1 with the following uname info:

Linux ds9 2.6.27.7-9-pae #1 SMP 2008-12-04 18:10:04 +0100 i686 i686 i386 GNU/Linux

If I'm not wrong, this is the original kernel. I can't find any kernel updates for opensuse 11.1 at http://download.opensuse.org/update/11.1/rpm/i586/ (or i686), so I think this is the latest kernel.
https://bugzilla.novell.com/show_bug.cgi?id=475998

User kristjan.eentsalu@err.ee added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c4

Kristjan Eentsalu <kristjan.eentsalu@err.ee> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kristjan.eentsalu@err.ee

--- Comment #4 from Kristjan Eentsalu <kristjan.eentsalu@err.ee> 2009-02-17 00:52:23 MST ---
Hello,

I'm experiencing the same problem (Samba is a domain member). I'm using kernel 2.6.27.8-11-xen (I had the same problem with 2.6.27.7-9-xen) from http://download.opensuse.org/repositories/Kernel:/SL111_BRANCH/openSUSE_11.1... and I have to destroy my xen domU because reboot does not work, it just hangs.

I only discovered that the culprit was smbd when I changed my domU conf to use 2 virtual cpus: smbd took 100% and the other virtual cpu was idle. With one virtual cpu the domU just hangs with this bug. I can ping it, but I cannot access my domU, either with ssh or with "xm console mydomU", and "xm top" shows that the domU is using 100% cpu. There is nothing in the domU's logs either, it just stops....

I even tried Samba 3.3 from http://download.opensuse.org/repositories/network:/samba:/TESTING/openSUSE_1... and still have the same problem. So this bug is a real show stopper.

I also found another person with the same problem on the samba mailing list: http://lists.samba.org/archive/samba-technical/2009-February/063217.html

Kristjan
https://bugzilla.novell.com/show_bug.cgi?id=475998

User cedric@solucionjava.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c5

Cedric Simon <cedric@solucionjava.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cedric@solucionjava.com

--- Comment #5 from Cedric Simon <cedric@solucionjava.com> 2009-02-17 18:07:28 MST ---
Bug confirmed as specified in http://lists.samba.org/archive/samba-technical/2009-February/063217.html

We tried the Samba version from the openSUSE repository, then we downgraded to version 3.2.6: same problem.

The only (quick) solution seems to be a downgrade to openSUSE 11.0.
https://bugzilla.novell.com/show_bug.cgi?id=475998

User gregkh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c6

Greg Kroah-Hartman <gregkh@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |markus@gaugusch.at

--- Comment #6 from Greg Kroah-Hartman <gregkh@novell.com> 2009-02-17 20:42:37 MST ---
(In reply to comment #3)
> If I'm not wrong, this is the original kernel. I can't find any kernel updates for opensuse 11.1 at http://download.opensuse.org/update/11.1/rpm/i586/ (or i686), so I think this is the latest kernel.
For some reason, the opensuse kernel hasn't been updated. I'll go poke people about this.

You can get the latest version in our kernel-of-the-day repository, which can be downloaded from: http://ftp.suse.com/pub/projects/kernel/kotd/SL111_BRANCH/

Please try the kernel there; this should be resolved with those updates.
https://bugzilla.novell.com/show_bug.cgi?id=475998

User andrew@silverstream.net.nz added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c7

Andrew Walters <andrew@silverstream.net.nz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrew@silverstream.net.nz

--- Comment #7 from Andrew Walters <andrew@silverstream.net.nz> 2009-02-24 12:51:39 MST ---
I've experienced this bug as described above too. Initially it tied up 100% of 1 of the 8 cores on this server. Within 18 hours, the server was unreachable.

Will add info once I've got physical access to the affected machine.

Thanks,
Andrew
https://bugzilla.novell.com/show_bug.cgi?id=475998

User andrew@silverstream.net.nz added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c8

--- Comment #8 from Andrew Walters <andrew@silverstream.net.nz> 2009-02-24 14:56:19 MST ---
Hi again,

Nothing in the logs of said server prior to the crash that seems to add any useful info, sorry. The logs just stop dead. I'm guessing runaway unkillable smbd processes took out each core one by one before locking up the server (4x Intel Xeon E5420 (Dual Core 2.5GHz), 8 cores total).

First up, it appears that there is no SL111_BRANCH as linked above. I see SL11_BRANCH and SLE11_BRANCH, but nothing (apparently) for 11.1. Do we use one of the other ones? Or, is there an update on when the updated kernel will be released?

TIA
Andrew
https://bugzilla.novell.com/show_bug.cgi?id=475998

User andrew@silverstream.net.nz added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c9

--- Comment #9 from Andrew Walters <andrew@silverstream.net.nz> 2009-02-24 14:57:34 MST ---
sorry correction to above server specs, it's 2x quad core xeon, not 4x dual core.
https://bugzilla.novell.com/show_bug.cgi?id=475998

User cedric@solucionjava.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c10

--- Comment #10 from Cedric Simon <cedric@solucionjava.com> 2009-02-24 15:13:53 MST ---
We migrated to kernel 2.6.27.17-SL111_BRANCH_20090216170504_954248e4-pae downloaded from http://ftp.suse.com/pub/projects/kernel/kotd/SL111_BRANCH/

Works fine. No more problems with Samba!

Please note the compiled kernels at the link are OK, but not the source --> if the source is needed (to recompile the NVidia driver, for example), it won't work.

But as I didn't need the source in prod, I am happy now!

Regards,
Cedric.
https://bugzilla.novell.com/show_bug.cgi?id=475998

User andrew@silverstream.net.nz added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c11

--- Comment #11 from Andrew Walters <andrew@silverstream.net.nz> 2009-02-24 16:51:32 MST ---
Hi Cedric,

Can you confirm the link above? It returns a 404 for me. Additionally I don't see a SL111-anything under http://ftp.suse.com/pub/projects/kernel/kotd/ or ftp://ftp.suse.com/pub/projects/kernel/kotd/.

Thanks,
Andrew
https://bugzilla.novell.com/show_bug.cgi?id=475998

User gregkh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c12

--- Comment #12 from Greg Kroah-Hartman <gregkh@novell.com> 2009-02-24 16:59:47 MST ---
The SL111 branch has now been merged with the SLE11 branch, so use that kernel instead. It will have the same fix, and will be what the updated kernel will be as well, whenever it gets out of the pipeline...
https://bugzilla.novell.com/show_bug.cgi?id=475998

User abittner@stud.fh-heilbronn.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c13

andreas bittner <abittner@stud.fh-heilbronn.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |abittner@stud.fh-heilbronn.de

--- Comment #13 from andreas bittner <abittner@stud.fh-heilbronn.de> 2009-02-25 02:49:43 MST ---
hi there,

i have the very same bug it seems, and have just found this bug and the discussion: http://www.nabble.com/Process-smbd-using-100--CPU-and-imposible-to-kill-to22...

i only have a small workgroup (win2k, winxp machines) and this samba server hosting just a simple share for all other machines to use with the same username/password credentials.

the machine was running a few days and today i see 100% load on one smbd process (dualcore machine). i can't kill smbd with kill -9 either.

smbd still answers requests it seems as of now, and smbstatus -L also displays various users and files in access.

the pid which causes 100% core load is a different pid than the ones displayed in the smbstatus -L output; it's lower. so i guess it's some earlier smbd task or the main daemon or whatever.

Linux tux 2.6.27.7-9-pae #1 SMP 2008-12-04 18:10:04 +0100 i686 athlon i386 GNU/Linux

samba-client-3.2.6-0.3.1
yast2-samba-client-2.17.11-1.30
yast2-samba-server-2.17.6-1.2
samba-3.2.6-0.3.1

please deliver a fix for this via online_update. thanks.
https://bugzilla.novell.com/show_bug.cgi?id=475998

User abittner@stud.fh-heilbronn.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c14

--- Comment #14 from andreas bittner <abittner@stud.fh-heilbronn.de> 2009-02-25 03:01:40 MST ---
btw: how do i apply the kernel of the day rpms to my system exactly? is there some howto, basic explanation of the various packages or best-practices material or something anywhere?

currently i have:
kernel-pae-2.6.27.7-9.1
kernel-pae-base-2.6.27.7-9.1
kernel-pae-extra-2.6.27.7-9.1

i suppose i need to get the three rpm files:
http://ftp.suse.com/pub/projects/kernel/kotd/SLE11_BRANCH/i586/kernel-pae-2....
http://ftp.suse.com/pub/projects/kernel/kotd/SLE11_BRANCH/i586/kernel-pae-ba...
and
http://ftp.suse.com/pub/projects/kernel/kotd/SLE11_BRANCH/i586/kernel-pae-ex...

and apply them with rpm -Uvh --force --nodeps kernel-pae*

is this the proper procedure?

i have software-raid1 on this machine (/dev/md0 and /dev/md1), with two simple sata disks. do i need to mess with initrd or something after applying with rpm?

thanks for helping.
https://bugzilla.novell.com/show_bug.cgi?id=475998

User abittner@stud.fh-heilbronn.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c15

--- Comment #15 from andreas bittner <abittner@stud.fh-heilbronn.de> 2009-02-25 04:15:08 MST ---
i can't even reboot this machine any more (remote). it's simply not killing its processes (smbd i suppose).

it has a load of 10.x by now; it had a load of 1.0 when only the smbd was causing 100% cpu/core load :(

top - 12:13:27 up 3 days, 22:49,  1 user,  load average: 10.98, 8.76, 5.74
Tasks:  86 total,   4 running,  80 sleeping,   0 stopped,   2 zombie
Cpu0  :  0.0%us,100.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  1.7%sy,  0.0%ni, 96.3%id,  2.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4101144k total,  2731008k used,  1370136k free,   145924k buffers
Swap:  2096460k total,        0k used,  2096460k free,  2161256k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
13240 root      20   0 16456 3980 3028 R  100  0.1   1145:40 smbd
    7 root      15  -5     0    0    0 R    0  0.0   0:03.48 events/0
   18 root      15  -5     0    0    0 R    0  0.0   0:04.14 kondemand/0

:( yet another huge and extremely annoying opensuse bug.
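The symptom in the top output above (one process in state R pinned at 100% CPU) can be picked out of captured process rows mechanically. This is a rough sketch only: the names `TopRow`, `parse_top_row` and `spinning` are hypothetical, the 90% threshold is an arbitrary assumption, and the parser assumes the twelve-column layout shown in this comment.

```python
from typing import List, NamedTuple


class TopRow(NamedTuple):
    pid: int
    user: str
    state: str
    cpu_percent: float
    command: str


def parse_top_row(line: str) -> TopRow:
    """Parse one process row laid out as
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND."""
    fields = line.split(None, 11)
    if len(fields) < 12:
        raise ValueError("unexpected top row: %r" % line)
    return TopRow(
        pid=int(fields[0]),
        user=fields[1],
        state=fields[7],
        cpu_percent=float(fields[8]),
        command=fields[11].strip(),
    )


def spinning(rows: List[TopRow], threshold: float = 90.0) -> List[TopRow]:
    """Return rows in the running state at or above the CPU threshold.
    The threshold is an illustrative assumption, not a diagnostic rule."""
    return [r for r in rows if r.state == "R" and r.cpu_percent >= threshold]
```

A high reading alone does not prove this bug; it only narrows down which PID to look at.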
https://bugzilla.novell.com/show_bug.cgi?id=475998

User gregkh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c16

--- Comment #16 from Greg Kroah-Hartman <gregkh@novell.com> 2009-02-25 12:16:28 MST ---
(In reply to comment #14)
> btw: how do i apply the kernel of the day rpms to my system exactly? is there some howto, basic explanation of the various packages or best-practices material or something anywhere?
> currently i have: kernel-pae-2.6.27.7-9.1 kernel-pae-base-2.6.27.7-9.1 kernel-pae-extra-2.6.27.7-9.1
> i suppose i need to get the three rpm files: http://ftp.suse.com/pub/projects/kernel/kotd/SLE11_BRANCH/i586/kernel-pae-2.... http://ftp.suse.com/pub/projects/kernel/kotd/SLE11_BRANCH/i586/kernel-pae-ba... and http://ftp.suse.com/pub/projects/kernel/kotd/SLE11_BRANCH/i586/kernel-pae-ex...
Yes.
> and apply them with rpm -Uvh --force --nodeps kernel-pae*
rpm -Uhv kernel-pae* should be all you need, no need for --force or --nodeps.
> is this the proper procedure?
> i have software-raid1 on this machine /dev/md0 and /dev/md1), with two simple sata disks. do i need to mess with initrd or something after applying with rpm?
The initrd will be regenerated automatically.
https://bugzilla.novell.com/show_bug.cgi?id=475998

User cedric@solucionjava.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c17

--- Comment #17 from Cedric Simon <cedric@solucionjava.com> 2009-02-26 09:24:29 MST ---
Kernel 2.6.27.19-3.2 was released via YaST update tonight.

As this kernel is newer than 2.6.27.17, which solved the problem, I assume the bug is still fixed and is resolved by the YaST update.
https://bugzilla.novell.com/show_bug.cgi?id=475998

User gregkh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c18

Greg Kroah-Hartman <gregkh@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
      Info Provider|markus@gaugusch.at          |
         Resolution|                            |FIXED

--- Comment #18 from Greg Kroah-Hartman <gregkh@novell.com> 2009-02-26 09:36:28 MST ---
(In reply to comment #17)
> Kernel 2.6.27.19-3.2 release via YAST update tonight.
> As this kernel is > 2.6.27.17 that solved the problem, I assume bug is still fixed and resolved by YAST update.
Yes, it should now be. I'm going to close this report out; if there are any new problems, please open new bugs.
https://bugzilla.novell.com/show_bug.cgi?id=475998

User abittner@stud.fh-heilbronn.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c19

--- Comment #19 from andreas bittner <abittner@stud.fh-heilbronn.de> 2009-02-26 11:23:23 MST ---
will online_update offer me this official kernel, as i went for the kotd one?

or do i need to download those 3 pae files for the official kernel tonight and manually apply them the same way as the kotd one?

thanks and regards.
https://bugzilla.novell.com/show_bug.cgi?id=475998

User gregkh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c20

--- Comment #20 from Greg Kroah-Hartman <gregkh@novell.com> 2009-02-26 11:30:28 MST ---
(In reply to comment #19)
> will online_update offer me this official kernel as i went for the kotd one?
> or do i need to download those 3 pae files for the official kernel tonite then and manually apply them the same way as the kotd one?
I do not know; it depends on the kernel version of the ones you downloaded and installed by hand. Either way you should be fine.
https://bugzilla.novell.com/show_bug.cgi?id=475998

User abittner@stud.fh-heilbronn.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c21

--- Comment #21 from andreas bittner <abittner@stud.fh-heilbronn.de> 2009-02-26 11:36:53 MST ---
it's up there in comment #14

rpm -aq | grep -i kernel
kernel-pae-2.6.27.19-SLE11_BRANCH_20090224184624_e3d5b110
kernel-pae-base-2.6.27.19-SLE11_BRANCH_20090224184624_e3d5b110
kernel-pae-extra-2.6.27.19-SLE11_BRANCH_20090224184624_e3d5b110

that's what the box is running atm.

thanks and regards.
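The package names in the `rpm -aq` listing above follow the name-version-release layout, and since RPM does not allow hyphens in the version or release fields, they can be split on the last two hyphens. The sketch below uses hypothetical helper names (`split_nvr`, `upstream_tuple`) and only compares the upstream version numbers; how the release strings order against each other is left to rpm itself and is not modelled here.

```python
from typing import Tuple


def split_nvr(package: str) -> Tuple[str, str, str]:
    """Split 'name-version-release' on the last two hyphens.
    Assumes the string carries no architecture suffix."""
    name, version, release = package.rsplit("-", 2)
    return name, version, release


def upstream_tuple(version: str) -> Tuple[int, ...]:
    """Turn a purely numeric dotted version such as '2.6.27.19' into a tuple."""
    return tuple(int(part) for part in version.split("."))


kotd = split_nvr("kernel-pae-2.6.27.19-SLE11_BRANCH_20090224184624_e3d5b110")
original = split_nvr("kernel-pae-2.6.27.7-9.1")

# Both packages share a name; only the version and release fields differ.
same_name = kotd[0] == original[0]
kotd_is_newer_upstream = upstream_tuple(kotd[1]) > upstream_tuple(original[1])
```

The kernel-of-the-day package and the update kernel 2.6.27.19-3.2 mentioned in comment #17 share the upstream version 2.6.27.19, so any difference between them lies in the release field alone.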
https://bugzilla.novell.com/show_bug.cgi?id=475998

User abittner@stud.fh-heilbronn.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=475998#c22

--- Comment #22 from andreas bittner <abittner@stud.fh-heilbronn.de> 2009-02-26 11:55:27 MST ---
just triggered yast2 online_update on that machine, and it offered me the new kernel entry.

the machine came back up after reboot as:

Linux tux 2.6.27.19-3.2-pae #1 SMP 2009-02-25 15:40:44 +0100 i686 athlon i386 GNU/Linux

seems to be okay. now time will tell if smbd and other processes will behave nicely and if this bug has been fixed for real :)

thanks.