[Bug 387202] New: nscd keeps crashing in mem.c
https://bugzilla.novell.com/show_bug.cgi?id=387202

Summary: nscd keeps crashing in mem.c
Product: openSUSE 11.0
Version: Factory
Platform: Other
OS/Version: Other
Status: NEW
Severity: Major
Priority: P5 - None
Component: Basesystem
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: mmarek@novell.com
QAContact: qa@suse.de
Found By: ---

Hi,

on my machine, nscd always crashes with an assertion after some time:

25033: handle_request: request received (Version = 2) from PID 25253
25033: GETFDPW
25033: provide access to FD 5, for passwd
25033: Reloading "0" in password cache!
25033: Reloading "10020" in password cache!
25033: remove GETPWBYNAME entry "mmarek"
25033: remove GETPWBYUID entry "10020"
nscd: mem.c:399: gc: Assertion `next_hash == &he[db->head->nentries]' failed.

or

25991: handle_request: request received (Version = 2) from PID 26107
25991: GETPWBYNAME (nobody)
25991: Haven't found "nobody" in password cache!
25991: Reloading "mmarek" in password cache!
25991: remove GETPWBYNAME entry "mmarek"
25991: remove GETPWBYUID entry "10020"
nscd: mem.c:392: gc: Assertion `off_alloc == off_allocend' failed.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=387202
User mmarek@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c1

--- Comment #1 from Michal Marek <mmarek@novell.com> 2008-05-06 07:27:19 MST ---
Created an attachment (id=212695)
 --> (https://bugzilla.novell.com/attachment.cgi?id=212695)
nscd log

log output from the last run. I did
rm /var/run/nscd/*
/usr/sbin/nscd -d 2>&1 | tee log-nscd
https://bugzilla.novell.com/show_bug.cgi?id=387202
User mmarek@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c4

Michal Marek <mmarek@novell.com> changed:

What       |Removed                                    |Added
----------------------------------------------------------------------------
AssignedTo |bnc-team-screening@forge.provo.novell.com  |pbaudis@novell.com

--- Comment #4 from Michal Marek <mmarek@novell.com> 2008-05-06 07:31:27 MST ---
Petr?
https://bugzilla.novell.com/show_bug.cgi?id=387202
User pbaudis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c5

Petr Baudis <pbaudis@novell.com> changed:

What          |Removed |Added
----------------------------------------------------------------------------
Status        |NEW     |NEEDINFO
Info Provider |        |mmarek@novell.com

--- Comment #5 from Petr Baudis <pbaudis@novell.com> 2008-06-25 17:30:38 MDT ---
Hmm, do you still encounter this with the 11.0 nscd?
https://bugzilla.novell.com/show_bug.cgi?id=387202
User mmarek@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c6

Michal Marek <mmarek@novell.com> changed:

What          |Removed           |Added
----------------------------------------------------------------------------
Status        |NEEDINFO          |NEW
Info Provider |mmarek@novell.com |

--- Comment #6 from Michal Marek <mmarek@novell.com> 2008-06-26 06:14:42 MDT ---
Yes.
https://bugzilla.novell.com/show_bug.cgi?id=387202
User mmarek@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c8

--- Comment #8 from Michal Marek <mmarek@novell.com> 2008-06-26 06:16:34 MDT ---
$ rpm -q nscd
nscd-2.8-15
https://bugzilla.novell.com/show_bug.cgi?id=387202
User pbaudis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c9

Petr Baudis <pbaudis@novell.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |casualprogrammer@gmail.com

--- Comment #9 from Petr Baudis <pbaudis@novell.com> 2008-06-26 18:38:11 MDT ---
*** Bug 388435 has been marked as a duplicate of this bug. ***
https://bugzilla.novell.com/show_bug.cgi?id=388435
https://bugzilla.novell.com/show_bug.cgi?id=387202
User R.Vickers@cs.rhul.ac.uk added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c10

--- Comment #10 from Bob Vickers <R.Vickers@cs.rhul.ac.uk> 2008-07-03 05:08:09 MDT ---
Created an attachment (id=225788)
 --> (https://bugzilla.novell.com/attachment.cgi?id=225788)
nscd core file

I too am seeing many nscd crashes, sometimes every few minutes, and this stops Thunderbird working. I have attached a core file: is there any other information that would be useful? I am also happy to test any fixes that might be available.

nscd is version 2.8-14.1 running on openSUSE 11.0 x86_64.

Bob
https://bugzilla.novell.com/show_bug.cgi?id=387202
User pbaudis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c11

Petr Baudis <pbaudis@novell.com> changed:

What   |Removed |Added
----------------------------------------------------------------------------
Status |NEW     |ASSIGNED

--- Comment #11 from Petr Baudis <pbaudis@novell.com> 2008-07-03 17:04:49 MDT ---
I can reproduce this myself; I just haven't figured out what the bug is yet. I'm still working on it.
https://bugzilla.novell.com/show_bug.cgi?id=387202

Marton Balint <cus@fazekas.hu> changed:

What   |Removed |Added
----------------------------------------------------------------------------
Blocks |        |266219
https://bugzilla.novell.com/show_bug.cgi?id=387202
User jnelson-suse@jamponi.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c12

Jon Nelson <jnelson-suse@jamponi.net> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |jnelson-suse@jamponi.net

--- Comment #12 from Jon Nelson <jnelson-suse@jamponi.net> 2008-07-20 07:05:52 MDT ---
Does this help?

From /var/log/nscd.log (enabled by hand):

17429: pruning services cache; time 1216524649
17429: considering GETSERVBYPORT entry "`nɑ/tcp", timeout 1216552141
17429: considering GETSERVBYPORT entry " 372^K211e^?/tcp", timeout 1216552130
17429: considering GETSERVBYPORT entry "@/خ/tcp", timeout 1216551996
17429: considering GETSERVBYPORT entry " 272L?354^?/tcp", timeout 1216552070
17429: considering GETSERVBYPORT entry " 332f8343^?/tcp", timeout 1216552142
17429: considering GETSERVBYPORT entry " 252372rU^?/tcp", timeout 1216551945
17429: considering GETSERVBYPORT entry " e5301/tcp", timeout 1216552050
17429: considering GETSERVBYPORT entry "0214247^A/tcp", timeout 1216552119
17429: considering GETSERVBYPORT entry " 312Q=i^?/tcp", timeout 1216552080
17429: considering GETSERVBYPORT entry " ^ZQ356^C^?/tcp", timeout 1216552070
17429: considering GETSERVBYNAME entry "netbios-ns/tcp", timeout 1216552463
17429: considering GETSERVBYNAME entry "bootps/udp", timeout 1216552463
17429: considering GETSERVBYPORT entry "", timeout 1216551905
17429: considering GETSERVBYPORT entry "220u=O/tcp", timeout 1216552098
17429: considering GETSERVBYPORT entry " 272237303^?^?/tcp", timeout 1216552087
17429: considering GETSERVBYPORT entry " 352317/,^?/tcp", timeout 1216552080
17429: considering GETSERVBYPORT entry " :_^^352^?/tcp", timeout 1216551945
17429: considering GETSERVBYNAME entry "ipp/udp", timeout 1216552463
17429: considering GETSERVBYPORT entry "260316317^K/tcp", timeout 1216552087
17429: considering GETSERVBYPORT entry "`fIESC/tcp", timeout 1216552113
17429: considering GETSERVBYPORT entry "@@347^X/tcp", timeout 1216552391
17429: considering GETSERVBYPORT entry "320315301313/tcp", timeout 1216552087
17429: considering GETSERVBYPORT entry "321^B", timeout 1216551905
17429: considering GETSERVBYPORT entry "pVK261/tcp", timeout 1216552050
17429: considering GETSERVBYPORT entry "^P!365(/tcp", timeout 1216552391
17429: considering GETSERVBYPORT entry " *323 247^?/tcp", timeout 1216552391
17429: considering GETSERVBYPORT entry " 212kb267^?/tcp", timeout 1216551934
17429: considering GETSERVBYNAME entry "netbios-ssn/tcp", timeout 1216552463
17429: considering GETSERVBYPORT entry "^Pr^E240/tcp", timeout 1216552130
17429: considering GETSERVBYPORT entry "@322^FH/tcp", timeout 1216552097
17429: considering GETSERVBYPORT entry " ʧuI^?/tcp", timeout 1216552113
17429: considering GETSERVBYPORT entry " 272255^CM^?/tcp", timeout 1216552087
17429: considering GETSERVBYPORT entry " ʻ205367^?/tcp", timeout 1216551905

.. and then it dies a little bit later.
https://bugzilla.novell.com/show_bug.cgi?id=387202

Michal Marek <mmarek@novell.com> changed:

What     |Removed |Added
----------------------------------------------------------------------------
Found By |---     |Development
https://bugzilla.novell.com/show_bug.cgi?id=387202
User james.faulkner@yale.edu added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c13

--- Comment #13 from James Faulkner <james.faulkner@yale.edu> 2008-08-18 10:38:03 MDT ---
Created an attachment (id=233949)
 --> (https://bugzilla.novell.com/attachment.cgi?id=233949)
NSCD debug log

I am also seeing this bug on 2 systems which use LDAP account information from a RHEL 5 server. I'm attaching the first system's nscd debug log.
https://bugzilla.novell.com/show_bug.cgi?id=387202
User james.faulkner@yale.edu added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c14

--- Comment #14 from James Faulkner <james.faulkner@yale.edu> 2008-08-18 10:39:26 MDT ---
Created an attachment (id=233951)
 --> (https://bugzilla.novell.com/attachment.cgi?id=233951)
NSCD debug log 2

The 2nd system's NSCD debug log.
https://bugzilla.novell.com/show_bug.cgi?id=387202
User james.faulkner@yale.edu added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c15

--- Comment #15 from James Faulkner <james.faulkner@yale.edu> 2008-08-18 10:43:56 MDT ---
NSCD is pretty critical for reducing the load on my LDAP server. I would be happy to run some test cases or debug code for you if you want. I have no trouble crashing nscd very quickly on my openSUSE 11.0 systems.
https://bugzilla.novell.com/show_bug.cgi?id=387202
User jpschewe@mtu.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c16

--- Comment #16 from Jon Schewe <jpschewe@mtu.net> 2008-08-22 10:49:47 MDT ---
I too am having the same problem. I'm using LDAP for account information and Kerberos for passwords. I'm seeing nscd crash on all of my servers at least every 15 minutes (I've got a script set up to restart it every 5 if it's dead). I'm having this problem both in dom0 and in domU on Xen as well as at home on my non-Xen systems.

I installed libnscd-debuginfo and then ran nscd -d in gdb and got the following:

..
6685: Reloading "103" in password cache!
6685: Reloading "13" in password cache!
6685: Reloading "100" in password cache!
6685: remove GETPWBYUID entry "0"
6685: remove GETPWBYNAME entry "root"
nscd: mem.c:399: gc: Assertion `next_hash == &he[db->head->nentries]' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x4103f950 (LWP 6688)]
0x00007fe6907ce5c5 in raise () from /lib64/libc.so.6
(gdb) where
#0  0x00007fe6907ce5c5 in raise () from /lib64/libc.so.6
#1  0x00007fe6907cfbb3 in abort () from /lib64/libc.so.6
#2  0x00007fe6907c71e9 in __assert_fail () from /lib64/libc.so.6
#3  0x00007fe691362b68 in ?? () from /usr/sbin/nscd
#4  0x00007fe691361494 in ?? () from /usr/sbin/nscd
#5  0x00007fe6913582c6 in ?? () from /usr/sbin/nscd
#6  0x00007fe690d14040 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fe69086f0cd in clone () from /lib64/libc.so.6
(gdb)

Unfortunately it doesn't appear there is a debuginfo package for nscd, so this doesn't help quite as much as I'd hoped.
https://bugzilla.novell.com/show_bug.cgi?id=387202
User jpschewe@mtu.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c17

--- Comment #17 from Jon Schewe <jpschewe@mtu.net> 2008-09-04 07:50:53 MDT ---
Which debuginfo packages would include the appropriate symbols to be able to get function names from the errors of nscd shown above?
https://bugzilla.novell.com/show_bug.cgi?id=387202
User schuetzm@gmx.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c18

Marc Schütz <schuetzm@gmx.net> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |schuetzm@gmx.net

--- Comment #18 from Marc Schütz <schuetzm@gmx.net> 2008-09-25 06:39:18 MDT ---
(In reply to comment #17 from Jon Schewe)
> Which debuginfo packages would include the appropriate symbols to be able to
> get function names from the errors of nscd shown above?

glibc-debuginfo
https://bugzilla.novell.com/show_bug.cgi?id=387202
User jpschewe@mtu.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c19

--- Comment #19 from Jon Schewe <jpschewe@mtu.net> 2008-09-25 07:29:26 MDT ---
Thanks. Now I've got a real stack trace to share. Took all of 10 minutes for it to crash this time.

GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux"...
(gdb) run -d
Starting program: /usr/sbin/nscd -d
[Thread debugging using libthread_db enabled]
[New Thread 0x7f8d562276f0 (LWP 818)]
[New Thread 0x4102f950 (LWP 821)]
[New Thread 0x42112950 (LWP 822)]
[New Thread 0x415a7950 (LWP 823)]
[New Thread 0x417a8950 (LWP 824)]
[New Thread 0x419a9950 (LWP 825)]
[New Thread 0x41baa950 (LWP 826)]
[New Thread 0x40584950 (LWP 827)]
[New Thread 0x40785950 (LWP 828)]
818: Reloading "root" in group cache!
818: remove INITGROUPS entry "root"
nscd: mem.c:392: gc: Assertion `off_alloc == off_allocend' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x42112950 (LWP 822)]
0x00007f8d556be5c5 in *__GI_raise (sig=<value optimized out>)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
        in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) where
#0  0x00007f8d556be5c5 in *__GI_raise (sig=<value optimized out>)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f8d556bfbb3 in *__GI_abort () at abort.c:88
#2  0x00007f8d556b71e9 in *__GI___assert_fail (
    assertion=0x7f8d5625b3d0 "off_alloc == off_allocend",
    file=0x7f8d5625b379 "mem.c", line=392, function=0x7f8d5625b450 "gc")
    at assert.c:78
#3  0x00007f8d56252ba6 in gc (db=0x7f8d5645f200) at mem.c:392
#4  0x00007f8d56251494 in prune_cache (table=0x7f8d5645f200, now=1222348776,
    fd=-1) at cache.c:499
#5  0x00007f8d562482c6 in nscd_run_prune (p=<value optimized out>)
    at connections.c:1390
#6  0x00007f8d55c04040 in start_thread (arg=<value optimized out>)
    at pthread_create.c:297
#7  0x00007f8d5575f0cd in clone () from /lib64/libc.so.6
(gdb) print off_alloc
$1 = 1436420560
(gdb) print off_allocend
$2 = 512
https://bugzilla.novell.com/show_bug.cgi?id=387202
User kuenne@rentec.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c20

Karsten Kuenne <kuenne@rentec.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |kuenne@rentec.com

--- Comment #20 from Karsten Kuenne <kuenne@rentec.com> 2008-10-07 19:10:02 MDT ---
Looks like Ubuntu has the same bug (#271423). But no solution there either.
https://bugzilla.novell.com/show_bug.cgi?id=387202
User fedev@gmx.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c21

Federico Vecchiarelli <fedev@gmx.net> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |fedev@gmx.net

--- Comment #21 from Federico Vecchiarelli <fedev@gmx.net> 2008-10-12 22:23:52 MDT ---
For me nscd is particularly important because I'm using it for offline LDAP authentication. So far I'm using a watchdog to restart it when it dies. openSUSE 11.0 x64.
https://bugzilla.novell.com/show_bug.cgi?id=387202
User syseng@adnovum.ch added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c22

Bernd Nies <syseng@adnovum.ch> changed:

What     |Removed   |Added
----------------------------------------------------------------------------
CC       |          |syseng@adnovum.ch
Priority |P5 - None |P2 - High

--- Comment #22 from Bernd Nies <syseng@adnovum.ch> 2008-10-21 03:19:36 MDT ---
We're having the same problem: openSUSE 11.0 (i386 and x86_64) with LDAP for authentication and NFS automount tables. nscd frequently dies within less than an hour. As a result, an LDAP user cannot start Thunderbird (it segfaults) or VMware Workstation 6.0.5 (it freezes) after nscd has died. As a local user it works.

See also bug#157078 and http://bugs.gentoo.org/show_bug.cgi?id=223205.

The only workaround for us so far is a watchdog daemon that restarts nscd every time it crashes.

I can provide you with an strace of nscd the next time it crashes.

Best regards,
Bernd
https://bugzilla.novell.com/show_bug.cgi?id=387202
User jnelson-suse@jamponi.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c23

--- Comment #23 from Jon Nelson <jnelson-suse@jamponi.net> 2008-10-21 06:55:53 MDT ---
I gave up on nscd and have been using unscd - http://busybox.net/~vda/unscd/ - and it seems to work just great. Last time I checked it had been up for a month.
https://bugzilla.novell.com/show_bug.cgi?id=387202
User grok@tnt.pl added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c24

Jaroslaw Zachwieja <grok@tnt.pl> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |grok@tnt.pl

--- Comment #24 from Jaroslaw Zachwieja <grok@tnt.pl> 2008-10-22 06:36:53 MDT ---
Created SRPM, minimal testing on 11.0:
https://bugzilla.novell.com/show_bug.cgi?id=157078#c73
https://bugzilla.novell.com/show_bug.cgi?id=387202

Petr Baudis <pbaudis@novell.com> changed:

What     |Removed   |Added
----------------------------------------------------------------------------
Severity |Major     |Critical
Priority |P2 - High |P1 - Urgent
https://bugzilla.novell.com/show_bug.cgi?id=387202
User admin@fph.physik.uni-karlsruhe.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c25

Achim Mildenberger <admin@fph.physik.uni-karlsruhe.de> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |admin@fph.physik.uni-karlsruhe.de

--- Comment #25 from Achim Mildenberger <admin@fph.physik.uni-karlsruhe.de> 2008-10-30 03:40:17 MDT ---
Seems I ran into the same stability problem. I have 30 PCs running 64-bit openSUSE 11.0. I have now started logging nscd. A fix for the problem would be very welcome.
https://bugzilla.novell.com/show_bug.cgi?id=387202
User syseng@adnovum.ch added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c26

--- Comment #26 from Bernd Nies <syseng@adnovum.ch> 2008-10-30 03:56:44 MDT ---
Our workaround is a watchdog daemon that restarts nscd:

==CUT==
watch_procs="/usr/sbin/nscd"

(
    while true; do
        for proc in $watch_procs; do
            if ! checkproc $proc; then
                logger -t watchdog "Restarting $proc."
                start_daemon $proc
            fi
        done
        sleep 60
    done
) &
==CUT==

Nscd crashes up to eight times daily:

adnws001:~ # grep nscd /var/log/messages
Oct 27 06:16:19 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 27 08:46:21 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 27 12:46:24 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 27 22:31:36 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 04:46:40 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 09:36:43 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 11:01:44 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 13:05:49 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 16:34:35 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 16:36:36 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 17:41:02 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 22:40:04 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 29 04:47:06 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 29 11:02:09 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 29 12:43:10 adnws001 watchdog: Restarting /usr/sbin/nscd.
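The "up to eight times daily" figure in comment #26 can be checked by tallying the watchdog messages per day. The following is only a minimal sketch in Python: the log lines are copied from the excerpt in that comment, while the function name `restarts_per_day` and the assumption that every restart line has the form "<month> <day> <time> <host> watchdog: Restarting ..." are illustrative and not part of the original report.

```python
from collections import Counter

# Lines copied from the /var/log/messages excerpt in comment #26.
LOG = """\
Oct 27 06:16:19 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 27 08:46:21 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 27 12:46:24 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 27 22:31:36 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 04:46:40 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 09:36:43 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 11:01:44 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 13:05:49 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 16:34:35 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 16:36:36 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 17:41:02 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 28 22:40:04 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 29 04:47:06 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 29 11:02:09 adnws001 watchdog: Restarting /usr/sbin/nscd.
Oct 29 12:43:10 adnws001 watchdog: Restarting /usr/sbin/nscd.
"""


def restarts_per_day(text):
    """Count watchdog restart messages per (month, day) pair."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Expected layout: month, day, time, host, "watchdog:", "Restarting", path
        if len(fields) >= 6 and fields[4] == "watchdog:" and fields[5] == "Restarting":
            counts[(fields[0], fields[1])] += 1
    return counts


if __name__ == "__main__":
    for (month, day), n in restarts_per_day(LOG).items():
        print(month, day, n)
```

Because the watchdog polls once every 60 seconds, each message marks the first poll after a crash, not the crash itself, so the timestamps are upper bounds on when nscd actually died.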
https://bugzilla.novell.com/show_bug.cgi?id=387202
User R.Vickers@cs.rhul.ac.uk added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c27

--- Comment #27 from Bob Vickers <R.Vickers@cs.rhul.ac.uk> 2008-10-30 04:37:32 MDT ---
Luxury! I used the watchdog approach under SuSE 10.2, but under SuSE 11.0 I see nscd crashing every few minutes, or I did before I disabled it.
https://bugzilla.novell.com/show_bug.cgi?id=387202
User R.Vickers@cs.rhul.ac.uk added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c28

--- Comment #28 from Bob Vickers <R.Vickers@cs.rhul.ac.uk> 2008-10-31 06:15:03 MDT ---
Following the suggestion in https://bugzilla.novell.com/show_bug.cgi?id=387202#c23 and also a private email from Jaroslaw I have installed unscd on a couple of machines. So far it is looking good and if I don't find any problems I'll roll it out to other machines.

I would be interested in hearing a comment from SuSE about unscd as it sounds like it could be the solution to a major headache, but SuSE are much better qualified than I to make that judgement. The drawback is that very few people seem to have tested it so far, and it is a very important piece of software that has to work in a wide variety of environments. On the other hand, the standard nscd has been notoriously flakey for many years, so the bar isn't very high!
https://bugzilla.novell.com/show_bug.cgi?id=387202
User admin@fph.physik.uni-karlsruhe.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c29

--- Comment #29 from Achim Mildenberger <admin@fph.physik.uni-karlsruhe.de> 2008-10-31 07:08:07 MDT ---
I now employ the watchdog approach from Comment #26. Many thanks for the code snippet!

Just for statistical fun on Halloween: the average lifetime of nscd here is 144 minutes per SuSE 11.0 box (average over the crashes on 34 moderately loaded boxes during 14 hours), using NIS and DNS, openSUSE 11.0, x86-64, no LDAP.

Funnily enough, for only 11 (of 197) crashes is there a log entry from the kernel in syslog (mostly segfaults, sometimes "general protection").
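For readers who want to reproduce the figure in comment #29, here is a small worked calculation in Python. It assumes the average was obtained as total observed box-minutes divided by the number of crashes; the comment does not state the method, so this is only one plausible reading, and the function name `mean_lifetime_minutes` is made up for illustration.

```python
# Figures quoted in comment #29.
BOXES = 34            # moderately loaded machines
HOURS_OBSERVED = 14   # observation window per machine
CRASHES = 197         # crashes seen across all machines
KERNEL_LOGGED = 11    # crashes that left a kernel message in syslog


def mean_lifetime_minutes(boxes, hours, crashes):
    """Total observed box-minutes divided by the number of crashes (assumed method)."""
    return boxes * hours * 60 / crashes


if __name__ == "__main__":
    lifetime = mean_lifetime_minutes(BOXES, HOURS_OBSERVED, CRASHES)
    print(f"mean lifetime: {lifetime:.2f} minutes")
    print(f"crashes with a kernel log entry: {KERNEL_LOGGED / CRASHES:.1%}")
```

Under that assumption the result is just under 145 minutes, which matches the reported 144 when truncated to whole minutes; roughly one crash in eighteen left a kernel message.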
https://bugzilla.novell.com/show_bug.cgi?id=387202
User grok@warwick.ac.uk added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c30

--- Comment #30 from Jaroslaw Zachwieja <grok@warwick.ac.uk> 2008-10-31 07:32:35 MDT ---
Watchdogs are a spawn of evil. I've rolled out unscd on all 250 desktops now. Fingers crossed (but still keeping the watchdog alive so I can at least catch any potential issues with unscd).

Bob, did you disable debugging already? How's performance?
https://bugzilla.novell.com/show_bug.cgi?id=387202
User pbaudis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c31

--- Comment #31 from Petr Baudis <pbaudis@novell.com> 2008-11-03 05:51:48 MDT ---
FWIW we are very strongly considering unscd for post-11.1, though nothing is decided yet.
https://bugzilla.novell.com/show_bug.cgi?id=387202
User R.Vickers@cs.rhul.ac.uk added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c32

--- Comment #32 from Bob Vickers <R.Vickers@cs.rhul.ac.uk> 2008-11-04 09:39:57 MST ---
I am running unscd on several heavily-loaded 11.0 machines now and so far it is looking very good. It hasn't crashed, and every so often I run getent on every account to confirm it is telling the truth (in the past nscd has suffered from corrupt caches as well as segmentation faults).
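The periodic check described in comment #32 could look roughly like the sketch below. This is not Bob's actual procedure, which the comment does not show. It is a Python approximation that assumes enumerating accounts (`pwd.getpwall()`) bypasses the cache while by-name lookups (`pwd.getpwnam()`) go through it; that assumption should be verified for your own setup, and on LDAP or NIS systems enumeration may be restricted or incomplete. The function name `find_mismatches` is hypothetical.

```python
import pwd


def find_mismatches(entries=None, lookup=pwd.getpwnam):
    """Compare enumerated passwd entries with the result of by-name lookups.

    Returns a list of (name, reason) tuples; an empty list means every
    lookup agreed with the enumeration.
    """
    if entries is None:
        entries = pwd.getpwall()
    mismatches = []
    for entry in entries:
        try:
            found = lookup(entry.pw_name)
        except KeyError:
            mismatches.append((entry.pw_name, "lookup failed"))
            continue
        # Compare only the fields that should be identical from either path;
        # the password field is skipped because it may be masked differently.
        expected = (entry.pw_uid, entry.pw_gid, entry.pw_dir, entry.pw_shell)
        actual = (found.pw_uid, found.pw_gid, found.pw_dir, found.pw_shell)
        if expected != actual:
            mismatches.append((entry.pw_name, "fields differ"))
    return mismatches


if __name__ == "__main__":
    for name, reason in find_mismatches():
        print(f"{name}: {reason}")
```

Run from cron, a non-empty result would be a hint that the cache is returning stale or corrupt entries, although duplicate account names in the enumeration can also trigger a report.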
https://bugzilla.novell.com/show_bug.cgi?id=387202
User jnelson-suse@jamponi.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c33

--- Comment #33 from Jon Nelson <jnelson-suse@jamponi.net> 2008-11-04 11:26:07 MST ---
*If* you are going to try unscd *and* you are using apparmor, you'll need to edit /etc/apparmor.d/usr.sbin.nscd and right after

capability net_bind_service,

add:

capability setgid,
capability setuid,

for it to work.
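The profile edit from comment #33 can be expressed as a small text transformation. The sketch below is Python and works on a string only; it does not touch /etc/apparmor.d/usr.sbin.nscd. The sample profile fragment in the code is invented for illustration and is not the shipped openSUSE profile, and the function name `add_capabilities` is hypothetical.

```python
def add_capabilities(profile_text,
                     anchor="capability net_bind_service,",
                     extra=("capability setgid,", "capability setuid,")):
    """Insert the extra capability rules right after the anchor rule.

    Rules already present anywhere in the profile are not added again,
    so applying the function twice gives the same result as applying it once.
    """
    lines = profile_text.splitlines()
    existing = {line.strip() for line in lines}
    out = []
    for line in lines:
        out.append(line)
        if line.strip() == anchor:
            # Reuse the anchor line's indentation for the new rules.
            indent = line[:len(line) - len(line.lstrip())]
            for rule in extra:
                if rule not in existing:
                    out.append(indent + rule)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Invented fragment for demonstration only; not the real profile.
    sample = (
        "/usr/sbin/nscd {\n"
        "  capability net_bind_service,\n"
        "  /etc/nscd.conf r,\n"
        "}\n"
    )
    print(add_capabilities(sample), end="")
```

After changing the real file, the profile has to be reloaded before the new capabilities take effect; how that is done depends on the system and is not covered in the comment.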
https://bugzilla.novell.com/show_bug.cgi?id=387202
User pbaudis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c34

Petr Baudis <pbaudis@novell.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |robin.listas@telefonica.net

--- Comment #34 from Petr Baudis <pbaudis@novell.com> 2008-11-14 03:18:32 MST ---
*** Bug 439210 has been marked as a duplicate of this bug. ***
https://bugzilla.novell.com/show_bug.cgi?id=439210
https://bugzilla.novell.com/show_bug.cgi?id=387202
User pbaudis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c35

--- Comment #35 from Petr Baudis <pbaudis@novell.com> 2008-11-14 03:20:51 MST ---
I'm planning to release the packages at http://www.suse.de/~pbaudis/bug-387202 (they will mirror out in an hour or two) as a maintenance update for 11.0 in a short while.
https://bugzilla.novell.com/show_bug.cgi?id=387202
User pbaudis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c44

--- Comment #44 from Petr Baudis <pbaudis@novell.com> 2008-11-19 11:57:43 MST ---
nscd still crashes with this patch and in 11.1 - very seldom for me, much more frequently for others. So I will hold this a little more and try to fix that crash too.
https://bugzilla.novell.com/show_bug.cgi?id=387202
User pbaudis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c45

Petr Baudis <pbaudis@novell.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |rickert@cs.niu.edu

--- Comment #45 from Petr Baudis <pbaudis@novell.com> 2008-11-19 12:04:02 MST ---
*** Bug 426396 has been marked as a duplicate of this bug. ***
https://bugzilla.novell.com/show_bug.cgi?id=426396
https://bugzilla.novell.com/show_bug.cgi?id=387202
User pbaudis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c46

Petr Baudis <pbaudis@novell.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |e.kunig@home.nl

--- Comment #46 from Petr Baudis <pbaudis@novell.com> 2008-11-19 12:24:29 MST ---
*** Bug 417865 has been marked as a duplicate of this bug. ***
https://bugzilla.novell.com/show_bug.cgi?id=417865
https://bugzilla.novell.com/show_bug.cgi?id=387202 User pbaudis@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c47

Petr Baudis <pbaudis@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |x_pedro_x@hotmail.com

--- Comment #47 from Petr Baudis <pbaudis@novell.com> 2008-11-19 12:25:48 MST ---
*** Bug 426679 has been marked as a duplicate of this bug. ***
https://bugzilla.novell.com/show_bug.cgi?id=426679
https://bugzilla.novell.com/show_bug.cgi?id=387202 User pbaudis@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c48
--- Comment #48 from Petr Baudis <pbaudis@novell.com> 2008-12-04 11:19:33 MST ---
(Status update: In bug 446233, we have tested the patch to fix this issue and fixed another race condition which kicks in if nscd does not crash because of this one. Some people still report occasional crashes, but I don't have enough data to debug these. I will probably wait until next Tuesday and then submit the 11.0 update with all the nscd fixes we have by then, and some small extras.)
https://bugzilla.novell.com/show_bug.cgi?id=387202 User pbaudis@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c49

Petr Baudis <pbaudis@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmatejek@novell.com

--- Comment #49 from Petr Baudis <pbaudis@novell.com> 2008-12-04 11:19:50 MST ---
*** Bug 157078 has been marked as a duplicate of this bug. ***
https://bugzilla.novell.com/show_bug.cgi?id=157078
https://bugzilla.novell.com/show_bug.cgi?id=387202 User pbaudis@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c50

Petr Baudis <pbaudis@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |adaugherity@tamu.edu

--- Comment #50 from Petr Baudis <pbaudis@novell.com> 2008-12-04 11:31:33 MST ---
*** Bug 374990 has been marked as a duplicate of this bug. ***
https://bugzilla.novell.com/show_bug.cgi?id=374990
https://bugzilla.novell.com/show_bug.cgi?id=387202 User walter.haidinger@gmx.at added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c51

Walter Haidinger <walter.haidinger@gmx.at> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |walter.haidinger@gmx.at

--- Comment #51 from Walter Haidinger <walter.haidinger@gmx.at> 2008-12-05 00:32:47 MST ---
Based on the severity and priority of this bug: How about adding the watchdog workaround as a patch to the current nscd package and releasing it as an update?

That is, modify nscd to start a master process which monitors its child(ren), restarts them automatically upon death (maybe with a log entry) and kills them upon exit. This patch should neither be that much to add nor too difficult.

Of course this is not a real fix and quite ugly, IMHO. However, it would be a _quick_ workaround for all 11.0 installations to make nscd more stable (from the user's point of view), relieving the users from implementing a watchdog themselves. It would also buy some time until the real bug is found and squashed.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User R.Vickers@cs.rhul.ac.uk added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c52
--- Comment #52 from Bob Vickers <R.Vickers@cs.rhul.ac.uk> 2008-12-05 03:36:57 MST ---
Just to add that the unscd solution continues to work well for me. It has been running on a number of heavily loaded machines for over a month now without ever crashing, and there has been no sign of bad data.

It is good to know progress is being made on the standard nscd as well.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User admin@fph.physik.uni-karlsruhe.de added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c53
--- Comment #53 from Achim Mildenberger <admin@fph.physik.uni-karlsruhe.de> 2008-12-05 04:48:56 MST ---
I also switched to unscd about 4 weeks ago on a pool of 34 machines. I haven't encountered any problem since.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User pbaudis@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c54
--- Comment #54 from Petr Baudis <pbaudis@novell.com> 2008-12-06 07:08:22 MST ---
Walter: We already do have such a watchdog, it's called "init". Just adding nscd -d to /etc/inittab should work fine. :-)
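A minimal sketch of what the inittab suggestion in comment #54 might look like. The `id:runlevels:action:process` layout is the standard sysvinit format and `/usr/sbin/nscd -d` is the command used earlier in this bug; the identifier `ns`, the runlevels `35`, and the helper name `inittab_entry` are assumptions made for illustration only.

```shell
#!/bin/sh
# Build a sysvinit inittab line of the form id:runlevels:action:process.
# The "respawn" action asks init to start the process again whenever it exits.
# The id "ns" and runlevels "35" used below are illustrative assumptions.
inittab_entry() {
    id=$1
    runlevels=$2
    shift 2
    printf '%s:%s:respawn:%s\n' "$id" "$runlevels" "$*"
}

# nscd -d stays in the foreground (debug mode), so init can supervise it.
inittab_entry ns 35 /usr/sbin/nscd -d
```

The printed line would be appended to /etc/inittab by hand and activated with `telinit q`. Note that `-d` also turns on debug output, so this is a stopgap rather than a normal service setup.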
https://bugzilla.novell.com/show_bug.cgi?id=387202 User walter.haidinger@gmx.at added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c55
--- Comment #55 from Walter Haidinger <walter.haidinger@gmx.at> 2008-12-07 05:48:38 MST ---
I see. So, I guess OpenSUSE 12.0 will scrap all those useless scripts in /etc/init.d and start everything from /etc/inittab, right? Nice.

Maybe I need to clarify this: comment #51 _was_ meant seriously, no joke intended!
https://bugzilla.novell.com/show_bug.cgi?id=387202 User R.Vickers@cs.rhul.ac.uk added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c56
--- Comment #56 from Bob Vickers <R.Vickers@cs.rhul.ac.uk> 2008-12-08 02:33:15 MST ---
Created an attachment (id=258547) --> (https://bugzilla.novell.com/attachment.cgi?id=258547)
sample watchdog script

In case it is useful, here is my version of the watchdog script, designed to be run as a cron job. It has a couple of good features:
(1) can check other services, not just nscd
(2) uses chkconfig to make sure it only checks services that are meant to be running
https://bugzilla.novell.com/show_bug.cgi?id=387202 User walter.haidinger@gmx.at added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c57
--- Comment #57 from Walter Haidinger <walter.haidinger@gmx.at> 2008-12-08 06:00:21 MST ---
Nice script but because of comment #54 quite obsolete, isn't it? :-\

No, seriously, if such a wrapper were implemented in nscd itself, _all_ SUSE users would benefit, even those not capable of writing a wrapper themselves and those not aware of this bugzilla entry.

Again, this could be quickly released as an nscd update until the real fix is done (which we've been waiting for since when? two years?).

The required patch would only need to do the following (in pseudo-code):

    /* signal handler to kill spawned nscd child */
    signal(SIGTERM, { kill(child_pid); } );

    /* core loop to (re)spawn nscd child */
    for (;;) {
        child_pid = fork();
        if (child_pid == 0) {
            nscd_main();    /* run nscd main() */
        } else {
            wait();         /* wait for nscd child to exit/die */
            log("restarting nscd child");
        }
    }

Would that be too difficult?
https://bugzilla.novell.com/show_bug.cgi?id=387202 User R.Vickers@cs.rhul.ac.uk added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c58
--- Comment #58 from Bob Vickers <R.Vickers@cs.rhul.ac.uk> 2008-12-08 06:31:49 MST ---
Nice idea but dangerous: imagine some condition that caused nscd to fail as soon as it started. Then the nscd parent would whirl round burning up CPU and your system would be much more messed up than if nscd just died.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User walter.haidinger@gmx.at added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c59
--- Comment #59 from Walter Haidinger <walter.haidinger@gmx.at> 2008-12-08 06:56:21 MST ---
Then add a sleep() to wait a couple of seconds after each wait() to throttle respawning. That is usual good practice in watchdog wrappers anyway, so I left it out. I said it's only pseudo-code. Moreover, logging will make you notice the problem.
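The respawn-with-throttle idea discussed in comments #57 to #59 can be sketched in shell as follows. This is only an illustration of the technique, not a patch for nscd: the function name `supervise` and its two knobs (maximum number of restarts, delay in seconds) are hypothetical, and the restart cap is an addition that the comments above do not mention.

```shell
#!/bin/bash
# Respawn a command whenever it exits, sleeping between restarts so that a
# child which dies immediately cannot spin the CPU.
# max_restarts and delay are hypothetical knobs, not nscd options.
supervise() {
    local max_restarts=$1 delay=$2
    shift 2
    local restarts=0 status
    while :; do
        "$@"                 # run the child in the foreground and wait for it
        status=$?
        if [ "$restarts" -ge "$max_restarts" ]; then
            echo "giving up after $restarts restarts (last status $status)" >&2
            return 1
        fi
        restarts=$((restarts + 1))
        echo "child exited with status $status, restart $restarts in ${delay}s" >&2
        sleep "$delay"
    done
}

# Example (not executed here): supervise 5 2 /usr/sbin/nscd -d
```

The child has to stay in the foreground for this to work, which is why the example uses `nscd -d`. The messages go to stderr, so redirecting them to a log file would make repeated crashes visible.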
https://bugzilla.novell.com/show_bug.cgi?id=387202 User x_pedro_x@hotmail.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c60
--- Comment #60 from Pedro Oliveira <x_pedro_x@hotmail.com> 2008-12-09 05:23:41 MST ---
Hi!

I've switched to unscd too and it rocks. I'm using it in 32- and 64-bit environments, on a few servers and on my laptop, and have never had a problem with it.

For regular nscd, well, I have 2 simple scripts to make it restart without much hassle. Here they are:

MartiniMan-lap:~/bin # cat nscd_check

    #!/bin/bash
    # Restart nscd whenever no nscd process is found; poll once per second.
    while true; do
        if [ -z "$(pidof nscd)" ]; then
            echo "$(date +%d:%m:%y-%H:%M:%S) restarting nscd"
            sudo rcnscd restart
        fi
        sleep 1
    done

Pedro Oliveira
IT Consultant
Email: pmsoliveira@gmail.com
URL: http://pedro.linux-geex.com
Telefone: +351 96 5867227
https://bugzilla.novell.com/show_bug.cgi?id=387202 User x_pedro_x@hotmail.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c61
--- Comment #61 from Pedro Oliveira <x_pedro_x@hotmail.com> 2008-12-09 06:00:08 MST ---
Sorry, I forgot the second script, which makes the previous one start automatically from RC. Just create this executable file:

/etc/init.d/nscd_check

    #!/bin/sh
    ### BEGIN INIT INFO
    # Provides:          nscd_check
    # Required-Start:    nscd
    # Should-Start:
    # Required-Stop:
    # Should-Stop:
    # Default-Start:     3 5
    # Default-Stop:      0 1 2 6
    # Short-Description: check for nscd
    # Description:       check if nscd is running and restarts it if not
    ### END INIT INFO

    # Load the rc_* helper functions
    . /etc/rc.status

    # Reset status of this service
    rc_reset

    case "$1" in
        start)
            echo -n "Starting nscd_check"
            nohup /sbin/nscd_check >> /var/log/messages &
            rc_status -v
            ;;
        stop)
            echo -n "Shutting down nscd_check"
            pkill nscd_check
            rc_status -v
            ;;
        restart)
            $0 stop
            $0 start
            rc_status
            ;;
        *)
            echo "Usage: $0 {start|stop|restart}"
            exit 1
            ;;
    esac
    rc_exit

After this just type:

    insserv nscd_check

Hope it helps.

Pedro Oliveira
IT Consultant
Email: pmsoliveira@gmail.com
URL: http://pedro.linux-geex.com
Telefone: +351 96 5867227
https://bugzilla.novell.com/show_bug.cgi?id=387202 User e.kunig@home.nl added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c62
--- Comment #62 from Egbert König <e.kunig@home.nl> 2008-12-24 07:59:15 MST ---
nscd 2.9, as shipped with openSuSE 11.1, crashes too. I am using unscd now. Wouldn't it be reasonable to provide unscd as a patch for openSuSE 11.0 and 11.1?
https://bugzilla.novell.com/show_bug.cgi?id=387202 User jnelson-suse@jamponi.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c63
--- Comment #63 from Jon Nelson <jnelson-suse@jamponi.net> 2008-12-29 10:03:38 MST ---
nscd remains crashy for me, too (opensuse 11.1)

/me back to using unscd.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User swamp@suse.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c64

Swamp Script User <swamp@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|                            |maint:released:11.0:21210

--- Comment #64 from Swamp Script User <swamp@suse.com> 2009-01-01 10:04:52 MST ---
Update released for: glibc, glibc-devel, glibc-html, glibc-i18ndata, glibc-info, glibc-locale, glibc-obsolete, glibc-profile, nscd
Products:
openSUSE 11.0 (debug, i386, i686, ppc, ppc64, x86_64)
https://bugzilla.novell.com/show_bug.cgi?id=387202 User R.Vickers@cs.rhul.ac.uk added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c65
--- Comment #65 from Bob Vickers <R.Vickers@cs.rhul.ac.uk> 2009-01-05 06:51:18 MST ---
nscd still crashes every hour or so after updating to nscd-2.8-14.2 on SuSE 11.0. I will reinstate unscd.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User pbaudis@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c66
--- Comment #66 from Petr Baudis <pbaudis@novell.com> 2009-01-05 09:46:53 MST ---
If nscd still crashes for you, please:

(i) Set persistent to 0 for all databases in your /etc/nscd.conf
(ii) /etc/init.d/nscd stop and run ulimit -c unlimited; nscd -d
(iii) When nscd crashes, please post a core here, compress it if it is larger than 1M or so.
(iv) Also post your /etc/nsswitch.conf with the core.

Without this information, I cannot fix any crashes; nscd on 11.0 crashed only once for me so far after this fix, and I don't have quite enough data to debug it yet, it seems.

Thanks!
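Step (iii) above asks for the core to be compressed when it is larger than about 1M. A small helper for that could look like the sketch below; the function name `compress_if_large`, the choice of gzip, and the exact 1 MiB threshold are assumptions, not something specified in the comment.

```shell
#!/bin/sh
# Compress a file with gzip only when it is larger than a size limit.
# Prints the name of the file that should be attached afterwards.
# The 1 MiB default mirrors "larger than 1M or so"; the function name
# and the use of gzip are hypothetical choices for this sketch.
compress_if_large() {
    file=$1
    limit=${2:-1048576}                  # threshold in bytes
    [ -f "$file" ] || return 1
    size=$(($(wc -c < "$file")))         # arithmetic strips any padding from wc
    if [ "$size" -gt "$limit" ]; then
        gzip -f "$file" && echo "$file.gz"
    else
        echo "$file"
    fi
}
```

For example, `compress_if_large core` would print either `core` or `core.gz`, whichever is the file to attach together with /etc/nsswitch.conf.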
https://bugzilla.novell.com/show_bug.cgi?id=387202 User pbaudis@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c67

Petr Baudis <pbaudis@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Basesystem                  |Basesystem
            Product|openSUSE 11.0               |openSUSE 11.1

--- Comment #67 from Petr Baudis <pbaudis@novell.com> 2009-01-05 09:55:17 MST ---
Egbert König: I plan to package unscd nicely in buildservice in the future, I'm just not sure when I will get to it.

To clarify, there are two bugs: bug 387202 against 11.0 and bug 446233 against 11.1. Since nscd is basically the same in 11.0 and 11.1 by now and I will continue to keep them in sync, I'm going to mark 446233 as a dupe of this one and bump this one to 11.1; further nscd updates will be released for both 11.0 and 11.1.

Both of these bugs are in fact many different bugs in nscd; (un)fortunately the unfixed ones trigger only rarely, so they aren't as easy to debug.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User pbaudis@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c68

Petr Baudis <pbaudis@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tvrtko@ursulin.net

--- Comment #68 from Petr Baudis <pbaudis@novell.com> 2009-01-05 09:55:26 MST ---
*** Bug 446233 has been marked as a duplicate of this bug. ***
https://bugzilla.novell.com/show_bug.cgi?id=446233
https://bugzilla.novell.com/show_bug.cgi?id=387202 User jnelson-suse@jamponi.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c69
--- Comment #69 from Jon Nelson <jnelson-suse@jamponi.net> 2009-01-05 10:24:45 MST ---
If anybody cares, I *have* packaged it (although the packaging needs some work) by using bits from the nscd package.

home:jnelson-suse

if you like.

I *have* seen unscd crash, but not the latest version (0.36), which has been very slightly patched to unlink the pidfile and sockets. I actively solicit improvements.

I'M NOT RESPONSIBLE FOR ANYTHING THAT GOES WRONG.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User pbaudis@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c70
--- Comment #70 from Petr Baudis <pbaudis@novell.com> 2009-01-05 10:40:11 MST ---
Sorry, of course I forgot to mention that - I'm using your work as a base for mine. :)
https://bugzilla.novell.com/show_bug.cgi?id=387202 User R.Vickers@cs.rhul.ac.uk added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c71
--- Comment #71 from Bob Vickers <R.Vickers@cs.rhul.ac.uk> 2009-01-06 04:51:45 MST ---
(In reply to comment #66 from Petr Baudis)
> If nscd still crashes for you, please:
>
> (i) Set persistent to 0 for all databases in your /etc/nscd.conf
> (ii) /etc/init.d/nscd stop and run ulimit -c unlimited; nscd -d
> (iii) When nscd crashes, please post a core here, compress it if it is larger than 1M or so.
> (iv) Also post your /etc/nsswitch.conf with the core.
>
> Without this information, I cannot fix any crashes; nscd on 11.0 crashed only once for me so far after this fix, and I don't have quite enough data to debug it yet, it seems.
>
> Thanks!

I managed to get another crash, and will attach the requested info. nscd is version 2.8-14.2, running on SuSE 11.0.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User R.Vickers@cs.rhul.ac.uk added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c72
--- Comment #72 from Bob Vickers <R.Vickers@cs.rhul.ac.uk> 2009-01-06 04:58:08 MST ---
Created an attachment (id=263361) --> (https://bugzilla.novell.com/attachment.cgi?id=263361)
nscd core file plus log messages and config files
https://bugzilla.novell.com/show_bug.cgi?id=387202 User syseng@adnovum.ch added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c73
--- Comment #73 from Bernd Nies <syseng@adnovum.ch> 2009-01-07 02:57:20 MST ---
Hi,

Some good news while everybody is complaining: I installed all Suse 11.0 updates with "zypper update" and rebooted the system two days ago, and since then nscd keeps running. Before that it crashed every few hours and was restarted by my watchdog daemon.

adnws001:~ # rpm -qa | egrep 'nscd|glibc'
libnscd-2.0.2-81.1
nscd-2.8-14.2
glibc-2.8-14.2
glibc-locale-2.8-14.2
glibc-devel-2.8-14.2
glibc-info-2.8-14.2

adnws001:~ # uname -a
Linux adnws001 2.6.25.18-0.2-pae #1 SMP 2008-10-21 16:30:26 +0200 i686 i686 i386 GNU/Linux

Thanks a lot!

Bye,
Bernd
https://bugzilla.novell.com/show_bug.cgi?id=387202 User mmarek@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c74
--- Comment #74 from Michal Marek <mmarek@novell.com> 2009-01-07 06:42:32 MST ---
FWIW, the 11.0 update package works for me so far.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User pbaudis@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c75

Petr Baudis <pbaudis@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz@novell.com

--- Comment #75 from Petr Baudis <pbaudis@novell.com> 2009-01-20 17:35:02 MST ---
*** Bug 467393 has been marked as a duplicate of this bug. ***
https://bugzilla.novell.com/show_bug.cgi?id=467393
https://bugzilla.novell.com/show_bug.cgi?id=387202 User hpj@urpla.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c76
--- Comment #76 from Hans-Peter Jansen <hpj@urpla.net> 2009-01-21 15:07:29 MST ---
FWIW, another variant, this time with 11.1 (nscd-2.9-2.8):

10765: provide access to FD 12, for hosts
10765: Reloading "die-offenbachs.homelinux.org" in hosts cache!
10765: Reloading "0" in group cache!
10765: Reloading "2222" in group cache!
10765: remove GETHOSTBYNAME entry "localhost"
10765: remove GETPWBYUID entry "51"
10765: remove GETPWBYNAME entry "nobody"
10765: remove GETPWBYUID entry "65534"
10765: remove GETPWBYNAME entry "postfix"
nscd: mem.c:412: gc: Assertion `next_data < &he_data[db->head->nentries]' failed.
Aborted

Once nscd has crashed, amarok takes ages to start up (say 5-10 minutes!); with nscd running it takes 2-5 secs.

Now that B O'B will set up a new world order, these bugs really cry for immediate fixes, Petr.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User pbaudis@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c77
--- Comment #77 from Petr Baudis <pbaudis@novell.com> 2009-01-21 15:57:13 MST ---
Actually, I have just prepared a new round of nscd updates for 11.0 and 11.1, at

http://www.suse.de/~pbaudis/bug-387202-2/

My apologies to those I told earlier that the 11.1 and 11.0 nscd are identical; it turns out that the 11.1 glibc update I prepared last did not actually make it to 11.1. :-( So 11.0 should actually have a much more stable nscd than 11.1 now. I will try to trigger another round of updates now.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User swamp@suse.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c80

Swamp Script User <swamp@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|maint:released:11.0:21210   |maint:released:11.0:21210
                   |                            |maint:running:22192

--- Comment #80 from Swamp Script User <swamp@suse.com> 2009-01-22 15:02:38 MST ---
The SWAMPID for this issue is 22192.
Please submit the patch and patchinfo file using this ID.
(https://swamp.suse.de/webswamp/wf/22192)
https://bugzilla.novell.com/show_bug.cgi?id=387202 User robin.listas@telefonica.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c81
--- Comment #81 from Carlos Robinson <robin.listas@telefonica.net> 2009-01-23 17:54:26 MST ---
Yet another watchdog.

root's crontab entry:

    -0,*/5 * * * 1-7 /root/bin/watchdog_nscd > /dev/null

script:

    #!/bin/bash
    # watchdog to restart the nscd service
    # idea for the case statement taken from "307:rc"

    /usr/sbin/rcnscd status start; status=$?
    echo "Status= "$status

    case $status in
        [1-47])
            echo "failed"
            /bin/logger -p user.warn -t watchdog \
                "nscd is not running, restarting. -- Bugzilla 387202; "\
                "see root's crontab to disable this wd"
            /usr/sbin/rcnscd restart
            ;;
        [56])
            echo "skipped"
            ;;
        0|*)
            echo "Nothing to do"
            ;;
    esac

I believe you should create some kind of watchdog and push it via YOU to systems, till this problem is really solved.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User robin.listas@telefonica.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c82
--- Comment #82 from Carlos Robinson <robin.listas@telefonica.net> 2009-01-24 02:55:58 MST ---
Sorry, errata in #81: "/usr/sbin/rcnscd status start" should be "/usr/sbin/rcnscd status", of course. It's of no consequence, anyway.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User hpj@urpla.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c83
--- Comment #83 from Hans-Peter Jansen <hpj@urpla.net> 2009-01-25 14:38:00 MST ---
Petr, for what it's worth, since I installed http://www.suse.de/~pbaudis/bug-387202-2/ nscd hasn't crashed.

A yast compatible repo structure for this dir would ease testing greatly, though. (Well, I use createrepo internally..).
https://bugzilla.novell.com/show_bug.cgi?id=387202 User hpj@urpla.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c84
--- Comment #84 from Hans-Peter Jansen <hpj@urpla.net> 2009-01-28 03:18:10 MST ---
Cheered too soon :-(. Crashed after three days, but I stopped the nscd debugging before my last post. Will set it up again now.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User swamp@suse.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c85

Swamp Script User <swamp@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Status Whiteboard|maint:released:11.0:21210   |maint:released:11.0:21210
                   |maint:running:22192         |maint:running:22192
                   |                            |maint:released:11.0:22204

--- Comment #85 from Swamp Script User <swamp@suse.com> 2009-02-02 04:16:48 MST ---
Update released for: nscd
Products:
openSUSE 11.0 (i386, ppc, x86_64)
https://bugzilla.novell.com/show_bug.cgi?id=387202 User robin.listas@telefonica.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c86
--- Comment #86 from Carlos Robinson <robin.listas@telefonica.net> 2009-02-03 12:06:59 MST ---
I got what I think is that update:

cer@nimrodel:~> rpm -q -i nscd
Name        : nscd                         Relocations: (not relocatable)
Version     : 2.8                               Vendor: SUSE LINUX Products GmbH, Nuernberg, Germany
Release     : 14.4                          Build Date: Sun 25 Jan 2009 10:06:27 PM CET
Install Date: Tue 03 Feb 2009 04:21:08 AM CET      Build Host: stravinsky.suse.de

I had nscd crash twice today - ie, after the update:

Feb 3 15:55:01 nimrodel watchdog: nscd is not running, restarting. --
Feb 3 17:55:01 nimrodel watchdog: nscd is not running, restarting. --
Feb 3 17:55:01 nimrodel nscd: 12295 invalid persistent database file "/var/run/nscd/passwd": verification failed

Admittedly, it is crashing less.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User pbaudis@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c87

Petr Baudis <pbaudis@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |robin.listas@telefonica.net

--- Comment #87 from Petr Baudis <pbaudis@novell.com> 2009-02-03 12:14:00 MST ---
Carlos, can you please follow the reporting guidelines I outlined in comment 66? Thank you.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User robin.listas@telefonica.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c88
--- Comment #88 from Carlos Robinson <robin.listas@telefonica.net> 2009-02-03 12:43:24 MST ---
(In reply to comment #87)
> Carlos, can you please follow the reporting guidelines I outlined in comment 66? Thank you.

Let me see...

> (i) Set persistent to 0 for all databases in your /etc/nscd.conf

Huh? I have:

nimrodel:~ # grep -i persistent /etc/nscd.conf
# persistent <service> <yes|no>
persistent passwd yes
persistent group yes
persistent hosts no
persistent services yes

What exactly do I edit? My configuration is your default supplied config, I think.

> (ii) /etc/init.d/nscd stop and run ulimit -c unlimited; nscd -d

Done. I now have in the startup script this:

    case "$1" in
        start)
            echo -n "Starting Name Service Cache Daemon"
            #/sbin/startproc -p $NSCD_PID $NSCD_BIN
            # Bug 387202#c66
            ulimit -c unlimited
            /sbin/startproc -p $NSCD_PID $NSCD_BIN -d
            rc_status -v
            ;;

If this is not adequate, please tell me how I change the script - it has to be that way, I have a watchdog restarting the daemon automatically.

[...]

No, it is not adequate, status says "unused". Undoing the "-d" till you expand the instructions.
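One possible reading of step (i) from comment #66, given the `persistent <service> <yes|no>` syntax shown in the grep output above, is to switch every active `persistent ... yes` line to `no`. The sketch below does that as a stdin-to-stdout filter. Treating "set persistent to 0" as "persistent <service> no" is an assumption, and the function name `disable_persistent` is hypothetical.

```shell
#!/bin/sh
# Rewrite every active "persistent <service> yes" line to "... no".
# Reads nscd.conf text on stdin and writes the result to stdout, so the
# real /etc/nscd.conf is never modified by this function itself.
# Assumption: "set persistent to 0" means "persistent <service> no".
disable_persistent() {
    sed -E 's/^([[:space:]]*persistent[[:space:]]+[[:alnum:]_]+[[:space:]]+)yes([[:space:]]*)$/\1no\2/'
}
```

A cautious way to use it would be `disable_persistent < /etc/nscd.conf > /tmp/nscd.conf.new`, followed by a manual diff before replacing the original file. Commented lines such as `# persistent <service> <yes|no>` are left untouched because the pattern only matches lines that start with the keyword.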
https://bugzilla.novell.com/show_bug.cgi?id=387202 User bernet@physik.unizh.ch added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c89

Roland Bernet <bernet@physik.unizh.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bernet@physik.unizh.ch

--- Comment #89 from Roland Bernet <bernet@physik.unizh.ch> 2009-02-03 12:52:59 MST ---
Hi Petr,

Tried several times with

ulimit -c unlimited; nscd -d

and nscd does crash, but I never get a core dump ... Tried with a small script dividing by 0 and it writes a core. Any ideas how to get a core dump of a nscd crash?
https://bugzilla.novell.com/show_bug.cgi?id=387202 User hpj@urpla.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c90
--- Comment #90 from Hans-Peter Jansen <hpj@urpla.net> 2009-02-03 16:20:31 MST ---
I wasn't able to get the version from #77 to crash as long as I run it in debug mode, unlike when running it as an ordinary runlevel service. Since debug mode prevents nscd from forking, maybe some fork- or clone-related race condition in nscd is the real McCoy in this issue.

Roland, Carlos, please keep "ulimit -c unlimited; nscd -d" running in a terminal, and make sure that rcnscd is not running.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User robin.listas@telefonica.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c91 --- Comment #91 from Carlos Robinson <robin.listas@telefonica.net> 2009-02-03 19:01:35 MST --- (In reply to comment #90)
> Roland, Carlos please keep the ulimit -c unlimited; nscd -d running in a terminal, and be sure, that rcnscd is not running.

Well, I have done just that, but I still need clarification on the "persistent" configuration, as per #88.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User robin.listas@telefonica.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c92
Carlos Robinson <robin.listas@telefonica.net> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEEDINFO |ASSIGNED
Info Provider|robin.listas@telefonica.net |
--- Comment #92 from Carlos Robinson <robin.listas@telefonica.net> 2009-02-03 20:48:00 MST ---
Ok, nscd just crashed. Output in window was:

.. 921: GETFDGR
8921: provide access to FD 6, for group
8921: handle_request: request received (Version = 2) from PID 10892
8921: GETFDGR
8921: provide access to FD 6, for group
8921: remove GETPWBYUID entry "51"
8921: remove GETPWBYNAME entry "nobody"
8921: remove GETPWBYUID entry "65534"
8921: remove GETPWBYNAME entry "postfix"
nscd: mem.c:368: gc: Assertion `off_allocend <= db->head->first_free' failed.
nimrodel:~/Bugzilla/Bug_387202 #

There is no core in that directory.

Config:

cer@nimrodel:~> cat /etc/nscd.conf | egrep -v "^[[:space:]]*$|^#"
debug-level 0
paranoia no
enable-cache passwd yes
positive-time-to-live passwd 600
negative-time-to-live passwd 20
suggested-size passwd 211
check-files passwd yes
persistent passwd yes
shared passwd yes
max-db-size passwd 33554432
auto-propagate passwd yes
enable-cache group yes
positive-time-to-live group 3600
negative-time-to-live group 60
suggested-size group 211
check-files group yes
persistent group yes
shared group yes
max-db-size group 33554432
auto-propagate group yes
enable-cache hosts yes
positive-time-to-live hosts 600
negative-time-to-live hosts 0
suggested-size hosts 211
check-files hosts yes
persistent hosts no
shared hosts yes
max-db-size hosts 33554432
enable-cache services yes
positive-time-to-live services 28800
negative-time-to-live services 20
suggested-size services 211
check-files services yes
persistent services yes
shared services yes
max-db-size services 33554432
cer@nimrodel:~>
https://bugzilla.novell.com/show_bug.cgi?id=387202 User robin.listas@telefonica.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c93
--- Comment #93 from Carlos Robinson <robin.listas@telefonica.net> 2009-02-03 20:53:21 MST ---
I forgot:

cer@nimrodel:~> cat /etc/nsswitch.conf | egrep -v "^[[:space:]]*$|^#"
passwd: compat
group: compat
hosts: files mdns4_minimal [NOTFOUND=return] dns
networks: files dns
services: files
protocols: files
rpc: files
ethers: files
netmasks: files
netgroup: files nis
publickey: files
bootparams: files
automount: files nis
aliases: files
cer@nimrodel:~>

And to avoid confusion, I'm on 11.0.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User robin.listas@telefonica.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c94
--- Comment #94 from Carlos Robinson <robin.listas@telefonica.net> 2009-02-04 05:48:06 MST ---
One more:

11415: GETFDPW
11415: provide access to FD 4, for passwd
11415: Reloading "0.pool.ntp.org" in hosts cache!
11415: Reloading "1.ch.pool.ntp.org" in hosts cache!
11415: Reloading "0.es.pool.ntp.org" in hosts cache!
11415: Reloading "1.pool.ntp.org" in hosts cache!
11415: Reloading "2.pool.ntp.org" in hosts cache!
11415: Reloading "0.ch.pool.ntp.org" in hosts cache!
11415: Reloading "3.pool.ntp.org" in hosts cache!
11415: Reloading "0.uk.pool.ntp.org" in hosts cache!
11415: Reloading "users.opensuse.org" in hosts cache!
11415: Reloading "0.fr.pool.ntp.org" in hosts cache!
11415: remove GETAI entry "0.pool.ntp.org"
11415: remove GETAI entry "1.ch.pool.ntp.org"
11415: remove GETAI entry "0.es.pool.ntp.org"
11415: remove GETAI entry "1.pool.ntp.org"
11415: remove GETAI entry "2.pool.ntp.org"
11415: remove GETAI entry "0.ch.pool.ntp.org"
11415: remove GETAI entry "nimrodel"
11415: remove GETAI entry "3.pool.ntp.org"
11415: remove GETAI entry "0.uk.pool.ntp.org"
11415: remove GETAI entry "0.fr.pool.ntp.org"
11415: remove GETPWBYNAME entry "upsd"
11415: remove GETPWBYUID entry "115"
nscd: mem.c:477: gc: Assertion `next_hash == &he[db->head->nentries]' failed.
nimrodel:~/Bugzilla/Bug_387202 #
nimrodel:~/Bugzilla/Bug_387202 # ulimit -c unlimited; nscd -d
15414: invalid persistent database file "/var/run/nscd/passwd": verification failed
https://bugzilla.novell.com/show_bug.cgi?id=387202 User robin.listas@telefonica.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c95
--- Comment #95 from Carlos Robinson <robin.listas@telefonica.net> 2009-02-04 16:54:25 MST ---
One more:

15414: handle_request: request received (Version = 2) from PID 17943
15414: GETFDGR
15414: provide access to FD 6, for group
15414: remove GETPWBYUID entry "1000"
15414: remove GETPWBYNAME entry "cer"
15414: handle_request: request received (Version = 2) from PID 7657
15414: GETAI (www.os-translation.com.ar)
15414: remove GETPWBYNAME entry "lp"
15414: remove GETPWBYUID entry "4"
nscd: mem.c:368: gc: Assertion `off_allocend <= db->head->first_free' failed.
Aborted

Another:

21408: GETFDPW
21408: provide access to FD 4, for passwd
21408: handle_request: request received (Version = 2) from PID 1113
21408: GETFDPW
21408: provide access to FD 4, for passwd
21408: remove GETPWBYUID entry "101"
21408: remove GETPWBYNAME entry "messagebus"
nscd: mem.c:368: gc: Assertion `off_allocend <= db->head->first_free' failed.
Aborted
https://bugzilla.novell.com/show_bug.cgi?id=387202 User robin.listas@telefonica.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c96
--- Comment #96 from Carlos Robinson <robin.listas@telefonica.net> 2009-02-05 04:53:25 MST ---
Another one:

2138: Reloading "0.pool.ntp.org" in hosts cache!
2138: remove GETAI entry "0.pool.ntp.org"
2138: Reloading "1.ch.pool.ntp.org" in hosts cache!
2138: Reloading "0.es.pool.ntp.org" in hosts cache!
2138: Reloading "1.pool.ntp.org" in hosts cache!
2138: Reloading "2.pool.ntp.org" in hosts cache!
2138: Reloading "0.ch.pool.ntp.org" in hosts cache!
2138: Reloading "3.pool.ntp.org" in hosts cache!
2138: Reloading "0.uk.pool.ntp.org" in hosts cache!
2138: Reloading "0.fr.pool.ntp.org" in hosts cache!
2138: remove GETAI entry "1.ch.pool.ntp.org"
2138: remove GETAI entry "0.es.pool.ntp.org"
2138: remove GETAI entry "1.pool.ntp.org"
2138: remove GETAI entry "2.pool.ntp.org"
2138: remove GETAI entry "0.ch.pool.ntp.org"
2138: remove GETAI entry "3.pool.ntp.org"
2138: remove GETAI entry "0.uk.pool.ntp.org"
2138: remove GETAI entry "0.fr.pool.ntp.org"
2138: remove GETPWBYUID entry "51"
2138: remove GETPWBYNAME entry "nobody"
2138: remove GETPWBYUID entry "65534"
2138: remove GETPWBYNAME entry "postfix"
2138: remove GETPWBYNAME entry "upsd"
2138: remove GETPWBYUID entry "115"
Segmentation fault
nimrodel:~/Bugzilla/Bug_387202 # l
total 8
drwxr-xr-x 2 root root 4096 Feb 4 02:00 ./
drwxrwxr-x 33 cer root 4096 Feb 4 01:58 ../
-rw-r--r-- 1 root root 0 Feb 4 04:45 nscd.log

Feb 5 12:40:51 nimrodel kernel: nscd[2139]: segfault at fffffdc4 ip b7f8c822 sp adda5f5c error 4 in nscd[b7f7c000+1c000]

Well, as I see no comments on how to produce that core file, and it keeps crashing, I'm restarting the "normal" service with automatic watchdog restarting the service, instead of "nscd -d" in a terminal. I will not comment further unless I get feedback to the contrary, I see no point.
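Editorial aside: the kernel segfault line in comment #96 carries both the faulting instruction pointer (`ip`) and the base address at which the binary was mapped (the first number in `nscd[b7f7c000+1c000]`), so the offset of the crash inside the binary can be obtained by subtraction. The following is only a minimal sketch; the function name `segfault_offset` is made up for illustration, and it assumes the log line has exactly the format shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of nscd or glibc): given one kernel
# "segfault at ..." log line, print the offset of the faulting
# instruction relative to the mapping base, in hex.
segfault_offset() {
    # Extract the hex value following " ip ".
    ip=$(printf '%s\n' "$1" | sed -n 's/.* ip \([0-9a-f]*\) .*/\1/p')
    # Extract BASE from the trailing "name[BASE+SIZE]".
    base=$(printf '%s\n' "$1" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*\].*/\1/p')
    # Give up if the line did not have the expected shape.
    [ -n "$ip" ] && [ -n "$base" ] || return 1
    printf '0x%x\n' "$((0x$ip - 0x$base))"
}

segfault_offset 'Feb 5 12:40:51 nimrodel kernel: nscd[2139]: segfault at fffffdc4 ip b7f8c822 sp adda5f5c error 4 in nscd[b7f7c000+1c000]'
```

For the line from comment #96 this prints 0x10822. With matching debuginfo installed, such an offset could then be looked up with a tool like addr2line; whether that resolves cleanly depends on how the binary was built.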
https://bugzilla.novell.com/show_bug.cgi?id=387202 User hpj@urpla.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c97 --- Comment #97 from Hans-Peter Jansen <hpj@urpla.net> 2009-02-05 05:28:09 MST ---
> Well, as I see no comments on how to produce that core file, and it keeps crashing, I'm restarting the "normal" service with automatic watchdog restarting the service, instead of "nscd -d" in a terminal. I will not comment further unless I get feedback to the contrary, I see no point.

You truly have a point here, Carlos. FWIW, your crashes nicely sum up what I see very sporadically here.

Petr, I really wonder why you don't provide the nscd debug packages (see #77). Otherwise, similar to Carlos, I see no point in running nscd via gdb...
https://bugzilla.novell.com/show_bug.cgi?id=387202 User matz@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c98
--- Comment #98 from Michael Matz <matz@novell.com> 2009-02-05 07:18:33 MST ---
The debuginfo packages are right there where Petr said in comment #77. Maybe you are confused that there's no nscd-debug{info,source}? That's because debug packages don't exist for subpackages (and nscd is one of glibc's); you need to install glibc-debuginfo (-debugsource).

Also, Carlos: you haven't yet followed the guidelines of comment #66. You still use persistent databases. Yes, the comment talks about setting it to "0". Of course you should instead use "no" for all databases you have in nscd.conf. See nscd.conf(5).

But indeed, more logs are not necessary, I think; we see the assertions that cause nscd to exit. A core file would be a bit more useful. Core files aren't placed into the current pwd, but into the working dir of nscd, which usually is '/' (nscd chdir's into that one as a daemon). You probably have some lying around there.
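Editorial aside: comment #98 asks for `persistent <database> no` on every database in nscd.conf. A minimal sketch of that edit follows, assuming the config uses the one-directive-per-line layout shown in comment #92; the function name `disable_persistent` is hypothetical, and the sketch deliberately works as a filter so that nothing is changed in place.

```shell
#!/bin/sh
# Hypothetical filter: read an nscd.conf on stdin and write it to stdout
# with every "persistent <db> yes" line changed to "persistent <db> no".
# Comment lines (starting with "#") and all other directives pass through.
disable_persistent() {
    sed 's/^\([[:space:]]*persistent[[:space:]][[:space:]]*[^[:space:]][^[:space:]]*[[:space:]][[:space:]]*\)yes[[:space:]]*$/\1no/'
}

# Example use (not run here): write a modified copy and review the diff
# before installing it by hand.
#   disable_persistent < /etc/nscd.conf > /tmp/nscd.conf.new
#   diff -u /etc/nscd.conf /tmp/nscd.conf.new
```

After changing the setting, the daemon has to be restarted for it to take effect.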
https://bugzilla.novell.com/show_bug.cgi?id=387202 User swamp@suse.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c99
Swamp Script User <swamp@suse.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status Whiteboard|maint:released:11.0:21210 |maint:released:11.0:21210
                 |maint:running:22192 |maint:running:22192
                 |maint:released:11.0:22204 |maint:released:11.0:22204
                 | |maint:released:11.1:22237
--- Comment #99 from Swamp Script User <swamp@suse.com> 2009-02-05 07:44:46 MST ---
Update released for: glibc, glibc-debuginfo, glibc-debugsource, glibc-devel, glibc-html, glibc-i18ndata, glibc-info, glibc-locale, glibc-obsolete, glibc-profile, nscd
Products: openSUSE 11.1 (debug, i586, i686, ppc, ppc64, x86_64)
https://bugzilla.novell.com/show_bug.cgi?id=387202 Swamp Script User <swamp@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status Whiteboard|maint:released:11.0:21210 |maint:released:11.0:21210 |maint:running:22192 |maint:released:11.0:22204 |maint:released:11.0:22204 | |maint:released:11.1:22237 |
https://bugzilla.novell.com/show_bug.cgi?id=387202 User robin.listas@telefonica.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c100 --- Comment #100 from Carlos Robinson <robin.listas@telefonica.net> 2009-02-05 07:58:42 MST --- (In reply to comment #98)
> The debuginfo packages are right there where Petr said in comment #77. Maybe you are confused that there's no nscd-debug{info,source}? That's because debug packages don't exist for subpacks (and nscd is one of glibc), you need to install glibc-debuginfo (-debugsource).

Well, if you want me to install something in order to produce the coredump, tell me what exactly I should install.
> Also, Carlos: you didn't yet follow the guidelines of comment #66. You still use persistent databases. Yes, the comment talks about setting it to "0". Of course instead you should use "no" for all databases you have in nscd.conf. See nscd.conf(5).

No, I didn't; I said in #88 and #91 that I needed clarification. I still do. Do you mean I should use:

persistent whatever no
> But indeed, more logs are not necessary I think, we see the assertions that cause nscd to exit. core file would be a bit more useful. They aren't placed into the current pwd, but into the working dir of nscd, which usually is '/' (nscd chdir's into that one as daemon). You probably have some lying around there.

One of the crashes was not an assertion but a segfault.

No, there is no core on /. I can run an "updatedb; locate corewhatever", but I need to know the exact name to search for, because it yields upwards of 5713 entries. Or alternatively, an exact "find" command to find it.

As far as I can see, there is no core in:

/
/tmp
/var/run/nscd
/root/Bugzilla/Bug_387202 <-- pwd where I run "nscd -d"
/root
/home/cer

Note: the "nscd -d" command runs in an xterm where I did "su -" to root, in order to keep an eye on it. The xterm is under gnome. I remember some mention years ago of X blocking coredumps. But somebody here said he managed to produce coredumps with code dividing by zero, so it must be nscd which impedes them :-?

Suggestion: search for all assertions in the code and replace them with, or add, logger calls. In my programming days, an assertion was the last resort, and never used in production code. It was used instead of proper code to find an unexpected situation, never as error handling code.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User hpj@urpla.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c101 --- Comment #101 from Hans-Peter Jansen <hpj@urpla.net> 2009-02-05 12:06:38 MST ---
> The debuginfo packages are right there where Petr said in comment #77. Maybe you are confused that there's no nscd-debug{info,source}?
Yes.
> That's because debug packages don't exist for subpacks (and nscd is one of glibc), you need to install glibc-debuginfo (-debugsource).

Done that already. As noted before, it would be MUCH easier for every tester if Petr provided a zypp-compatible repo structure over there: then people could add the repo target, update, and install additional packages.
> But indeed, more logs are not necessary I think, we see the assertions that cause nscd to exit. core file would be a bit more useful. They aren't placed into the current pwd, but into the working dir of nscd, which usually is '/' (nscd chdir's into that one as daemon). You probably have some lying around there.
I got a nscd segfault (a few days ago) too, assertions also, but no core:
for f in $(locate core | egrep '\<core$'); do [ -f $f ] && l $f; done
lrwxrwxrwx 1 root root 11 7. Jan 22:20 /dev/core -> /proc/kcore
lrwxrwxrwx 1 root root 11 25. Dez 14:30 /lib/udev/devices/core -> /proc/kcore
-rw-r--r-- 1 root root 213 3. Dez 08:26 /var/adm/perl-modules/yast2-core

For whatever reason, something prevents the kernel from creating an nscd core file. Since two people see this behavior, I bet you won't see ANY cores from anybody as long as you cannot tell us how! Read: try to simulate an assertion or segfault with nscd, and get it to produce one. I bet, again, that this fails too. Now find the reason, tell us, and we're back in the game.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User bernet@physik.unizh.ch added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c102
--- Comment #102 from Roland Bernet <bernet@physik.unizh.ch> 2009-02-06 01:32:05 MST ---
Created an attachment (id=270688) --> (https://bugzilla.novell.com/attachment.cgi?id=270688) nscd backtrace

nscd crashed again on my openSUSE 11.0 with nscd-2.8-14.4. Still no core dump, but this time a backtrace:

23851: provide access to FD 6, for group
23851: Reloading "20915" in group cache!
*** glibc detected *** nscd: corrupted double-linked list: 0xb7f8d6e0 ***
======= Backtrace: =========
/lib/libc.so.6[0xb7de3fc4]
/lib/libc.so.6[0xb7de4264]
.
.

full output in the attached tar.gz file. It also includes my /etc/nscd.conf and /etc/nsswitch.conf files.

Hope it helps ...
https://bugzilla.novell.com/show_bug.cgi?id=387202 User matz@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c103 --- Comment #103 from Michael Matz <matz@novell.com> 2009-02-06 07:15:11 MST --- re #101: core files for multi-thread processes (which nscd is) aren't named "core", but rather "core.$PID", hence your egrep pattern won't find them.
> Read: try to simulate an assertion or segfault with nscd, and get it to produce one.

Easy:

% ulimit -c unlimited
% nscd -d &
[1] 28280
% kill -SEGV $!
[1]+ Segmentation fault (core dumped) nscd -d
% pwd; ls -l core.28280
/
-rw------- 1 root root 42344448 2009-02-06 15:10 core.28280

without debug:

% nscd
% pidof nscd
28304
% kill -SEGV $(pidof nscd)
% ls -l core.28304
-rw------- 1 root root 42344448 2009-02-06 15:12 core.28304
% file core.28280 core.28304
core.28280: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from 'nscd -d'
core.28304: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from 'nscd'
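Editorial aside: since comment #103 points out that the core files are named "core.$PID" rather than "core", a search has to accept both forms. The following is a minimal sketch using find; the function name `find_cores` is hypothetical, and `-xdev` is an assumption made to keep the search off /proc and other mounted filesystems, which means separately mounted directories would need their own invocation.

```shell
#!/bin/sh
# Hypothetical helper: list regular files named "core" or "core.<digit>..."
# below the given directory, without crossing filesystem boundaries.
# Symlinks such as /dev/core are skipped because of "-type f".
find_cores() {
    find "$1" -xdev -type f \( -name core -o -name 'core.[0-9]*' \) 2>/dev/null
}

# Example use (not run here):
#   find_cores /
```

Unlike the `egrep '\<core$'` pattern from comment #101, this also matches names such as core.28280, and it does not match names that merely end in "-core".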
https://bugzilla.novell.com/show_bug.cgi?id=387202 User hpj@urpla.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c104
--- Comment #104 from Hans-Peter Jansen <hpj@urpla.net> 2009-02-06 08:28:35 MST ---
Michael, not here, unfortunately:

xrated:/# export LANG=C
xrated:/# cat /etc/SuSE-release
openSUSE 11.1 (i586)
VERSION = 11.1
xrated:/# ulimit -c unlimited
xrated:/# nscd -d &
[1] 8139
xrated:/# kill -SEGV $!
xrated:/# pwd; ls -l core*
/
ls: cannot access core*: No such file or directory
[1]+ Segmentation fault nscd -d
xrated:/# nscd
xrated:/# pidof nscd
8206
xrated:/# kill -SEGV $(pidof nscd)
xrated:/# ls -l core*
ls: cannot access core*: No such file or directory
xrated:/# uname -a
Linux xrated 2.6.27.7-9-pae #1 SMP 2008-12-04 18:10:04 +0100 i686 athlon i386 GNU/Linux

Do you remember any further details which may prevent core dumping?
https://bugzilla.novell.com/show_bug.cgi?id=387202 User mmarek@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c105
--- Comment #105 from Michal Marek <mmarek@novell.com> 2009-02-06 08:34:06 MST ---
Does

$ /sbin/sysctl -a | grep kernel.core

show any unusual settings? Like kernel.core_pattern with an absolute path?
https://bugzilla.novell.com/show_bug.cgi?id=387202 User hpj@urpla.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c106
--- Comment #106 from Hans-Peter Jansen <hpj@urpla.net> 2009-02-06 08:37:54 MST ---
No.

xrated:/# /sbin/sysctl -a | grep kernel.core
kernel.core_uses_pid = 0
kernel.core_pattern = core
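Editorial aside: the question in comment #105 is whether kernel.core_pattern redirects core files away from the process's working directory. A minimal sketch of that check follows; the function name `classify_core_pattern` and its three labels are made up for illustration, and the only facts relied on are that a pattern starting with "/" names a fixed location and one starting with "|" is piped to a helper program.

```shell
#!/bin/sh
# Hypothetical helper: classify a kernel.core_pattern value.
classify_core_pattern() {
    case "$1" in
        '|'*) echo "pipe" ;;      # core is piped to a helper program
        /*)   echo "absolute" ;;  # core is written to a fixed path
        *)    echo "relative" ;;  # core lands in the process's working dir
    esac
}

# Example use (not run here):
#   classify_core_pattern "$(cat /proc/sys/kernel/core_pattern)"
```

The value "core" reported in comment #106 falls into the "relative" case, so the pattern itself was not what kept the core files from appearing.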
https://bugzilla.novell.com/show_bug.cgi?id=387202 User matz@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c107
--- Comment #107 from Michael Matz <matz@novell.com> 2009-02-06 09:18:45 MST ---
Is there enough space on '/'? Also note that your segfault message doesn't include the "(core dumped)" string, so it's really not even attempting to dump core. Very strange. What does 'ulimit -a' say in that very shell, after doing 'ulimit -c unlimited' and the forced segfault in nscd?
https://bugzilla.novell.com/show_bug.cgi?id=387202 User jnelson-suse@jamponi.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c108
--- Comment #108 from Jon Nelson <jnelson-suse@jamponi.net> 2009-02-06 09:40:48 MST ---
AppArmor may be getting in the way here, too.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User hpj@urpla.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c109
--- Comment #109 from Hans-Peter Jansen <hpj@urpla.net> 2009-02-06 13:57:15 MST ---
Bingo, that was the missing hint:

xrated:/# rcapparmor stop
Unloading AppArmor profiles done
xrated:/# nscd -d &
[1] 18566
xrated:/# kill -SEGV $!
xrated:/# ls -l core*
-rw------- 1 root root 143011840 Feb 6 21:27 core.18566
[1]+ Segmentation fault (core dumped) nscd -d

I've filed a bugzilla report about this silliness: https://bugzilla.novell.com/show_bug.cgi?id=473529

Would you please vote for it, thanks.

Now back to the 'real' problem, will keep nscd running in 'observation' mode.
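Editorial aside: before stopping AppArmor altogether, one could check whether a profile for nscd is loaded at all. The sketch below is hedged on two assumptions that are not confirmed anywhere in this thread: that the kernel exposes the loaded profiles as a text listing (commonly /sys/kernel/security/apparmor/profiles when securityfs is mounted), and that each line starts with the profile name, such as a path ending in "/nscd". The function name `nscd_profile_loaded` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a profile listing on stdin and succeed if
# some entry is named "nscd" or has a path ending in "/nscd".
# Assumes one profile per line, name first (an assumption, see above).
nscd_profile_loaded() {
    grep -Eq '(^|/)nscd([[:space:]]|$)'
}

# Example use (not run here; the path is an assumption):
#   nscd_profile_loaded < /sys/kernel/security/apparmor/profiles \
#       && echo "an AppArmor profile for nscd is loaded"
```

If the check succeeds, the profile is a candidate for what suppresses the core dump, which matches what stopping AppArmor showed in comment #109.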
https://bugzilla.novell.com/show_bug.cgi?id=387202 User bernet@physik.unizh.ch added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c110
--- Comment #110 from Roland Bernet <bernet@physik.unizh.ch> 2009-02-10 03:34:29 MST ---
Created an attachment (id=271484) --> (https://bugzilla.novell.com/attachment.cgi?id=271484) nscd core, nscd.conf, nsswitch.conf

nscd core dump from an openSUSE 11.0 system. In addition, I have added /etc/nscd.conf, /etc/nsswitch.conf and the standard output to the tar file.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User hpj@urpla.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c111
--- Comment #111 from Hans-Peter Jansen <hpj@urpla.net> 2009-02-12 14:00:42 MST ---
Created an attachment (id=272448) --> (https://bugzilla.novell.com/attachment.cgi?id=272448) nscd core, nscd.conf, nsswitch.conf

Here's one, that seems new (11.1 updated):

1827: provide access to FD 12, for hosts
1827: handle_request: request received (Version = 2) from PID 30345
1827: GETPWBYNAME (root)
1827: handle_request: request received (Version = 2) from PID 30345
1827: GETPWBYNAME (root)
1827: handle_request: request received (Version = 2) from PID 30345
1827: GETPWBYNAME (root)
1827: handle_request: request received (Version = 2) from PID 30345
1827: GETPWBYNAME (root)
1827: handle_request: request received (Version = 2) from PID 30345
1827: GETPWBYNAME (root)
1827: handle_request: request received (Version = 2) from PID 30346
1827: GETFDGR
1827: provide access to FD 9, for group
1827: handle_request: request received (Version = 2) from PID 30377
1827: GETFDHST
1827: provide access to FD 12, for hosts
1827: remove GETAI entry "xrated"
1827: remove GETHOSTBYADDR entry "127.0.0.2"
1827: Reloading "0" in password cache!
1827: remove INITGROUPS entry "root"
nscd: mem.c:477: gc: Assertion `next_hash == &he[db->head->nentries]' failed.
Aborted (core dumped)
https://bugzilla.novell.com/show_bug.cgi?id=387202 User harbaugh@ncifcrf.gov added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c114
Toni Harbaugh-Blackford <harbaugh@ncifcrf.gov> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |harbaugh@ncifcrf.gov
--- Comment #114 from Toni Harbaugh-Blackford <harbaugh@ncifcrf.gov> 2009-02-27 04:59:49 MST ---
This bug is marked NEEDINFO, but what info is needed? I've lost track.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User pbaudis@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c115
Petr Baudis <pbaudis@novell.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P1 - Urgent |P3 - Medium
Status|NEEDINFO |ASSIGNED
Info Provider|pbaudis@novell.com |
Severity|Critical |Major
--- Comment #115 from Petr Baudis <pbaudis@novell.com> 2009-03-24 05:40:39 MST ---
Removing bogus NEEDINFO; there are some new crashes to look at, but I'm decreasing priority and severity since they seem to happen much more rarely.
https://bugzilla.novell.com/show_bug.cgi?id=387202 User robin.listas@telefonica.net added comment https://bugzilla.novell.com/show_bug.cgi?id=387202#c116 --- Comment #116 from Carlos Robinson <robin.listas@telefonica.net> 2009-03-24 09:04:41 MST --- (In reply to comment #115)
> Removing bogus NEEDINFO; there are some new crashes to look at, but I'm decreasing priority and severity since they seem to happen much more rarely.
It crashes several times per day here - see some of my last watchdog entries - I have seen it crash four times in an hour:

Mar 21 03:35:02 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 21 06:45:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 21 14:20:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 21 16:40:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 21 17:55:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 21 22:55:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 22 04:45:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 22 06:40:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 22 12:45:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 22 13:20:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 22 14:45:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 22 17:05:02 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 22 20:45:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 22 22:45:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 23 00:05:02 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 23 03:20:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 23 03:25:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 23 03:30:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 23 03:35:02 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 23 03:45:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 23 05:45:02 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 23 13:50:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 23 23:05:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd
Mar 24 02:20:01 nimrodel watchdog: nscd is not running, restarting. -- Bugzilla 387202; see root's crontab to disable this wd

You should clear all asserts from the C code: they are not logged to syslog, only to console.

Not all crashes are segfaults - see the kernel log for the same period:

Mar 21 03:34:27 nimrodel kernel: nscd[3461]: segfault at bffe0178 ip b7e8e32e sp afe1d034 error 6 in libc-2.8.so[b7e20000+13d000]
Mar 21 22:51:12 nimrodel kernel: nscd[19438]: segfault at b8000012 ip b7f66825 sp addbee6c error 4 in nscd[b7f56000+1c000]
Mar 22 06:37:28 nimrodel kernel: nscd[31545]: segfault at bfffe44c ip b7fe4822 sp ade3ceec error 4 in nscd[b7fd4000+1c000]
Mar 22 20:41:34 nimrodel kernel: nscd[10540]: segfault at bfff66bc ip b80bc825 sp adf14f5c error 4 in nscd[b80ac000+1c000]
Mar 23 03:19:59 nimrodel kernel: nscd[17683]: segfault at fff1518c ip b7e42450 sp ade070b8 error 4 in libc-2.8.so[b7dd5000+13d000]
cer@nimrodel:~>
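Editorial aside: the disagreement in comments #115 and #116 is about crash frequency, which can be read off the watchdog entries by counting restarts per day. The following is a minimal sketch, assuming syslog lines in the format quoted in comment #116 (month and day in the first two fields); the function name `restarts_per_day` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on stdin and print one
# "<month> <day> <count>" line per day on which the watchdog restarted
# nscd, in the order the days first appear in the log.
restarts_per_day() {
    awk '/watchdog: nscd is not running/ {
             d = $1 " " $2
             if (!(d in n)) order[++k] = d   # remember first appearance
             n[d]++
         }
         END { for (i = 1; i <= k; i++) print order[i], n[order[i]] }'
}

# Example use (not run here; the log path is an assumption):
#   restarts_per_day < /var/log/messages
```

Lines from other sources, such as the kernel segfault messages, are ignored, so the same log file can be fed in unfiltered.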
https://bugzilla.novell.com/show_bug.cgi?id=387202

User pbaudis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c117

--- Comment #117 from Petr Baudis <pbaudis@novell.com> 2009-03-25 13:49:02 MST ---
Clearing asserts will just make nscd segfault a few moments later, at a place that's even harder to debug. :-(
https://bugzilla.novell.com/show_bug.cgi?id=387202

User robin.listas@telefonica.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c118

--- Comment #118 from Carlos Robinson <robin.listas@telefonica.net> 2009-03-25 17:16:06 MST ---
Of course. Clearing an assert doesn't mean commenting it out, just handling the error condition cleanly. Once an assert triggers, it means that a situation the programmer thought impossible has in fact happened, and thus the code has to be changed to keep that situation from happening.

On the other hand, an assert in such a daemon just kills the daemon silently, without any message to the user/admin. An assert is intended as a message from the dead to the creator of the program, so that the creator can reprogram the cylon. This is not happening. Those asserts are useless.

Instead, the assert message should be sent to syslog with "warn" or "critical" level, and then the program halted (after logging the situation). At best, the assert could be used to restart or reinit the daemon (the idea of dividing nscd into a parent and child is not so bad).
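A minimal sketch of what such a syslogged assert might look like in C. The names LOGGED_ASSERT, logged_assert_fail and format_assert_msg are hypothetical; they are not part of nscd or glibc. The message mirrors the shape of the assertion lines quoted in this bug, is logged at LOG_CRIT, and the process then calls abort() so a core dump can still be produced.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <syslog.h>

/* Hypothetical helper: build a message shaped like the assertion
   lines quoted in this bug report ("mem.c:392: gc: Assertion ...").  */
static int
format_assert_msg (char *buf, size_t len, const char *file, int line,
                   const char *func, const char *expr)
{
  return snprintf (buf, len, "%s:%d: %s: Assertion `%s' failed.",
                   file, line, func, expr);
}

/* Hypothetical failure handler: log first, then halt.  abort() raises
   SIGABRT, so a core dump is still written if the system allows it.  */
static void
logged_assert_fail (const char *file, int line, const char *func,
                    const char *expr)
{
  char msg[512];

  format_assert_msg (msg, sizeof msg, file, line, func, expr);
  syslog (LOG_CRIT, "%s", msg);   /* reaches the admin via syslog */
  fprintf (stderr, "%s\n", msg);  /* keep the console message too */
  abort ();
}

/* Hypothetical replacement for assert(): same calling convention,
   but a failure goes through logged_assert_fail().  */
#define LOGGED_ASSERT(expr) \
  ((expr) ? (void) 0 \
          : logged_assert_fail (__FILE__, __LINE__, __func__, #expr))
```

With this in place, a check such as LOGGED_ASSERT (off_alloc == off_allocend) would leave a line in syslog before the daemon dies. It does not fix the underlying corruption; it only makes the failure visible.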
https://bugzilla.novell.com/show_bug.cgi?id=387202

User pbaudis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=387202#c119

--- Comment #119 from Petr Baudis <pbaudis@novell.com> 2009-03-25 17:24:07 MST ---
Changing the code to avoid the situation happening is the hard part, unfortunately. ;-)

I agree that it would be nice if the assert() were syslogged. I will try to make a patch when I finish the more urgent things on my hands. The asserts still certainly aren't useless, since the assert message alone does not help much anyway - you need to grab a core dump in order to really debug this.

95% of the crashes happen during the database prune cycle; at this point, little but a complete state reset can be done, and that's then pretty much equivalent to the watchdog solution.
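The parent/child split from comment #118 amounts to the same "complete state reset": a crashed child is simply replaced by a fresh one. A rough sketch in C follows, under the assumption that all cache state lives in the child. The function supervise() and the two demo workers are hypothetical and only illustrate the idea; this is not how nscd is structured.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <syslog.h>
#include <unistd.h>

/* Hypothetical supervisor: run worker() in a child process and start a
   fresh child whenever the previous one is killed by a signal (SIGABRT
   from a failed assert, SIGSEGV, ...).  Returns the number of restarts
   performed, or -1 if fork() or waitpid() fails.  */
static int
supervise (int (*worker) (void), int max_restarts)
{
  int restarts = 0;

  for (;;)
    {
      pid_t pid = fork ();
      if (pid < 0)
        return -1;
      if (pid == 0)
        _exit (worker ());        /* child: does the real work */

      int status;
      if (waitpid (pid, &status, 0) < 0)
        return -1;

      if (!WIFSIGNALED (status))
        return restarts;          /* clean exit: stop supervising */

      syslog (LOG_CRIT, "worker %ld killed by signal %d",
              (long) pid, WTERMSIG (status));
      if (restarts >= max_restarts)
        return restarts;          /* give up after too many crashes */
      ++restarts;                 /* a new child means a full state reset */
    }
}

/* Demo workers, for illustration only.  */
static int worker_ok (void) { return 0; }
static int worker_crash (void) { abort (); }
```

Compared with the cron watchdog, the parent notices the crash immediately and can log the signal, but the result is the same: the cache starts again from scratch.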