[Bug 249809] New: memory leak in sysp will kill y2base in second stage
https://bugzilla.novell.com/show_bug.cgi?id=249809

           Summary: memory leak in sysp will kill y2base in second stage
           Product: openSUSE 10.3
           Version: Alpha 1plus
          Platform: PowerPC
        OS/Version: Linux
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Installation
        AssignedTo: yast2-maintainers@suse.de
        ReportedBy: olh@novell.com
         QAContact: jsrain@novell.com
                CC: power-bugs@forge.provo.novell.com

10.3a1+ on mango.suse.de

This iBook has 160 MB of memory. Once yast gets to the hardware configuration
screen, sysp is started. It consumes all memory. The kernel kills y2base (and
other processes) because it runs out of memory.
This top snapshot was taken while yast was killed. The sysp process kept
running until the network was shut down. Shortly before that, it consumed more
than 120MB of memory.

top - 16:59:38 up 50 min,  1 user,  load average: 1.67, 0.47, 0.28
Tasks:  52 total,   4 running,  48 sleeping,   0 stopped,   0 zombie
Cpu(s): 63.4%us,  5.0%sy,  0.0%ni,  0.0%id, 30.7%wa,  1.0%hi,  0.0%si,  0.0%st
Mem:    157152k total,   155324k used,     1828k free,       44k buffers
Swap:   136544k total,   136544k used,        0k free,     4388k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6272 root      18   0 93840  82m  528 R 64.5 54.1   0:14.41 sysp
 3775 root      18   0  238m  48m 2852 S  0.0 31.4   7:26.19 y2base
 6231 root      20   0 24764 4840  836 S  0.0  3.1   0:01.01 init.pl
 5717 root      18   0  8956 2304  400 S  0.0  1.5   0:01.30 ag_uid
 3706 root      15   0 25544 1740  688 S  0.0  1.1   0:13.16 Xorg
 3050 haldaemo  15   0  7112 1144  712 S  0.0  0.7   0:03.69 hald
 3714 root      15   0 10052  880  632 S  0.0  0.6   0:00.81 fvwm2
 4673 root      15   0 14284  852  664 S  0.0  0.5   0:00.09 NetworkManager
 4380 root      15   0 12180  676  540 S  0.0  0.4   0:00.65 sshd
 4433 root      15   0  2836  656  488 R  2.0  0.4   0:16.13 top
 4672 root      15   0  3912  564  560 S  0.0  0.4   0:00.03 NetworkManagerD
 4671 root      15   0  2956  544  440 R  0.0  0.3   0:00.03 dhcdbd
 3051 root      25   0  3524  432  428 S  0.0  0.3   0:00.07 hald-runner
 4338 root      25   0  4440  432  400 S  0.0  0.3   0:00.19 ag_initscripts
 4416 root      16   0  5052  416  416 S  0.0  0.3   0:00.42 bash
 1890 root      25   0  4440  412  408 S  0.0  0.3   0:00.77 YaST2.Second-St
 3172 root      25   0  4572  412  408 S  0.0  0.3   0:00.49 YaST2.call
 4094 root      20   0  4440  400  400 S  0.0  0.3   0:00.07 ag_xauth

10.2 installed ok
10.3a1 installed ok, modulo the hwinfo --framebuffer bug that prevented sax
from working.

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #1 from olh@novell.com 2007-02-28 09:11 MST ------- Created an attachment (id=121612) --> (https://bugzilla.novell.com/attachment.cgi?id=121612&action=view) bug249809.tar.bz2
https://bugzilla.novell.com/show_bug.cgi?id=249809

ma@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|yast2-maintainers@suse.de   |ms@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=249809

ms@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|ms@novell.com               |sndirsch@novell.com

------- Comment #2 from ms@novell.com 2007-03-01 06:25 MST -------
While running in gdb I got:

  gdb --args sysp -s xstuff

  Program received signal SIGABRT, Aborted.
  0x0fad3b80 in raise () from /lib/libc.so.6
  (gdb) bt
  #0  0x0fad3b80 in raise () from /lib/libc.so.6
  #1  0x0fad54c0 in abort () from /lib/libc.so.6
  #2  0x0facb77c in __assert_fail () from /lib/libc.so.6
  #3  0x0fa1bc0c in xcb_xlib_lock () from /usr/lib/libxcb-xlib.so.0
  #4  0x0fdad77c in ?? () from /usr/lib/libX11.so.6
  #5  0x0fd08d58 in XF86MiscGetMouseSettings () from /usr/lib/libXxf86misc.so.1

If I set:

  unset DISPLAY
  sysp -s xstuff

I got:

  Card0 => DDC       : <undefined>
  Card0 => Name      : Monitor
  Card0 => Vendor    : Generic
  Card0 => Primary   : 00-16-0
  Card0 => Chipset   : <undefined>
  Card0 => Vsync     : 61
  Card0 => Hsync     : 38
  Card0 => Vesa      : 800 600 37 60
  Card0 => FbTiming  : "800x600" 39.95 800 856 984 1056 600 601 605 628 +HSync +VSync
  Card0 => Dacspeed  : 220
  Card0 => Modeline  : <undefined>
  Card0 => Memory    : 4096
  Card0 => Current   : 00-16-0
  Card0 => RawDef    : None
  Card0 => Option    : None
  Card0 => Extension : None
  Card0 => Module    : ati
  Card0 => Display   : CRT
  Card0 => VesaBios  : <undefined>

So this looks good. I think the problem is in X while using the misc extension.
https://bugzilla.novell.com/show_bug.cgi?id=249809

sndirsch@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mhopf@novell.com,
                   |                            |eich@novell.com,
                   |                            |ms@novell.com
             Status|NEW                         |NEEDINFO
      Info Provider|                            |olh@novell.com

------- Comment #3 from sndirsch@novell.com 2007-03-01 06:47 MST -------
This is probably another application bug.

libxcb-1.0/src/xcb_xlib.c:
[...]
  void xcb_xlib_lock(xcb_connection_t *c)
  {
      _xcb_lock_io(c);
      assert(!c->xlib.lock);
      c->xlib.lock = 1;
      c->xlib.thread = pthread_self();
      _xcb_unlock_io(c);
  }

We need some -debuginfo packages, so we get a more useful backtrace and can
debug this. Unfortunately the filesystem is too small on mango. Could you fix
this, Olaf? Thanks.
https://bugzilla.novell.com/show_bug.cgi?id=249809

olh@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
      Info Provider|olh@novell.com              |

------- Comment #4 from olh@novell.com 2007-03-01 07:14 MST -------
There is now 1G available.
https://bugzilla.novell.com/show_bug.cgi?id=249809

sndirsch@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Priority|P5 - None                   |P2 - High
https://bugzilla.novell.com/show_bug.cgi?id=249809

------- Comment #5 from sndirsch@novell.com 2007-03-01 12:16 MST -------
JFYI, without gdb:

  # sysp -s xstuff
  sysp: xcb_xlib.c:41: xcb_xlib_lock: Assertion `!c->xlib.lock' failed.
  Aborted
https://bugzilla.novell.com/show_bug.cgi?id=249809

------- Comment #6 from sndirsch@novell.com 2007-03-01 13:29 MST -------
"XF86MiscGetMouseSettings" is called twice by xstuff (first in disableMouse
and second in enableMouse). The second time it's called we see the assertion,
when "LockDisplay(dpy)" is called. This looks like a missing
"UnlockDisplay(dpy)" in "XF86MiscGetMouseSettings". BTW, I found other
XF86Misc functions which are also lacking "UnlockDisplay(dpy)" calls.

sysp/lib/hw/mouse.c:
[...]
  if (haveDisplay) {
      disableMouse (dpy);
  }
[...]
  if (haveDisplay) {
      enableMouse (dpy);
  }
[...]
  //===================================
  // enableMouse...
  //-----------------------------------
  void enableMouse (Display* dpy) {
      XSetErrorHandler (catchErrors);
      XF86MiscMouseSettings mseinfo;
      if (!XF86MiscGetMouseSettings(dpy, &mseinfo)) {
          return;
      }
      mseinfo.flags |= MF_REOPEN;
      XF86MiscSetMouseSettings(dpy, &mseinfo);
      XSync(dpy, False);
  }

  //===================================
  // disableMouse...
  //-----------------------------------
  void disableMouse (Display* dpy) {
      XSetErrorHandler (catchErrors);
      XF86MiscMouseSettings mseinfo;
      if (!XF86MiscGetMouseSettings(dpy, &mseinfo)) {
          return;
      }
      mseinfo.flags |= MF_REOPEN;
      mseinfo.device = "/dev/unused";
      XF86MiscSetMouseSettings(dpy, &mseinfo);
      XSync(dpy, False);
  }
[...]
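The failure mode described in comment #6 can be modelled in isolation. What follows is a minimal sketch, not the libXxf86misc code: toy_display, toy_lock, toy_unlock and the two get_settings_* functions are made-up names for illustration, and the toy lock returns an error code where the real xcb_xlib_lock() aborts through assert(). It only shows why an early return that skips the unlock makes the second call fail.

```c
#include <assert.h>

/* Toy stand-in for a display connection; NOT the real Xlib/XCB type. */
typedef struct {
    int locked;
} toy_display;

/* Take the non-recursive lock. Returns 0 on success, -1 if it is already
   held (the real xcb_xlib_lock() would hit its assertion here). */
static int toy_lock(toy_display *d)
{
    if (d->locked)
        return -1;
    d->locked = 1;
    return 0;
}

static void toy_unlock(toy_display *d)
{
    d->locked = 0;
}

/* Buggy pattern: the error path returns while the lock is still held.
   Returns 1 on success, 0 on (simulated) failure, -2 if locking failed. */
static int get_settings_buggy(toy_display *d, int simulate_failure)
{
    if (toy_lock(d) != 0)
        return -2;
    if (simulate_failure)
        return 0;            /* BUG: missing toy_unlock(d) */
    toy_unlock(d);
    return 1;
}

/* Fixed pattern: every exit path releases the lock. */
static int get_settings_fixed(toy_display *d, int simulate_failure)
{
    if (toy_lock(d) != 0)
        return -2;
    if (simulate_failure) {
        toy_unlock(d);
        return 0;
    }
    toy_unlock(d);
    return 1;
}
```

With the buggy variant, a failing first call followed by any second call on the same toy_display returns -2, which corresponds to the point where sysp aborts. With the fixed variant the second call proceeds normally.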
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #7 from sndirsch@novell.com 2007-03-01 13:43 MST ------- Created an attachment (id=121892) --> (https://bugzilla.novell.com/attachment.cgi?id=121892&action=view) libXxf86misc-xcb.diff Patch for libXxf86misc
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #8 from sndirsch@novell.com 2007-03-01 14:19 MST ------- The patch fixes the assertion, but makes the machine unresponsive after a while. :-(
https://bugzilla.novell.com/show_bug.cgi?id=249809

sndirsch@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |olh@novell.com

------- Comment #9 from sndirsch@novell.com 2007-03-01 14:27 MST -------
On an x86 machine the patch doesn't hurt. Olaf, could you reboot mango (ping
still works, but ssh doesn't respond any longer)? Thanks.
https://bugzilla.novell.com/show_bug.cgi?id=249809

olh@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|olh@novell.com              |

------- Comment #10 from olh@novell.com 2007-03-02 03:55 MST -------
It is back now. It's the iBook on my desk.
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #11 from sndirsch@novell.com 2007-03-02 04:41 MST ------- Machine is unresponsive again. Seems to be reproducible. :-(
https://bugzilla.novell.com/show_bug.cgi?id=249809

------- Comment #12 from sndirsch@novell.com 2007-03-02 04:43 MST -------
Not really. After some minutes I got:

  # sysp -s xstuff
  sysp: xcb_io.c:463: _XRead: Assertion `dpy->xcb->reply_consumed + size <= dpy->xcb->reply_length' failed.
  Aborted
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #13 from sndirsch@novell.com 2007-03-02 04:56 MST ------- Trying to run it in gdb right now ... which makes the machine currently unresponsive. :-(
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #14 from eich@novell.com 2007-03-02 05:33 MST ------- This code path is taken in an error situation. It probably never happened before. It would be interesting to find out why the Xcalloc () fails which leads to this error condition.
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #15 from eich@novell.com 2007-03-02 05:38 MST ------- Stefan, could you try to get the value of rep.devnamelen? If the value is too big this could point to a problem with the endian conversion on the protocol level.
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #16 from olh@novell.com 2007-03-02 05:45 MST ------- I tried to run it with ulimit -v on mac.suse.de, but could not reproduce it.
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #17 from sndirsch@novell.com 2007-03-02 06:21 MST ------- Olaf, could you reboot mango again?
https://bugzilla.novell.com/show_bug.cgi?id=249809

------- Comment #18 from sndirsch@novell.com 2007-03-02 07:36 MST -------
(In reply to comment #15)
> Stefan, could you try to get the value of rep.devnamelen? If the value is too
> big this could point to a problem with the endian conversion on the protocol
> level.

  (gdb) p /x rep.devnamelen
  $6 = 0xf000000

We definitely still have an endianness problem in libXxf86misc. :-(
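As a rough check on the number reported for rep.devnamelen: 0xf000000 is 251658240, i.e. 240 MiB, which is more than the 160 MB the machine has. With its four bytes reversed the same value is 0xf, i.e. 15, which would be a plausible length for a device name. That reading is an interpretation and is not confirmed anywhere in this report. The sketch below only does the arithmetic; swap32 and looks_byte_swapped are hypothetical helper names, and the 4096 limit used in the usage note is an arbitrary assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value (portable, no compiler builtins). */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) << 8)  |
           ((v & 0x00ff0000u) >> 8)  |
           ((v & 0xff000000u) >> 24);
}

/* Hypothetical sanity check: a length looks byte-swapped if it exceeds the
   given limit while its byte-reversed form does not. */
static int looks_byte_swapped(uint32_t len, uint32_t limit)
{
    return len > limit && swap32(len) <= limit;
}
```

For example, swap32(0x0f000000) yields 15, and looks_byte_swapped(0x0f000000, 4096) is true while looks_byte_swapped(15, 4096) is false.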
https://bugzilla.novell.com/show_bug.cgi?id=249809

sndirsch@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sndirsch@novell.com
         AssignedTo|sndirsch@novell.com         |eich@novell.com
             Status|ASSIGNED                    |NEW

------- Comment #19 from sndirsch@novell.com 2007-03-02 08:52 MST -------
Reassigning to Egbert. He definitely has the best knowledge of libXxf86misc.
https://bugzilla.novell.com/show_bug.cgi?id=249809

eich@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #21 from olh@novell.com 2007-05-10 01:34 MST ------- how is the console messed up?
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #22 from olh@novell.com 2007-05-11 01:14 MST ------- are you able to boot from CD1? go to the openfirmware cmdline prompt, pressing escape during boot. boot cd suseboot/inst32
https://bugzilla.novell.com/show_bug.cgi?id=249809

------- Comment #23 from eich@novell.com 2007-05-11 10:34 MST -------
Olaf, could it be that this happened on a connection between a BE and an LE
machine?
X doesn't convert to network byte order. Instead both ends exchange
information on the byte order at connection setup. When both have the same
byte order, they don't bother. When they differ, the server needs to convert.
I'll check - maybe this isn't implemented for the XF86Misc extension.

Olaf, I currently have huge problems with a network installation of factory
from an official mirror. Apart from that, the kernel that I got from a
previous installation didn't boot. It messed up the display and didn't go any
further.
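The byte-order exchange described in comment #23 can be sketched as follows. In the X11 core protocol the first byte a client sends at connection setup is 'B' (0x42) for most-significant-byte-first or 'l' (0x6c) for least-significant-byte-first. Everything else here is illustrative: host_byte_order_tag and server_must_swap are made-up helper names and are not part of Xlib or the X server.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the byte-order byte this host would announce at connection setup:
   'B' = most significant byte first, 'l' = least significant byte first. */
static char host_byte_order_tag(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    /* Look at the first byte in memory of the 16-bit value 1. */
    memcpy(&first, &probe, 1);
    return first == 1 ? 'l' : 'B';
}

/* The server only has to byte-swap multi-byte fields when the client's
   announced order differs from its own. */
static int server_must_swap(char client_tag, char server_tag)
{
    return client_tag != server_tag;
}
```

A client and a server running on the same machine always compute the same tag, so server_must_swap() is false for a purely local connection. It is true for a 'B' client talking to an 'l' server, which is the case where every extension request and reply needs a correct swapping routine on the server side.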
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #24 from olh@novell.com 2007-05-14 07:52 MST ------- Egbert, everything happened locally on the system. I will install 10.3a4 now on mango to see if the memory leak still exists. I'm not sure what 'messed up the display' means. Are the penguins during boot already corrupted?
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #25 from eich@novell.com 2007-05-14 12:55 MST ------- The memory leak is likely to still exist I assume. Is this the only BE system you have seen this issue on so far? There are no penguins at boot, the box doesn't even get that far, it looks like a messed up vga text mode with differently colored font boxes.
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #26 from olh@novell.com 2007-05-14 12:57 MST ------- Can you try a different monitor or TFT? It's working for me with two different monitors. Maybe booting with video=radeonfb:1024x768@75 helps.
https://bugzilla.novell.com/show_bug.cgi?id=249809

olh@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|olh@novell.com              |

------- Comment #27 from olh@novell.com 2007-05-15 02:16 MST -------
10.3a4 does not reproduce the memory leak with a 'minimal x' install. Maybe
because it's fixed, or because the code paths are not taken.
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #28 from eich@novell.com 2007-05-15 23:08 MST ------- Olaf, would it be possible that you set up the machine with the exact version that triggers this bug so that I can reproduce this? It would be worthwhile to understand the problem - however it's only possible if I can reproduce it. Otherwise we can close this ticket. I'd have to rebuild some pieces of X (without going thru autobuild) so we'd need all devel packages and sdk installed. I could also build on another machine as long as I can access /mounts/space from both.
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #29 from olh@novell.com 2007-05-18 02:38 MST ------- I do not have alpha1plus install media anymore.
https://bugzilla.novell.com/show_bug.cgi?id=249809

------- Comment #30 from eich@novell.com 2007-05-18 05:47 MST -------
Olaf, what do you think we should do about this?
- Nobody else reported this problem.
- You cannot reproduce it any more with another system.

It's likely a rather obscure problem that cannot easily be identified by
looking at the code. On the other hand it's not understood and therefore can
pop up any time. I'm not satisfied to say it's fixed because it doesn't
surface any more. I'd be more satisfied if I understood it and could make
sure it doesn't happen.
https://bugzilla.novell.com/show_bug.cgi?id=249809 ------- Comment #31 from olh@novell.com 2007-05-18 06:53 MST ------- Unless you find questionable code in the error paths, we can only close it. Unless someone still has a copy of that old snapshot.
https://bugzilla.novell.com/show_bug.cgi?id=249809
Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=249809#c32
Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=249809#c33
Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=249809
Olaf Hering