http://bugzilla.opensuse.org/show_bug.cgi?id=918226
Bruno Prémont changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |bruno.premont@restena.lu
--- Comment #33 from Bruno Prémont ---
Got it here too on a server (virtual, under VMware) which is executing a rather
large number of nrpe checks (and thus sees a lot of batched forking).
The segfault shows up in the kernel log:
systemd[1]: segfault at 1010514 ip 000000000047912e sp 00007fff9a1c5670 error 4 in systemd[400000+ed000]
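For readers unfamiliar with the format of that log line, here is a minimal sketch in C that splits it into its fields and decodes the x86 page-fault error code. The struct and function names are made up for illustration. On x86, bit 0 of the error code is set for a protection violation (clear means the page was not present), bit 1 for a write access, and bit 2 for a fault raised from user mode, so "error 4" reads as a user-mode read of an unmapped address.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields of a kernel "segfault at ..." line (x86). Hypothetical helper type. */
struct segv_info {
    unsigned long addr;   /* faulting address */
    unsigned long ip;     /* instruction pointer at the time of the fault */
    unsigned long sp;     /* stack pointer */
    unsigned long error;  /* page-fault error code (printed in hex) */
    unsigned long base;   /* start of the mapping that contains ip */
    unsigned long size;   /* size of that mapping */
};

/* Parse a full log line such as "systemd[1]: segfault at ...".
 * Returns 0 on success, -1 if the line does not match. */
int parse_segv(const char *line, struct segv_info *out)
{
    assert(line && out);
    const char *p = strstr(line, "segfault at ");
    if (!p)
        return -1;
    int n = sscanf(p,
                   "segfault at %lx ip %lx sp %lx error %lx in %*[^[][%lx+%lx]",
                   &out->addr, &out->ip, &out->sp, &out->error,
                   &out->base, &out->size);
    return n == 6 ? 0 : -1;
}

/* Decode the three low bits of the x86 page-fault error code. */
int segv_page_present(const struct segv_info *s) { return (s->error & 1) != 0; }
int segv_is_write(const struct segv_info *s)     { return (s->error & 2) != 0; }
int segv_is_user(const struct segv_info *s)      { return (s->error & 4) != 0; }

/* Offset of the faulting instruction inside its mapping. */
unsigned long segv_ip_offset(const struct segv_info *s) { return s->ip - s->base; }
```

Applied to the line above, this gives a fault address of 0x1010514, an instruction pointer 0x7912e bytes into the mapping that starts at 0x400000, and a user-mode read of a page that was not present.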
Started happening with the update from systemd-208-23.3.x86_64 to
systemd-208-28.1.x86_64.
Looking up the instruction pointer with addr2line, using the debuginfo and
debugsource packages, I get:
addr2line -e /usr/lib/systemd/systemd 0x47912e
/usr/src/debug/systemd-208/src/core/unit.c:1682
/usr/src/debug/systemd-208/src/core/unit.c
1677:
1678: void unit_unwatch_pid(Unit *u, pid_t pid) {
1679:         assert(u);
1680:         assert(pid >= 1);
1681:
1682:         hashmap_remove_value(u->manager->watch_pids, LONG_TO_PTR(pid), u);
1683:         set_remove(u->pids, LONG_TO_PTR(pid));
1684: }
1685:
This seems to match the report from comment #8 with a NULL u->manager.
Once systemd has crashed, nrpe zombies start piling up (PID 1 no longer reaps
them) until the kernel refuses to create more processes (clone() returns -1
with errno=EAGAIN) because of the rlimit on the per-user process count
(RLIMIT_NPROC).
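To watch this happen on an affected machine, something along the following lines could be used. It is only a sketch: the function names are invented here, the zombie count is system-wide rather than per user, and it assumes a Linux /proc. It reads the state letter from each /proc/<pid>/stat (a zombie shows as Z) and queries the soft RLIMIT_NPROC of the calling process.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

/* Extract the state letter from the contents of /proc/<pid>/stat.
 * The command name sits in parentheses and may itself contain ')' or
 * spaces, so look for the last ')'. Returns 0 if the text is malformed. */
char proc_stat_state(const char *stat)
{
    assert(stat);
    const char *p = strrchr(stat, ')');
    if (!p || p[1] != ' ' || p[2] == '\0')
        return 0;
    return p[2];
}

/* Count zombie processes by scanning /proc. Returns -1 if /proc
 * cannot be opened. */
long count_zombies(void)
{
    DIR *d = opendir("/proc");
    if (!d)
        return -1;

    long zombies = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (!isdigit((unsigned char)e->d_name[0]))
            continue;                       /* not a PID directory */
        char path[300], buf[512];
        snprintf(path, sizeof path, "/proc/%s/stat", e->d_name);
        FILE *f = fopen(path, "r");
        if (!f)
            continue;                       /* process went away meanwhile */
        if (fgets(buf, sizeof buf, f) && proc_stat_state(buf) == 'Z')
            zombies++;
        fclose(f);
    }
    closedir(d);
    return zombies;
}

/* Store the soft per-user process limit in *soft. Returns 0 on success,
 * -1 on error. A value of RLIM_INFINITY means "no limit". */
int nproc_limit(rlim_t *soft)
{
    assert(soft);
    struct rlimit rl;
    if (getrlimit(RLIMIT_NPROC, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;
    return 0;
}
```

Comparing count_zombies() against the limit over time should show the zombie count climbing toward the limit before clone() starts failing with EAGAIN.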
One possible reason why nrpe triggers this bug more than anything else is that
it forks a few levels deep for each check and seems to have some of its
children reparented to init. nrpe is running as a daemon and not as an xinetd
service.
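The reparenting pattern can be reproduced in isolation. The sketch below is not nrpe's code, and the function name is made up. It forks a child that forks a grandchild and exits at once, then reports which process the orphaned grandchild ends up with as its parent. On Linux that is PID 1, or the nearest process that marked itself as a child subreaper via prctl(PR_SET_CHILD_SUBREAPER).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that forks a grandchild and exits immediately.
 * Returns the grandchild's parent PID as seen after the intermediate
 * child is gone, or -1 on error. On success *mid_pid holds the PID of
 * the intermediate child. */
pid_t orphan_new_parent(pid_t *mid_pid)
{
    assert(mid_pid);
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (child == 0) {
        close(fds[0]);
        pid_t self = getpid();
        pid_t gc = fork();
        if (gc == 0) {
            /* Grandchild: wait (up to about 5 s) until the parent changes. */
            for (int i = 0; i < 500 && getppid() == self; i++)
                usleep(10000);
            pid_t pp = getppid();
            ssize_t w = write(fds[1], &pp, sizeof pp);
            _exit(w == (ssize_t)sizeof pp ? 0 : 1);
        }
        /* Intermediate child: exit without waiting for the grandchild. */
        _exit(gc < 0 ? 1 : 0);
    }

    close(fds[1]);
    *mid_pid = child;
    int status;
    waitpid(child, &status, 0);             /* reap the intermediate child */

    pid_t pp = -1;
    ssize_t r = read(fds[0], &pp, sizeof pp);
    close(fds[0]);
    return r == (ssize_t)sizeof pp ? pp : -1;
}
```

When the orphan exits, it is up to the new parent to reap it. If that parent is a crashed PID 1, the orphan stays a zombie, which is consistent with the pile-up described above.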