Bruno Prémont changed bug 918226
What    Removed    Added
CC                 bruno.premont@restena.lu

Comment # 33 on bug 918226 from
Got it here too on a server (virtual, under VMware) which is executing a rather
large number of nrpe checks (and thus sees a lot of batched forking).

The kernel log shows:
systemd[1]: segfault at 1010514 ip 000000000047912e sp 00007fff9a1c5670 error 4 in systemd[400000+ed000]
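
For readers unfamiliar with this log format, here is a minimal sketch of how the fields can be read. It assumes the usual x86 page-fault error-code layout (bit 0: page present, bit 1: write, bit 2: user mode); the function and struct names are made up for illustration and are not part of systemd or the kernel.

```c
#include <stdbool.h>
#include <stdio.h>

/* Meaning of the low bits of the x86 page-fault error code ("error N"). */
struct fault_info {
    bool page_present;  /* bit 0: 0 = page not mapped, 1 = protection violation */
    bool write_access;  /* bit 1: 0 = read access,     1 = write access */
    bool user_mode;     /* bit 2: 0 = kernel mode,     1 = user mode */
};

/* Hypothetical helper: split the error code into its three low bits. */
struct fault_info decode_fault(unsigned long error_code)
{
    struct fault_info f;
    f.page_present = (error_code & 0x1UL) != 0;
    f.write_access = (error_code & 0x2UL) != 0;
    f.user_mode    = (error_code & 0x4UL) != 0;
    return f;
}

/* Hypothetical helper: offset of the faulting instruction inside the
 * mapping named at the end of the log line ("systemd[400000+ed000]"). */
unsigned long ip_offset(unsigned long ip, unsigned long map_base)
{
    return ip - map_base;
}

/* Print a one-line, human-readable summary of the error code. */
void print_fault(unsigned long error_code)
{
    struct fault_info f = decode_fault(error_code);
    printf("%s-mode %s, %s\n",
           f.user_mode ? "user" : "kernel",
           f.write_access ? "write" : "read",
           f.page_present ? "protection violation" : "page not mapped");
}
```

Called as print_fault(4), this describes "error 4" as a user-mode read of an unmapped address, which fits a bad pointer dereference. ip_offset(0x47912e, 0x400000) gives 0x7912e, which lies inside the 0xed000-byte mapping.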


Started happening with the update from systemd-208-23.3.x86_64 to
systemd-208-28.1.x86_64.

Looking up the instruction pointer with addr2line, using the debuginfo and
debugsource packages, I get:
addr2line -e /usr/lib/systemd/systemd 0x47912e
/usr/src/debug/systemd-208/src/core/unit.c:1682

/usr/src/debug/systemd-208/src/core/unit.c
1677: 
1678: void unit_unwatch_pid(Unit *u, pid_t pid) {
1679:         assert(u);
1680:         assert(pid >= 1);
1681:
1682:         hashmap_remove_value(u->manager->watch_pids, LONG_TO_PTR(pid), u);
1683:         set_remove(u->pids, LONG_TO_PTR(pid));
1684: }
1685:

This seems to match the report from comment #8, with u->manager being NULL.


Once systemd has crashed, nrpe zombies start piling up until the kernel refuses
to create more processes (clone() returns -1 with errno=EAGAIN) because of the
rlimit on the per-user process count.

One possible reason why nrpe triggers this bug more than anything else is that
it forks a few levels deep for each check and seems to have some of its
children reparented to init. nrpe is running as a daemon, not as an xinetd
service.
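
The reparenting mentioned above can be reproduced with a minimal sketch like the following. It is only an illustration of the general mechanism, not nrpe's actual code, and the function name is hypothetical. The new parent is normally PID 1, but it can be another process if a child subreaper is configured.

```c
#define _DEFAULT_SOURCE /* for usleep() under strict -std= modes (glibc) */
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork a "middle" process that forks a grandchild and
 * exits at once. The grandchild waits until it has been reparented and
 * reports its new parent PID through a pipe.
 * Returns the new parent PID, or -1 on error. */
pid_t orphan_new_parent(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (child == 0) {
        /* Middle process: fork the grandchild, then exit immediately. */
        close(fds[0]);
        pid_t middle = getpid();
        pid_t grandchild = fork();
        if (grandchild == 0) {
            /* Grandchild: poll until the middle process is gone. */
            while (getppid() == middle)
                usleep(1000);
            pid_t new_parent = getppid();
            ssize_t n = write(fds[1], &new_parent, sizeof new_parent);
            _exit(n == (ssize_t)sizeof new_parent ? 0 : 1);
        }
        _exit(grandchild < 0 ? 1 : 0);
    }

    /* Original process: reap the middle process, then read the report. */
    close(fds[1]);
    int status;
    while (waitpid(child, &status, 0) < 0 && errno == EINTR)
        ;
    pid_t new_parent = -1;
    ssize_t n;
    do {
        n = read(fds[0], &new_parent, sizeof new_parent);
    } while (n < 0 && errno == EINTR);
    close(fds[0]);
    return n == (ssize_t)sizeof new_parent ? new_parent : -1;
}
```

The caller cannot reap the grandchild with waitpid(), since it is no longer its descendant; that duty falls to the new parent. If that parent is PID 1 and it has stopped reaping, such orphans stay around as zombies, which is consistent with what is observed here.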

