[Bug 1016313] New: Base:System/systemd: journal has stopped working in v232
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313

Bug ID: 1016313
Summary: Base:System/systemd: journal has stopped working in v232
Classification: openSUSE
Product: openSUSE.org
Version: unspecified
Hardware: x86-64
OS: SUSE Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Bugzilla
Assignee: systemd-maintainers@suse.de
Reporter: scalpel4k@gmail.com
QA Contact: novbugzilla-bugs@forge.provo.novell.com
Found By: ---
Blocker: ---

Hi guys,

since I updated to v232 my journal stopped working. Starting the service gives an unspecific error message:

'Job for systemd-journald.service failed. ....'

I have now tinkered with the unit file and found that the following line is causing trouble.

SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io

I don't know how to further investigate this, so please let me know what I can do to send in more details.

bye
Michi

--
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c1
Franck Bui
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c2
--- Comment #2 from Michael Woski
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c3
--- Comment #3 from Michael Woski
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c4
--- Comment #4 from Franck Bui
> thanks for your reply. Unfortunately, having a crashing journal service doesn't give you too much insight, even in debug mode ;-)

yeah... I won the dumb question of the week ;)

Currently I'm puzzled as it works fine here... Anything relevant in dmesg ?
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c5
--- Comment #5 from Michael Woski
> yeah... I won the dumb question of the week ;)

I won't tell anyone ...

Anyway, dmesg is of no help, there's nothing interesting :-(
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c6
--- Comment #6 from Franck Bui
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c7
--- Comment #7 from Michael Woski
> Ok then if you agree I'll prepare a testing package tomorrow that will make timesyncd sleep for 20 secs or so before starting, so you'll have enough time to attach to the process with gdb and locate the place that calls a filtered syscall.

I could do it on my own by just wrapping it all in a script with a sleep call, right?
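The wrapper idea could look roughly like this. It is a minimal sketch, not the testing package discussed in the thread: `delayed_exec` is a hypothetical name, the 30 second delay in the example is arbitrary, and whether a shell wrapper behaves the same as a delay built into the binary is an assumption. The one thing it relies on is that `exec` replaces the shell without changing the PID.

```shell
# Hypothetical wrapper: wait, then replace this shell with the real program.
# exec keeps the same PID, so the process visible during the sleep is the
# one that later runs the target binary.
delayed_exec() {
    delay="$1"      # seconds to wait before starting the target
    shift
    sleep "$delay"
    exec "$@"
}

# Example (path taken from the thread, delay chosen arbitrarily):
#   delayed_exec 30 /usr/lib/systemd/systemd-timesyncd
```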
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c8
--- Comment #8 from Franck Bui
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c9
--- Comment #9 from Michael Woski
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c10
--- Comment #10 from Franck Bui
> sorry for the long delay. Meanwhile I was able to test your packages, but the whole thing didn't work as intended, I guess. systemd seems to kill the process straightaway, so there's no systemd-timesyncd process to attach to.

Hmm that's strange, IIRC I gave the testing package a try and it worked as
expected.
It's supposed to wait for 30 sec before starting its execution.
Did you try to start systemd-timesyncd manually ? You should make sure that the
service is disabled at boot and start it manually.
Also, WatchdogSec=3min is set for this service, so you won't have enough time to
debug before systemd notices that the service is stuck. As a consequence systemd
will kill it. To disable this:
$ mkdir /etc/systemd/system/systemd-timesyncd.service.d
$ cat >/etc/systemd/system/systemd-timesyncd.service.d/no-watchdog.conf<
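The rest of the heredoc did not survive in this archive. As a stand-in, here is a minimal sketch of a drop-in that turns the watchdog off. It assumes `WatchdogSec=0` (systemd documents 0 as disabling the watchdog) and a drop-in directory named after the full unit name, `systemd-timesyncd.service.d`. The function name `write_no_watchdog_dropin` and its optional second argument are hypothetical, and this is not necessarily what the original comment contained.

```shell
# Hypothetical helper: write a drop-in that disables the watchdog for a unit.
# Assumptions (not from the thread): WatchdogSec=0 turns the watchdog off,
# and drop-ins live in "<full unit name>.d" below the given root directory.
write_no_watchdog_dropin() {
    unit="$1"                           # e.g. systemd-timesyncd.service
    root="${2:-/etc/systemd/system}"    # override to try it in a scratch dir
    dir="$root/$unit.d"
    mkdir -p "$dir" || return 1
    printf '[Service]\nWatchdogSec=0\n' > "$dir/no-watchdog.conf"
}

# Example (as root), followed by a reload so systemd picks the file up:
#   write_no_watchdog_dropin systemd-timesyncd.service
#   systemctl daemon-reload
```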
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c11
--- Comment #11 from Michael Woski
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c12
--- Comment #12 from Franck Bui
(In reply to Franck Bui from comment #10)
> I think I did everything correctly regarding the service setup. As I said, it looks like timesyncd doesn't get started in the first place.

Hmm according to the log you provided in comment #3:

  Dec 20 17:23:45 linux01.mic.e-concepts.de systemd[31307]: systemd-timesyncd.service: Executing: /usr/lib/systemd/systemd-timesyncd

which indicates that systemd has at least executed the binary. But it looks like the (new) process fails very early.

Could you try to create a dumb service by duplicating the systemd-timesyncd one and then try:

 1/ to replace:
      ExecStart=/usr/lib/systemd/systemd-timesyncd
    with
      ExecStart=/usr/lib/systemd/sleep 10000
    You'll need to copy /usr/bin/sleep to /usr/lib/systemd/

 2/ if 1/ doesn't exhibit the problem, restore
      ExecStart=/usr/lib/systemd/systemd-timesyncd

 3/ make the service as simple as possible by removing all fields in the [Service] section that are not needed to reproduce the issue.

 4/ show the output of "strace -c <content of ExecStart>"

Also it might be interesting to see what happens if you add "SystemCallErrorNumber=EPERM", I mean whether the debug log shows something more interesting.

> I have openSUSE with systemd 232 running in a few virtual machines without that particular issue.

BTW could you describe your affected setup ? is it a virtual machine ? which arch ?
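The stripped-down test service described in steps 1/ to 3/ could be generated along these lines. This is a minimal sketch: `minimal_test_unit` is a hypothetical helper, the Description text is made up, and the default filter `~@mount` is an assumption, picked because it is one of the groups in the SystemCallFilter line from the original report.

```shell
# Hypothetical helper: print a minimal unit for reproducing the failure.
# $1 is the ExecStart command line, $2 (optional) the SystemCallFilter value.
minimal_test_unit() {
    exec_start="$1"
    filter='~@mount'            # assumed default: deny only the @mount group
    if [ -n "$2" ]; then
        filter="$2"
    fi
    cat <<EOF
[Unit]
Description=Minimal seccomp reproduction (hypothetical test unit)

[Service]
ExecStart=$exec_start
SystemCallFilter=$filter
EOF
}
```

For example, `minimal_test_unit "/usr/lib/systemd/sleep 10000" > /etc/systemd/system/seccomp-test.service` followed by `systemctl daemon-reload` would install it; `seccomp-test.service` is a made-up unit name.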
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c13
--- Comment #13 from Michael Woski
> 1/ to replace:
>      ExecStart=/usr/lib/systemd/systemd-timesyncd
>    with
>      ExecStart=/usr/lib/systemd/sleep 10000
>    You'll need to copy /usr/bin/sleep to /usr/lib/systemd/

works

> 3/ make the service as simple as possible by removing all fields in [Service] section that are not needed to reproduce the issue.

well, it's the @mount option in the SystemCallFilter. Removing all other properties does not help as long as that @mount option is set

> 4/ show the output of "strace -c <content of ExecStart>"

see attached files

> Also it might be interesting to see what happens if you add "SystemCallErrorNumber=EPERM", I meant if the debug log show something more interesting.

it's working, but I would have expected that.

> BTW could you describe your affected setup ? is it a virtual machine ? which arch ?

x86_64, "real" machine, openSUSE Factory
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c14
--- Comment #14 from Michael Woski
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c15
--- Comment #15 from Michael Woski
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c16
--- Comment #16 from Michael Woski
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c17
--- Comment #17 from Franck Bui
> Created attachment 708582 [details]
> strace

The (s)trace here is uninteresting:

  execve("/usr/bin/systemctl", ["systemctl", "start", "systemd-timesyncd.service"], [/* 116 vars */]) = 0

you did strace systemctl... the interesting process in your case is systemd-timesyncd which is started by systemd (PID1).
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c18
--- Comment #18 from Franck Bui
> funny, I can't find where I can upload attachments :-P
>
> Job for systemd-timesyncd.service failed because a fatal signal was delivered to the control process. See "systemctl status systemd-timesyncd.service" and "journalctl -xe" for details.
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>   0.00    0.000000           0        23           read

How did you get this output exactly ? Did you issue "strace -c /usr/lib/systemd/systemd-timesyncd" to get this summary ? Or did you issue "systemctl status systemd-timesyncd" ? Or something else ?

Please next time show the command line you used.

> [...]
>   0.00    0.000000           0         1           execve
>   0.00    0.000000           0         2           kill
>   0.00    0.000000           0         1           getrlimit
>   0.00    0.000000           0         3           geteuid
>   0.00    0.000000           0         2         1 statfs
>   0.00    0.000000           0         1           arch_prctl
>   0.00    0.000000           0         1         1 mount

Here's the mount syscall which triggers the killing signal.

Could you remove the "-c" option passed to strace and redo the same test ?

Thanks.
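To spot failing calls such as this mount in a longer summary, the errors column can be filtered. A minimal sketch, assuming the six-column `strace -c` layout quoted in this thread, where a row has a sixth field only when its errors column is filled in; `failed_syscalls` is a hypothetical name. It only lists calls that returned an error, which narrows the search but does not by itself prove which call hit the seccomp filter.

```shell
# Hypothetical helper: read an "strace -c" summary on stdin and print
# "<syscall> <error count>" for every row whose errors column is non-empty.
failed_syscalls() {
    # Data rows start with a numeric "% time" value. Rows without errors
    # have 5 fields, rows with errors have 6. The "total" row is skipped.
    awk 'NF == 6 && $1 ~ /^[0-9.]+$/ && $6 != "total" { print $6, $5 }'
}

# Example (strace writes its summary to stderr, hence the redirect):
#   strace -c /usr/lib/systemd/systemd-timesyncd 2>&1 | failed_syscalls
```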
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c19
--- Comment #19 from Michael Woski
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c20
--- Comment #20 from Franck Bui
> For clarity, this is what I did: strace -c systemctl start systemd-timesyncd

That's not what I was asking for in comment #12. I was asking for the output of: "strace -c /usr/lib/systemd/systemd-timesyncd"
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c21
--- Comment #21 from Michael Woski
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c22
--- Comment #22 from Franck Bui
> Created attachment 708603 [details]
> strace -c /usr/lib/systemd/systemd-timesyncd

Thanks. Please now show the output of "strace /usr/lib/systemd/systemd-timesyncd"
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c23
--- Comment #23 from Michael Woski
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c24
--- Comment #24 from Franck Bui
> execve("/usr/lib/systemd/systemd-timesyncd", ["/usr/lib/systemd/systemd-timesyn"...], [/* 116 vars */]) = 0
> ...
> statfs("/sys/fs/selinux", 0x7ffec2cac480) = -1 ENOENT (No such file or directory)
> statfs("/selinux", {f_type=BTRFS_SUPER_MAGIC, f_bsize=4096, f_blocks=31204254, f_bfree=29339042, f_bavail=29153112, f_files=0, f_ffree=0, f_fsid={val=[2664197864, 450012966]}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_NOATIME}) = 0
> mount("proc", "/proc", "proc", 0, NULL) = -1 EBUSY (Device or resource busy)

This call to mount(2) is done when ld is loading and initializing the shared libs used by systemd-timesyncd. libselinux is one of them and is the one which does the mount(2) syscall.

From libselinux changelog:

$ rpm -q libselinux1
libselinux1-2.5-2.5.x86_64
$ rpm -q --changelog libselinux1
[...]
* Sun Jul 24 2016 crrodriguez@opensuse.org
- Avoid mounting /proc outside of selinux_init_load_policy().
  (Stephen Smalley) reverts upstream 5a8d8c4, 9df4988, fixes
  among other things systemd seccomp sandboxing otherwise all
  filters must allow mount(2)
  (libselinux-proc-mount-only-if-needed.patch)
[...]

So it seems that this issue has been addressed already...

Can you check the version of this lib installed on your system and make sure that the version you're using includes this fix ?
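That check can be scripted. A minimal sketch, assuming the fix can be recognised by the patch name that appears in the changelog entry; `has_proc_mount_fix` is a hypothetical helper that reads a changelog on stdin and succeeds when the patch is listed.

```shell
# Hypothetical helper: succeed if the changelog read from stdin mentions
# the patch named in the libselinux changelog entry quoted in this thread.
has_proc_mount_fix() {
    grep -qF 'libselinux-proc-mount-only-if-needed.patch'
}

# Example:
#   rpm -q libselinux1
#   rpm -q --changelog libselinux1 | has_proc_mount_fix && echo "fix present"
```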
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c25
Michael Woski
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c26
Rainer Klier
(In reply to Franck Bui from comment #24)
> that was the reason indeed. I had an rpm with version 2.5-1.1 shadowing the latest version from Factory. Sorry for the trouble and many thanks for your help and patience.

this bug also affects me. i also have libselinux1-2.5-1.1.x86_64 installed. i am currently updating to verify that it fixes the problem for me. thanks.

i think that systemd 232 should have a dependency on libselinux1-2.5-2.5, and should not be installed without libselinux1-2.5-2.5.
http://bugzilla.opensuse.org/show_bug.cgi?id=1016313#c27
--- Comment #27 from Rainer Klier
(In reply to Michael Woski from comment #25)
(In reply to Franck Bui from comment #24)
> this bug also affects me. i also have libselinux1-2.5-1.1.x86_64 installed. i am currently updating to verify that it fixes the problem for me.

it is fixed with updating libselinux1. :-D