[Bug 915575] New: Boot fails on systemd-journald receiving SIGTERM
http://bugzilla.suse.com/show_bug.cgi?id=915575

Bug ID: 915575
Summary: Boot fails on systemd-journald receiving SIGTERM
Classification: openSUSE
Product: openSUSE Factory
Version: 201501*
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Basesystem
Assignee: systemd-maintainers@suse.de
Reporter: tchvatal@suse.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Created attachment 621475
--> http://bugzilla.suse.com/attachment.cgi?id=621475&action=edit
Screenshot of the boot with sigterm

Booting current kernels (3.18+) fails with systemd-journald getting SIGTERM from PID 1. This happens only on SSD drives; when I migrate the installation 1:1 to a regular HDD, it boots fine. I tested two SSDs, one from Kingston and one from Intel, and both show the problem.

Further testing revealed that when systemd-journald (systemd-logger) is replaced with rsyslog, the machine boots just fine, so something goes amiss in the interaction between SSD, systemd, and systemd-journald.

I would love to provide more debug info, but I did not manage to capture anything beyond this screenshot.

--
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.suse.com/show_bug.cgi?id=915575

Thomas Blume <thomas.blume@suse.com> changed:

What  |Removed |Added
----------------------------------------------------------------------------
CC    |        |tchvatal@suse.com, thomas.blume@suse.com
Flags |        |needinfo?(tchvatal@suse.com)

--- Comment #1 from Thomas Blume <thomas.blume@suse.com> ---
Can you try debug logging to a serial console, e.g. by adding

systemd.log_level=debug systemd.log_target=console console=ttyS0,38400

as boot parameters?
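Before capturing logs, it can help to confirm that the requested parameters actually reached the kernel command line. The following is a minimal sketch and not part of the original report: the helper name `missing_boot_params` is hypothetical, and reading `/proc/cmdline` assumes a running Linux system.

```python
# Parameters requested in comment #1.
REQUESTED = [
    "systemd.log_level=debug",
    "systemd.log_target=console",
    "console=ttyS0,38400",
]


def missing_boot_params(cmdline, requested=REQUESTED):
    """Return the requested parameters that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [param for param in requested if param not in present]


if __name__ == "__main__":
    # Assumption: /proc/cmdline is readable, as on a normal Linux system.
    try:
        with open("/proc/cmdline") as fh:
            print("missing:", missing_boot_params(fh.read()))
    except OSError as exc:
        print("could not read /proc/cmdline:", exc)
```

An empty list means all three parameters are active for the current boot.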
http://bugzilla.suse.com/show_bug.cgi?id=915575

Martin Pluskal <mpluskal@suse.com> changed:

What  |Removed |Added
----------------------------------------------------------------------------
CC    |        |mpluskal@suse.com
http://bugzilla.suse.com/show_bug.cgi?id=915575

Tomáš Chvátal <tchvatal@suse.com> changed:

What       |Removed                      |Added
----------------------------------------------------------------------------
Status     |NEW                          |RESOLVED
Resolution |---                          |FIXED
Flags      |needinfo?(tchvatal@suse.com) |

--- Comment #2 from Tomáš Chvátal <tchvatal@suse.com> ---
OK, I did some debugging, then dug around the web and found that RH has already fixed this; see sr#283651 for dracut. It fixed the problem for me on all reproducible instances.
http://bugzilla.suse.com/show_bug.cgi?id=915575

Tomáš Chvátal <tchvatal@suse.com> changed:

What       |Removed  |Added
----------------------------------------------------------------------------
Status     |RESOLVED |REOPENED
Resolution |FIXED    |---
Flags      |         |needinfo?(tchvatal@suse.com)

--- Comment #3 from Tomáš Chvátal <tchvatal@suse.com> ---
Reopening: the dracut patch only partly fixes the issue. It is still present with systemd-210. systemd-218 is okay, which is why I could not reproduce it on my own machines, but my work PC still runs our systemd-210 and triggers the problem even with the dracut patch that fixed it for 218.

I will provide the requested logs today or tomorrow.
http://bugzilla.suse.com/show_bug.cgi?id=915575

--- Comment #4 from Dr. Werner Fink <werner@suse.com> ---
(In reply to Tomáš Chvátal from comment #3)

Please give the latest systemd-210 from Base:System:Legacy/systemd a try.
http://bugzilla.suse.com/show_bug.cgi?id=915575

--- Comment #5 from Tomáš Chvátal <tchvatal@suse.com> ---
Created attachment 621582
--> http://bugzilla.suse.com/attachment.cgi?id=621582&action=edit
systemd log directed to terminal
http://bugzilla.suse.com/show_bug.cgi?id=915575

Tomáš Chvátal <tchvatal@suse.com> changed:

What  |Removed                      |Added
----------------------------------------------------------------------------
Flags |needinfo?(tchvatal@suse.com) |

--- Comment #6 from Tomáš Chvátal <tchvatal@suse.com> ---
Created attachment 621583
--> http://bugzilla.suse.com/attachment.cgi?id=621583&action=edit
systemd debug logging
http://bugzilla.suse.com/show_bug.cgi?id=915575

Thomas Blume <thomas.blume@suse.com> changed:

What  |Removed |Added
----------------------------------------------------------------------------
Flags |        |needinfo?(tchvatal@suse.com)

--- Comment #7 from Thomas Blume <thomas.blume@suse.com> ---
(In reply to Tomáš Chvátal from comment #6)
> Created attachment 621583 [details]
> systemd debug logging

Hm, I don't really see a bug. Signals are sent to journald twice. The first one seems to be caused by the switch from the initrd to the real system root:

-->--
[ 5.084442] BTRFS info (device sda3): disk space caching is enabled
[ 5.099132] BTRFS: detected SSD devices, enabling SSD mode
[ 5.293630] systemd-journald[184]: Received SIGTERM from PID 1 (systemd).
[ 5.491167] BTRFS info (device sda3): disk space caching is enabled
--<--

Is sda3 your system root? If so, this SIGTERM is normal. After switch root, the journal is restarted on the real root device.

The other signal I can see is a trigger for a journal flush (see man systemd-journald.service > SIGNALS for reference):

-->--
About to execute: /usr/bin/systemctl kill --kill-who=main --signal=SIGUSR1 systemd-journald.service
Forked /usr/bin/systemctl as 474
systemd-journal-flush.service changed dead -> start
--<--

This indicates that journald had been started correctly before. What does

systemctl status systemd-journald

show on the running system?
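To make the distinction between the two signals easier to check against a full log, here is a minimal sketch that extracts journald signal messages and attaches the explanation given in this comment. It is an illustration only: the function names `journald_signals` and `explain` are hypothetical, and the regular expression assumes the kernel-log line format shown in the excerpt above.

```python
import re

# Matches lines such as:
# [ 5.293630] systemd-journald[184]: Received SIGTERM from PID 1 (systemd).
SIGNAL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+systemd-journald\[(?P<pid>\d+)\]:\s+"
    r"Received (?P<sig>SIG[A-Z0-9]+) from PID (?P<sender>\d+)"
)


def journald_signals(log_text):
    """Return (timestamp, journald_pid, signal, sender_pid) tuples found in log_text."""
    return [
        (float(m["ts"]), int(m["pid"]), m["sig"], int(m["sender"]))
        for m in SIGNAL_RE.finditer(log_text)
    ]


def explain(signal, sender_pid):
    """Map a signal to the interpretation given in this thread."""
    if signal == "SIGTERM" and sender_pid == 1:
        return "expected: systemd stops the initrd journald at switch root"
    if signal == "SIGUSR1":
        return "expected: request to flush the runtime journal to /var"
    return "not covered in this thread; needs a closer look"


# Excerpt quoted in comment #7.
SAMPLE = """\
[ 5.084442] BTRFS info (device sda3): disk space caching is enabled
[ 5.099132] BTRFS: detected SSD devices, enabling SSD mode
[ 5.293630] systemd-journald[184]: Received SIGTERM from PID 1 (systemd).
[ 5.491167] BTRFS info (device sda3): disk space caching is enabled
"""

if __name__ == "__main__":
    for ts, pid, sig, sender in journald_signals(SAMPLE):
        print(f"{ts:.6f} journald[{pid}] {sig} from PID {sender}: {explain(sig, sender)}")
```

A SIGTERM from any sender other than PID 1, or at a time far from the switch root, would fall into the last branch and deserve separate investigation.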
http://bugzilla.suse.com/show_bug.cgi?id=915575

Tomáš Chvátal <tchvatal@suse.com> changed:

What    |Removed                                          |Added
----------------------------------------------------------------------------
Summary |Boot fails on systemd-journald receiving SIGTERM |Boot fails on kernel-3.18 with systemd-210 and ssd
Flags   |needinfo?(tchvatal@suse.com)                     |

--- Comment #8 from Tomáš Chvátal <tchvatal@suse.com> ---
Yeah, it seems my first assessment that journald is to blame was wrong. Still, some blk problem remains, as it fails to boot with systemd-210 as per the logs.

I updated the bug summary accordingly. I am not sure whether we should let the kernel guys take a look or whether it is solely a systemd bug.

FWIW, the journald status:

dreamcrawler:~ # LC_ALL=C systemctl status systemd-journald
systemd-journald.service - Journal Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static)
   Active: active (running) since Mon 2015-02-02 16:51:08 CET; 2min 17s ago
     Docs: man:systemd-journald.service(8)
           man:journald.conf(5)
 Main PID: 408 (systemd-journal)
   Status: "Processing requests..."
   CGroup: /system.slice/systemd-journald.service
           `-408 /usr/lib/systemd/systemd-journald

Feb 02 16:51:08 dreamcrawler systemd-journal[408]: Runtime journal is using 8.0M (max allowed 390.8M, trying to leave 586.3M free of 3.8G available → current limit 390.8M).
Feb 02 16:51:08 dreamcrawler systemd-journal[408]: Runtime journal is using 8.0M (max allowed 390.8M, trying to leave 586.3M free of 3.8G available → current limit 390.8M).
Feb 02 16:51:08 dreamcrawler systemd-journal[408]: Journal started
Feb 02 16:51:16 dreamcrawler systemd-journal[408]: Permanent journal is using 1.1G (max allowed 3.9G, trying to leave 4.0G free of 31.7G available → current limit 3.9G).
Feb 02 16:51:16 dreamcrawler systemd-journal[408]: Time spent on flushing to /var is 164.148ms for 981 entries.
http://bugzilla.suse.com/show_bug.cgi?id=915575

--- Comment #9 from Thomas Blume <thomas.blume@suse.com> ---
Yeah, there is a problem caused by a device timeout:

-->--
Job dev-disk-by\x2duuid-d256e954\x2df026\x2d417a\x2da6cf\x2d472093b98a5e.device/start timed out.
Job dev-disk-by\x2duuid-d256e954\x2df026\x2d417a\x2da6cf\x2d472093b98a5e.device/start finished, result=timeout
--<--

This is pretty strange, because btrfs can apparently see the device:

-->--
Got [ 4.700082] BTRFS: device fsid d256e954-f026-417a-a6cf-472093b98a5e message type=sigdevid 1 transid 29954 /dev/sda3
--<--

The system seems to go into emergency mode. Can you please also add the following boot parameters:

debug rd.debug udev.log-priority=debug rd.udev.log-priority=debug

and attach /run/initramfs/rdsosreport.txt if it exists?
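The unit name in the timeout message is a systemd-escaped device path, which makes it hard to compare with the fsid that btrfs reports. The sketch below decodes such names and lists the devices whose start jobs timed out. It is an illustration under stated assumptions: the function names are hypothetical, only ASCII `\xNN` escapes are handled, and the log format is assumed to match the excerpt above.

```python
import re

# Either an escaped byte (\xNN) or a plain dash, which stands for "/" in unit names.
_TOKEN_RE = re.compile(r"\\x([0-9a-fA-F]{2})|-")
TIMEOUT_RE = re.compile(r"Job (\S+\.device)/start timed out")


def unit_to_device_path(unit):
    """Decode a systemd .device unit name into the device path it refers to."""
    suffix = ".device"
    name = unit[: -len(suffix)] if unit.endswith(suffix) else unit

    def decode(match):
        # \xNN becomes the literal character; a bare dash becomes a path separator.
        return chr(int(match.group(1), 16)) if match.group(1) else "/"

    return "/" + _TOKEN_RE.sub(decode, name)


def timed_out_devices(log_text):
    """Return the device paths whose start jobs timed out in log_text."""
    return [unit_to_device_path(unit) for unit in TIMEOUT_RE.findall(log_text)]


# Excerpt quoted in comment #9 (raw string keeps the backslashes intact).
SAMPLE_LOG = r"""
Job dev-disk-by\x2duuid-d256e954\x2df026\x2d417a\x2da6cf\x2d472093b98a5e.device/start timed out.
Job dev-disk-by\x2duuid-d256e954\x2df026\x2d417a\x2da6cf\x2d472093b98a5e.device/start finished, result=timeout
"""

if __name__ == "__main__":
    for path in timed_out_devices(SAMPLE_LOG):
        print(path)
```

The decoded path ends in the same UUID as the btrfs fsid in the log, which is what makes the timeout surprising: the filesystem is detected, but the by-uuid device node never appears to systemd. Newer systemd releases ship `systemd-escape`, whose `--unescape --path` mode should give the same result for the name without the `.device` suffix.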
http://bugzilla.suse.com/show_bug.cgi?id=915575

Thomas Blume <thomas.blume@suse.com> changed:

What  |Removed |Added
----------------------------------------------------------------------------
Flags |        |needinfo?(tchvatal@suse.com)
http://bugzilla.suse.com/show_bug.cgi?id=915575

Tomáš Chvátal <tchvatal@suse.com> changed:

What       |Removed                      |Added
----------------------------------------------------------------------------
Status     |REOPENED                     |RESOLVED
Resolution |---                          |FIXED
Flags      |needinfo?(tchvatal@suse.com) |

--- Comment #11 from Tomáš Chvátal <tchvatal@suse.com> ---
As we bumped systemd, this problem has become moot. The system now always boots correctly.