[Bug 871704] New: every second boot fails during mount of a /etc/fstab entry
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c0

Summary: every second boot fails during mount of a /etc/fstab entry
Classification: openSUSE
Product: openSUSE 13.1
Version: Final
Platform: x86-64
OS/Version: openSUSE 13.1
Status: NEW
Severity: Major
Priority: P5 - None
Component: Bootloader
AssignedTo: jsrain@suse.com
ReportedBy: aotto1968@t-online.de
QAContact: jsrain@suse.com
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.152 Safari/537.36

Hello,

1. I upgraded from 12.3 to 13.1 using "zypper dup" (for the first time).
2. The upgrade went well.
3. I have the boot device on an SSD and 2 additional HDs for mass storage:

   systemd[1]: Expecting device
     dev-disk-by\x2did-ata\x2dSamsung_SSD_840_PRO_Series_S12SNEAD207940W\x2dpart1.device...
     dev-disk-by\x2duuid-b499aac0\x2d0d98\x2d4b35\x2dadef\x2d91711c637e2b.device...
     dev-disk-by\x2duuid-7a8d375e\x2d4be0\x2d4f83\x2d978d\x2d450b6eccfd20.device...

4. The SSD holds the root partition -> NO problem.
5. The other two hold the backup partitions, based on LVM.

PROBLEM-1 <<<
6. On roughly every second boot the system hangs, waiting for the mount of one of the LVM volumes.
7. It waits and waits ... and after a while it drops into emergency mode.

PROBLEM-2 <<<
8. BUT I don't get a login prompt in emergency mode.
9. I have to reset the hardware.
10. The only solution is to add "noauto" to both devices (from the end).

======================================================================
# less /etc/fstab
/dev/disk/by-id/ata-Samsung_SSD_840_PRO_Series_S12SNEAD207940W-part1 swap swap defaults 0 0
/dev/disk/by-id/ata-Samsung_SSD_840_PRO_Series_S12SNEAD207940W-part2 / ext4 acl,user_xattr 1 1
debugfs /sys/kernel/debug debugfs noauto 0 0
usbfs /proc/bus/usb usbfs noauto 0 0
UUID=b499aac0-0d98-4b35-adef-91711c637e2b /backup/linux02-1 btrfs noauto,defaults 0 0
UUID=7a8d375e-4be0-4f83-978d-450b6eccfd20 /backup/linux02-2 btrfs noauto,defaults 0 0
/dev/backup-2/windows01 /backup/windows01-2 ext4 noauto,acl,user_xattr 1 2
/dev/backup-1/windows01 /backup/windows01-1 ext4 noauto,acl,user_xattr 1 2
/dev/lxc/lxc_nhi2 /mount/lxc/lxc_nhi2 ext4 noauto,ro,acl,user_xattr 1 2
/dev/lxc/lxc_portal /mount/lxc/lxc_portal ext4 noauto,ro,acl,user_xattr 1 2
=======================================================================

Reproducible: Sometimes

Steps to Reproduce:
1. see details
2.
3.

Actual Results: hanging system

Expected Results: booting system

The problem occurs during setup of "local_fs", which means I don't have any log files to submit.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c1
Andreas Otto
Now I have done additional tests:

linux02:~ # systemctl status /backup/linux02-1
backup-linux02\x2d1.mount - /backup/linux02-1
   Loaded: loaded (/etc/fstab)
   Active: failed (Result: exit-code) since Thu 2014-04-03 21:38:48 CEST; 6min ago
    Where: /backup/linux02-1
     What: /dev/backup-1/linux02
  Process: 1069 ExecMount=/bin/mount /dev/backup-1/linux02 /backup/linux02-1 -t btrfs -o nofail,defaults,noatime,noexec,acl (code=exited, status=32)

Apr 03 21:38:48 linux02 systemd[1]: Mounting /backup/linux02-1...
Apr 03 21:38:48 linux02 systemd[1]: backup-linux02\x2d1.mount mount process exited, code=exited status=32
Apr 03 21:38:48 linux02 systemd[1]: Failed to mount /backup/linux02-1.
Apr 03 21:38:48 linux02 systemd[1]: Unit backup-linux02\x2d1.mount entered failed state.

===============================================================

linux02:~ # systemctl status /backup/linux02-2
backup-linux02\x2d2.mount - /backup/linux02-2
   Loaded: loaded (/etc/fstab)
   Active: failed (Result: exit-code) since Thu 2014-04-03 21:38:48 CEST; 9min ago
    Where: /backup/linux02-2
     What: /dev/backup-2/linux02
  Process: 1060 ExecMount=/bin/mount /dev/backup-2/linux02 /backup/linux02-2 -t btrfs -o nofail,defaults,noatime,noexec,acl (code=exited, status=32)

Apr 03 21:38:48 linux02 systemd[1]: backup-linux02\x2d2.mount mount process exited, code=exited status=32
Apr 03 21:38:48 linux02 systemd[1]: Failed to mount /backup/linux02-2.
Apr 03 21:38:48 linux02 systemd[1]: Unit backup-linux02\x2d2.mount entered failed state.
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c5
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c6
Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c
Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c7
--- Comment #7 from Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c8
Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c9
--- Comment #9 from Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c10
--- Comment #10 from Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c11
--- Comment #11 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c12
--- Comment #12 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c13
--- Comment #13 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c14
--- Comment #14 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c15
--- Comment #15 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c16
--- Comment #16 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c17
--- Comment #17 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c18
--- Comment #18 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c19
--- Comment #19 from Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c20
--- Comment #20 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c21
--- Comment #21 from Andrey Borzenkov
Apr 04 13:26:24 linux02 systemd-udevd[734]: timeout: killing '/sbin/lvm pvscan --cache --activate ay --major 8 --minor 16' [843]
[...] skipped

That explains why the LVs are not started. Could you add "--debug" to the invocation of pvscan in /usr/lib/udev/rules.d/69-dm-lvm-metad.rules and attach the journalctl output the next time you hit the problem? Also check the BIOS settings: if you have a floppy controller, try to deactivate it completely.

(In reply to comment #17)
linux02:~ # systemctl status | grep lvm2-lvmetad.socket
└─6551 grep --color=auto lvm2-lvmetad.socket

Wrong. Please read the manual page for systemctl. It is either "systemctl" (without any options) or "systemctl status lvm2-lvmetad.socket".
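
To tell quickly whether a given boot hit the udev timeout quoted at the top of this comment, the journal can be filtered for that message. The following is only a minimal sketch: the function name count_pvscan_timeouts is made up for illustration, and the search pattern is simply copied from the log line above, so it assumes the message text is the same on the system being checked.

```shell
#!/bin/sh
# Count udev "timeout: killing ... lvm pvscan" events in journal text read
# from stdin. The pattern is taken from the log line quoted in this comment;
# count_pvscan_timeouts is a hypothetical helper name.
count_pvscan_timeouts() {
    # grep -c prints 0 and exits non-zero when nothing matches, so the
    # exit status is neutralised to keep the function usable under "set -e".
    grep -c "timeout: killing '/sbin/lvm pvscan" || true
}

# Usage on the affected machine (current boot only):
#   journalctl -b | count_pvscan_timeouts
```

A result greater than 0 would suggest that at least one pvscan run was killed by udev during that boot, which matches the case where the LVs are not activated.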
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c22
--- Comment #22 from Andreas Otto
# /usr/bin/systemctl disable lvm2-lvmetad.socket
rm '/etc/systemd/system/sockets.target.wants/lvm2-lvmetad.socket'
# ls -al /etc/systemd/system/sockets.target.wants
total 8
drwxr-xr-x 2 root root 4096 Apr 4 18:39 .
drwxr-xr-x 12 root root 4096 Apr 4 14:22 ..
lrwxrwxrwx 1 root root 43 Mar 6 2013 avahi-daemon.socket -> /usr/lib/systemd/system/avahi-daemon.socket
lrwxrwxrwx 1 root root 35 Mar 6 2013 cups.socket -> /usr/lib/systemd/system/cups.socket
lrwxrwxrwx 1 root root 36 Mar 28 21:07 pcscd.socket -> /usr/lib/systemd/system/pcscd.socket
lrwxrwxrwx 1 root root 38 Jun 14 2013 rpcbind.socket -> /usr/lib/systemd/system/rpcbind.socket

2 ###

# cat /usr/lib/udev/rules.d/69-dm-lvm-metad.rules
...
SUBSYSTEM!="block", GOTO="lvm_end"

# Device-mapper devices are processed only on change event or on supported synthesized event.
KERNEL=="dm-[0-9]*", ENV{DM_UDEV_RULES_VSN}!="?*", GOTO="lvm_end"

# Only process devices already marked as a PV - this requires blkid to be called before.
ENV{ID_FS_TYPE}=="LVM2_member|LVM1_member", RUN+="/sbin/lvm pvscan --debug --cache --activate ay --major $major --minor $minor"

LABEL="lvm_end"

after next

3 ###

linux02:~ # systemctl | grep -i lvm
lvm2-lvmetad.service loaded active running LVM2 metadata daemon
lvm2-lvmetad.socket loaded active running LVM2 metadata daemon socket

=> next boot we know more
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c23
--- Comment #23 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c24
--- Comment #24 from Andrey Borzenkov
seems the same stuff as last event:

Try increasing the debug verbosity: -dddddd instead of --debug.
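
The switch from --debug to -dddddd in the pvscan rule can be scripted instead of edited by hand. This is a rough sketch under stated assumptions: the helper name set_pvscan_debug is invented here, the sed patterns assume the RUN+= line looks like the one shown in comment #22, and writing the result to /etc/udev/rules.d relies on udev preferring a same-named file there over the packaged one in /usr/lib/udev/rules.d.

```shell
#!/bin/sh
# Print the rules file "$1" with the debug flag of the pvscan call set to "$2".
# set_pvscan_debug is a hypothetical helper; it only writes to stdout and
# never modifies the input file.
set_pvscan_debug() {
    # First drop an existing "--debug", then insert the requested flag
    # directly after "lvm pvscan".
    sed -e 's|lvm pvscan --debug |lvm pvscan |' \
        -e "s|lvm pvscan |lvm pvscan $2 |" "$1"
}

# Usage (as root), leaving the packaged rules file untouched:
#   set_pvscan_debug /usr/lib/udev/rules.d/69-dm-lvm-metad.rules -dddddd \
#       > /etc/udev/rules.d/69-dm-lvm-metad.rules
```

Running it twice on its own output would add the flag twice, so it is meant to be applied to the original rules file only. Removing the copy in /etc/udev/rules.d restores the packaged behaviour.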
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c25
--- Comment #25 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c26
--- Comment #26 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c27
--- Comment #27 from Andrey Borzenkov
=> the debug itself does not bring anything ... the process stops before anything interesting happens

Debug output goes to syslog. According to strace, there should be quite a lot of it.
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c28
--- Comment #28 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c29
Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c30
--- Comment #30 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c31
--- Comment #31 from Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c32
--- Comment #32 from Andrey Borzenkov
Andrey, any objections?
This is a bug in transaction token handling (fixed a couple of months ago) which is triggered by multiple concurrent scans. I will prepare a test package today (or you can do it :p if you like).

https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=1769eddde7dcdd16716d64...

I still believe the switch to lvmetad was not appropriate for a maintenance update. It has far too many problems.
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c33
--- Comment #33 from Andrey Borzenkov
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c34
--- Comment #34 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c35
--- Comment #35 from Michael Bednowicz
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c36
--- Comment #36 from Andreas Otto
cat /etc/fstab
/dev/disk/by-id/ata-Samsung_SSD_840_PRO_Series_S12SNEAD207940W-part1 swap swap defaults 0 0
/dev/disk/by-id/ata-Samsung_SSD_840_PRO_Series_S12SNEAD207940W-part2 / ext4 discard,defaults,noatime,acl,user_xattr 1 1
debugfs /sys/kernel/debug debugfs noauto 0 0
usbfs /proc/bus/usb usbfs noauto 0 0
/dev/backup-1/linux02 /backup/linux02-1 btrfs nofail,defaults,noatime,noexec 0 0
/dev/backup-1/nhi2 /backup/nhi2-1 btrfs nofail,defaults,noatime,noexec 0 0
/dev/backup-1/portal /backup/portal-1 btrfs nofail,defaults,noatime,noexec 0 0
/dev/backup-1/windows01 /backup/windows01-1 ext4 nofail,defaults,noatime,noexec,acl 1 2
/dev/backup-2/linux02 /backup/linux02-2 btrfs nofail,defaults,noatime,noexec 0 0
/dev/backup-2/windows01 /backup/windows01-2 ext4 nofail,defaults,noatime,noexec,acl 1 2
/dev/lxc/lxc_nhi2 /mount/lxc/lxc_nhi2 ext4 nofail,noauto,defaults,noatime,noexec,acl,user_xattr,ro 1 2
/dev/lxc/lxc_portal /mount/lxc/lxc_portal ext4 nofail,noauto,defaults,noatime,noexec,acl,user_xattr,ro 1 2
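
Since the workarounds discussed in this bug come down to marking the LVM mounts "noauto" or "nofail", it can help to list which fstab entries carry neither option. The sketch below is an illustration only: the function name boot_blocking_mounts is invented, and it assumes the usual whitespace-separated fstab layout with the options in the fourth field. It makes no attempt to tell local from network filesystems.

```shell
#!/bin/sh
# Print "mountpoint options" for every entry of the fstab-style file "$1"
# that has neither "noauto" nor "nofail" in its option list.
# boot_blocking_mounts is a hypothetical helper name.
boot_blocking_mounts() {
    awk '
        /^[ \t]*(#|$)/ { next }   # skip comments and blank lines
        {
            n = split($4, opts, ",")
            skip = 0
            for (i = 1; i <= n; i++)
                if (opts[i] == "noauto" || opts[i] == "nofail")
                    skip = 1
            if (!skip)
                print $2, $4
        }
    ' "$1"
}

# Usage:
#   boot_blocking_mounts /etc/fstab
```

For the fstab above this would be expected to print only the swap and root entries, i.e. none of the LVM backup volumes would remain among the entries that a failed mount can turn into a hanging boot.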
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c37
--- Comment #37 from Andrey Borzenkov
some mounts show up as ... "/dev/mapper..." others as "/dev/dm..."

That does not matter. They are all aliases for the same device. I do not know whether it is systemd, mount or df that resolves them, but I would ignore it at this point. Some more reboots to verify would be good :)
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c38
--- Comment #38 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c39
--- Comment #39 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c40
--- Comment #40 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c41
--- Comment #41 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c42
--- Comment #42 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c43
--- Comment #43 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c44
--- Comment #44 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c45
--- Comment #45 from Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c46
--- Comment #46 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c47
--- Comment #47 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c48
--- Comment #48 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c49
--- Comment #49 from Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c50
--- Comment #50 from Andrey Borzenkov
So, I believe the problem is related to my LXCs!!

This has drifted way off-topic here. Your original system was clearly a physical system. If you have a problem running anything in LXC, please open a separate bug report, clearly describing your problem and environment. And please answer: is the issue you originally reported in this bug fixed with my package on the system where it was originally encountered?

Apr 06 20:09:50 linux02 kernel: DMI: Intel Corporation S1200BTL/S1200BTL, BIOS S1200BT.86B.02.00.0035.030220120927 03/02/2012

Do not forget to check "This comment provides requested information". This bug is still in the status NEEDINFO from you.
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c51
Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c52
--- Comment #52 from Andreas Otto
Can you check whether:
/sys/fs/cgroup/systemd/system.slice/lvm2-lvmetad.service
is accessible from within the linux container?

from the container:
# ls -al /sys/fs/cgroup/systemd/system.slice/lvm2-lvmetad.service
total 0
drwxr-xr-x 2 root root 0 Apr 17 05:24 .
drwxr-xr-x 80 root root 0 Apr 17 05:39 ..
-rw-r--r-- 1 root root 0 Apr 17 05:24 cgroup.clone_children
--w--w--w- 1 root root 0 Apr 17 05:24 cgroup.event_control
-rw-r--r-- 1 root root 0 Apr 17 05:24 cgroup.procs
-rw-r--r-- 1 root root 0 Apr 17 05:24 notify_on_release
-rw-r--r-- 1 root root 0 Apr 17 05:24 tasks

=> as I mentioned before ... under "normal" conditions lxc works quite well.
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c53
Thomas Blume
+ halt Failed to open /dev/initctl: No such device or address Failed to talk to init daemon.
What is the output of:
systemctl status systemd-initctl.socket
?
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c54
--- Comment #54 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c55
--- Comment #55 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c56
--- Comment #56 from Andreas Otto
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c57
--- Comment #57 from Andrey Borzenkov
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c58
--- Comment #58 from Andrey Borzenkov
Apr 19 09:27:29 linux-qqqf systemd-udevd[247]: '/sbin/lvm pvscan --cache --activate ay --major 8 --minor 33' [870] terminated by signal 6 (Aborted)
That's an unrelated bug that occurs when there is no VG on the PV. I opened https://bugzilla.novell.com/show_bug.cgi?id=874396 for it. With this fixed I unfortunately cannot reproduce the problem again.
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c
Andrey Borzenkov
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c59
--- Comment #59 from Andrey Borzenkov
today the mounts are missing ... the old bug is back ...

Apr 18 18:02:58 linux02 systemd-udevd[756]: timeout: killing '/sbin/lvm pvscan --cache --activate ay --major 8 --minor 16' [841]
Apr 18 18:02:58 linux02 systemd-udevd[756]: '/sbin/lvm pvscan --cache --activate ay --major 8 --minor 16' [841] terminated by signal 9 (Killed)
Apr 18 18:02:58 linux02 systemd-udevd[751]: timeout: killing '/sbin/lvm pvscan --cache --activate ay --major 8 --minor 32' [840]
Apr 18 18:02:58 linux02 systemd-udevd[751]: '/sbin/lvm pvscan --cache --activate ay --major 8 --minor 32' [840] terminated by signal 9 (Killed)

So we are back at square one. What we are facing here is an apparent lvmetad deadlock. There are two upstream patches that fix a possible deadlock case there. I added them to my package and updated it. You could test it.

What would be really helpful: before updating, try to trigger the problem and provide an lvmetad backtrace. This would show where it is stuck. Find out the lvmetad PID (e.g. using "pgrep -l lvmetad") and then use gdb to obtain a backtrace with "thread apply all backtrace", like

bor@opensuse:~> pgrep -l lvmetad
557 lvmetad
bor@opensuse:~> sudo gdb -p 557
root's password:
GNU gdb (GDB; openSUSE 13.1) 7.6.50.20130731-cvs
Copyright (C) 2013 Free Software Foundation, Inc.
..
(gdb) thread apply all backtrace

Thread 1 (Thread 0x7f23a2ff3800 (LWP 557)):
#0 0x00007f23a20b9913 in select () at ../sysdeps/unix/syscall-template.S:81
#1 0x0000000000406332 in daemon_start ()
#2 0x0000000000403112 in main ()
(gdb)

Note that gdb will most likely warn you that debugging information is missing and will suggest zypper commands to install it. You will need to have the debug repos configured for that. Please install all requested debugging packages before generating the backtrace.
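
The interactive gdb session above can also be run non-interactively, which may be easier on a machine that is in a half-booted state. This is only a sketch under assumptions: the function name dump_lvmetad_backtrace is made up, it assumes a single lvmetad process, and it relies on gdb's -batch and -ex options to run the same "thread apply all backtrace" command and then exit.

```shell
#!/bin/sh
# Dump a backtrace of all lvmetad threads without an interactive gdb session.
# dump_lvmetad_backtrace is a hypothetical helper; run it as root.
dump_lvmetad_backtrace() {
    # Take the first matching PID; -x matches the process name exactly.
    pid=$(pgrep -x lvmetad | head -n 1)
    if [ -z "$pid" ]; then
        echo "lvmetad is not running" >&2
        return 1
    fi
    # -batch makes gdb exit after executing the commands given with -ex.
    gdb -p "$pid" -batch -ex "thread apply all backtrace"
}

# Usage (as root), saving the result for attaching to the bug:
#   dump_lvmetad_backtrace > /tmp/lvmetad-backtrace.txt 2>&1
```

The note about missing debugging information applies here as well: without the debuginfo packages the backtrace will mostly show bare addresses.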
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c60
Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c61
Thomas Blume
I don't get an emergency shell to fix or investigate.
Can you try with the newer systemd version 210 from:
http://download.opensuse.org/repositories/Base:/System/openSUSE_Factory/
This has several fixes that should provide the emergency shell.
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c62
--- Comment #62 from Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c63
--- Comment #63 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c64
--- Comment #64 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c65
--- Comment #65 from Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c66
--- Comment #66 from Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c67
--- Comment #67 from Andrey Borzenkov
I understand the nofail workaround, but is there a fix on its way?
Well ... Fix (or at least workaround that makes it disappear under most common conditions) is available since 30 November '13. Whether it will be released as update for 13.1 is up to systemd maintainers.
Test your packages means executing the 3 zypper commands from comment #33 right?
Yes. Although I'm not sure whether there was any LVM update since then; it may be based on a slightly outdated version.
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c68
--- Comment #68 from Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c69
Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c70
--- Comment #70 from Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c71
Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c72
--- Comment #72 from Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c73
Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c74
--- Comment #74 from Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c75
--- Comment #75 from Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c76
--- Comment #76 from Joachim Banzhaf
Are you sure that the initrd matches the kernel you are booting?

Yes, pretty sure. I trust that the openSUSE installation does the right thing there. As I wrote in the forum, I installed another openSUSE 13.1 instance, even with current network updates. I used the same /boot (/dev/sda1, ext3, 500 MB, not formatting) and a new LV for root in the same VG as the original system. This method has worked on earlier occasions on other servers (but I had not tried it with 13.1 before). The menuentry kernel and initrd in /boot/grub2/grub.conf are both 3.11.10-17-desktop. They have the timestamp from the install.

Before you build a new initrd, please double check that /boot is mounted. Otherwise you will boot the new kernel with an old initrd. This will cause a load failure of most modules due to kernel version mismatch.

I do not know how to run mkinitrd on that defective system myself. I tried from the 13.1 DVD rescue system, but it says "getopts: usage: getopts optstring name [arg]" no matter what parameters I call it with. I have run mkinitrd on other systems before to integrate new boot filesystem modules, but only from the "native", more or less healthy OS.

If this doesn't help, could you attach a serial console log from the boot? In addition, please attach the initrd you were using.

Difficult. I have a USB-to-serial adapter but nowhere to attach it. But I'll try to get the initrd off the server.
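
The "/boot is mounted" check requested in this comment can be done mechanically before rebuilding the initrd. The following is a small sketch, not a tested procedure for this machine: the function name is_mount_point_in is invented, it only looks at the second field of a /proc/mounts-style file, and the usage line assumes that calling mkinitrd without arguments does the right thing on the installed system.

```shell
#!/bin/sh
# Succeed if directory "$2" is listed as a mount point in the
# /proc/mounts-style file "$1". is_mount_point_in is a hypothetical helper.
is_mount_point_in() {
    # Exit status 0 when a line with a matching second field was seen,
    # 1 otherwise.
    awk -v dir="$2" '$2 == dir { found = 1 } END { exit !found }' "$1"
}

# Usage on the installed system (as root): rebuild only if /boot is mounted.
#   is_mount_point_in /proc/mounts /boot && mkinitrd
```

Where util-linux is available, "mountpoint -q /boot" is a shorter way to perform the same check.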
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c77
--- Comment #77 from Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c78
--- Comment #78 from Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c79
--- Comment #79 from Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c80
--- Comment #80 from Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c81
--- Comment #81 from Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c82
--- Comment #82 from Thomas Blume
https://bugzilla.novell.com/show_bug.cgi?id=871704
https://bugzilla.novell.com/show_bug.cgi?id=871704#c83
--- Comment #83 from Joachim Banzhaf
http://bugzilla.novell.com/show_bug.cgi?id=871704
--- Comment #109 from Liuhua Wang
(In reply to Liuhua Wang from comment #105)
Can we close this and open a new bug duplicated to bsc#905690?
You are not authorized to access bug #905690.
Sorry, I just realized that you cannot access the L3 bugs from customers; this is to respect the privacy of our customers. Sorry for the inconvenience. Would you please open a new bug for your issues?
http://bugzilla.novell.com/show_bug.cgi?id=871704
Liuhua Wang