[Bug 679671] New: 11.4 At shutdown or reboot reiserfs is not unmounted - causing replay of journal at startup
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c0

Summary: 11.4 At shutdown or reboot reiserfs is not unmounted - causing replay of journal at startup
Classification: openSUSE
Product: openSUSE 11.4
Version: Factory
Platform: x86-64
OS/Version: SuSE Other
Status: NEW
Severity: Major
Priority: P5 - None
Component: Basesystem
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: 459874983@gmx.net
QAContact: qa@suse.de
Found By: ---
Blocker: ---

User-Agent: Opera/9.80 (X11; Linux x86_64; U; de) Presto/2.7.62 Version/11.01

Installed 11.4 - plain version (previous version 11.3, replaced root, i.e. NO update).

In 11.4 I mounted a newly formatted local reiserfs disk at /home/juergen/Daten. (In 11.3 the mountpoint was taken by an NFS share, so no statement can be given on whether 11.3 suffers from the same problem; at least it was not present with the NFS mount there.)

Did a fsck on the reiserfs - no problems reported.

Current fstab:

/dev/disk/by-id/ata-INTEL_SSDSA2M080G2GC_CVPO0055031J080BGN-part1 swap swap defaults 0 0
/dev/disk/by-id/ata-INTEL_SSDSA2M080G2GC_CVPO0055031J080BGN-part2 / ext4 acl,user_xattr 1 1
/dev/disk/by-id/ata-INTEL_SSDSA2M080G2GC_CVPO0055031J080BGN-part3 /home ext4 defaults 1 2
proc /proc proc defaults 0 0
sysfs /sys sysfs noauto 0 0
debugfs /sys/kernel/debug debugfs noauto 0 0
usbfs /proc/bus/usb usbfs noauto 0 0
devpts /dev/pts devpts mode=0620,gid=5 0 0
/dev/disk/by-id/ata-SAMSUNG_HD154UI_S1XWJ1CZ502895-part1 /home/juergen/Daten reiserfs rw,sync,relatime 1 2

Right from the beginning I noticed the journal being replayed to sdb1 at startup.

I searched and found a similar-looking bug in 11.2, where the workaround was to change /etc/sysconfig/boot to contain:

RUN_PARALLEL="no"

That helped. Before that I couldn't see the unmount message while shutting down. With that change the message is there (difficult to see anyhow, as the whole process runs so fast).

Reproducible: Always

Steps to Reproduce:
See above

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
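To check which value of RUN_PARALLEL is currently in effect before trying the workaround, a small helper along the following lines might be used. This is only a sketch: the function name get_run_parallel is made up for illustration, and it assumes that /etc/sysconfig/boot is a plain shell-style file of VAR="value" assignments that is safe to source.

```shell
# Hypothetical helper: print the RUN_PARALLEL value found in a
# sysconfig-style file (default: /etc/sysconfig/boot), or "unset".
# The body runs in a subshell so that no variables leak into the caller.
get_run_parallel() (
    file=${1:-/etc/sysconfig/boot}
    # "." searches PATH for names without a slash, so force a path.
    case $file in
        */*) ;;
        *) file=./$file ;;
    esac
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        exit 1
    fi
    RUN_PARALLEL=
    . "$file"
    printf '%s\n' "${RUN_PARALLEL:-unset}"
)
```

Calling get_run_parallel without arguments reads /etc/sysconfig/boot; passing a path as the first argument reads that file instead.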
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c1
--- Comment #1 from Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c2
Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c3
Peter Conrad
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c4
--- Comment #4 from Peter Conrad
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c5
--- Comment #5 from Joachim Banzhaf
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c6
Markus Abt
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c
Markus Abt
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c7
--- Comment #7 from Markus Abt
> * (for some reason) blogd receives SIGSYS

The signal is intentionally sent by the /etc/init.d/rc script:

<snip>
#
# Do never call startpar for single, halt or reboot script
#
case "$RUNLEVEL" in
    S|0|1|6)
        DO_CONFIRM=no
        RUN_PARALLEL=no
        killproc -SYS /sbin/blogd
esac
</snip>

After that, rc starts the scripts for entering the new run level, e.g. /etc/init.d/rc0.d/halt in case of a shutdown. (Note that, at the top of the halt script, RUN_PARALLEL is re-read from /etc/sysconfig/boot and therefore possibly reset to RUN_PARALLEL=yes.)

The SIGSYS informs blogd that it should stop writing to disk. However, I have noticed that blogd continues to write to /var/log/boot.(o)msg for some (short) time. Typically, the last message in my /var/log/boot.omsg is "Running /etc/init.d/halt.local done". In the case that some shutdown scripts like unmounting the disks are skipped, I can see some further messages in /var/log/boot.omsg which come from startpar.

It looks like this is happening in that particular case:

* rc sends SIGSYS to blogd
* rc calls halt
* halt calls halt.local and does some other stuff
* halt calls startpar
* startpar calls some of the rc0.d/K* scripts
* blogd receives/handles SIGSYS and stops writing to disk
* my assumption is: at this point startpar exits with 141, leaving some of the rc0.d/K* scripts uncalled.

The best solution for me is to set RUN_PARALLEL=no *in the halt script*. I cannot notice any negative effect on the speed for shutting down. Booting the system still uses RUN_PARALLEL=yes.
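As a side note on the exit status 141: shells conventionally report a process killed by signal n as status 128 + n, and on Linux signal 13 is SIGPIPE, so 141 would fit a process that was killed while writing to a pipe whose reader had gone away. That this is what happened to startpar is an assumption, not something this report confirms. A minimal sketch for decoding such a status (the function name describe_status is made up for illustration):

```shell
# Hypothetical helper: interpret an exit status using the shell
# convention that values above 128 mean "killed by signal (status - 128)".
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # kill -l maps an exit status to a signal name where supported.
        name=$(kill -l "$status" 2>/dev/null) || name=unknown
        echo "killed by signal $sig ($name)"
    elif [ "$status" -eq 0 ]; then
        echo "exited normally"
    else
        echo "exited with error status $status"
    fi
}
```

For example, describe_status 141 reports signal 13, and describe_status 0 reports a normal exit.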
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c8
--- Comment #8 from Peter Conrad
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c9
Volker Kuhlmann
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c10
--- Comment #10 from Volker Kuhlmann
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c
zj jia
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c11
Ruediger Oertel
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c12
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c13
--- Comment #13 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c14
Ruediger Oertel
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c15
--- Comment #15 from Markus Abt
> Just submitted a fixed version of sysvinit/sysvinit-tools to factory which should fix both issues: the race on SIGSYS and the missing signal handler in libblogger used by e.g. startpar

Does it make sense to test this on my opensuse 11.4 box (no problem if booting breaks)? If yes, where can I find it? I've tried http://download.opensuse.org/factory/repo/oss/suse/x86_64/, but sysvinit-2.88+-57.1.x86_64.rpm is dated 27-Jul-2011.
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c16
--- Comment #16 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c17
Markus Abt
> I'd like to suggest
> http://download.opensuse.org/repositories/Base:/System/openSUSE_11.4/x86_64/
> this should do the job

After 20 successful halts/reboots, I can confirm that the problem is solved on my opensuse 11.4 x86_64. startpar exits with 0 now, after calling all boot.*(shutdown) scripts. I can also confirm that blogd now stops writing to disk after SIGSYS.

(FYI, blogd seems not to close /var/log/boot.msg after that, but in boot.localfs, mkill kills blogd, therefore umount /var is also successful.)

((In one situation, umount did not happen: when going from a multi-user runlevel to runlevel 1, then runlevel 0 or 6. See bug #709825.))
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c18
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c19
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c20
--- Comment #20 from Markus Abt
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c21
--- Comment #21 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c22
--- Comment #22 from Markus Abt
> IMHO these updates should already be part of your system, shouldn't they? Compare with your comment #17

Yes, that's true. But in the meantime, I have installed four other systems with openSUSE 11.4, and all four had the same issue: at reboot, file systems are checked by fsck, or journals are replayed, respectively. I have to remember to install the supplied update manually directly after setting up the system. Or maybe I don't understand openSUSE's update policy, or have failed to include the correct update repositories.

I guess this may be the case with many systems, with many users not recognizing the bug due to short journal replaying times.
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c23
--- Comment #23 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c24
--- Comment #24 from Markus Abt
> Hmmm ... strange, as I've never seen this problem. OK, I've enabled automatic update within YaST and discovered only Firefox as a problem due to the new upstream version numbering scheme, but this is not caused by SUSE ;)

You mean, you have included the Base:/System/openSUSE_11.4/ repository?

I cannot help with Firefox; a lot of people are amused (more or less).
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c25
--- Comment #25 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c26
Paul Goossens
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c27
--- Comment #27 from Markus Abt
https://bugzilla.novell.com/show_bug.cgi?id=679671
https://bugzilla.novell.com/show_bug.cgi?id=679671#c28
--- Comment #28 from Paul Goossens