[Bug 242047] New: boot.rootfsck sometimes fails to obey /forcefsck
https://bugzilla.novell.com/show_bug.cgi?id=242047

           Summary: boot.rootfsck sometimes fails to obey /forcefsck
           Product: SUSE Linux 10.1
           Version: Final
          Platform: i586
        OS/Version: SuSE Linux 10.1
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Basesystem
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: jimc@math.ucla.edu
         QAContact: qa@suse.de

When the /forcefsck file is present, /etc/init.d/boot.rootfsck and boot.localfs are supposed to use the -f switch when checking filesystems, causing a complete check to be performed. But to do this, boot.rootfsck has to remount the root filesystem readonly. Intermittently the remount fails (and therefore the check does not happen) because something has files open for writing in the root.

It turns out that the culprit is modprobe. /sbin/udevsettle (in boot.udev) waits until no further events need to be dispatched, but the last event(s) may still be in progress (driver being loaded) when it exits. The *real* fix would be in udevsettle, but I've attached a simple patch that retries the remount several times -- up to 10 seconds, though I've seen at most 2 seconds delay needed.

The patch also includes code to recognize a boot parameter of "forcefsck", and for completeness I've also included the corresponding patch for boot.localfs. This is in the nature of a feature improvement, but I've found it to be very convenient in system administration. You can include or toss this part according to policy.

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
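The attached patch itself is not reproduced in this thread. As a rough illustration of the retry idea only, a minimal sketch might look like the following. The function name retry_until_ok and its arguments are hypothetical and are not taken from boot.rootfsck or from the patch.

```shell
#!/bin/sh
# Hypothetical helper (not from the actual patch): run a command repeatedly
# until it succeeds or the allowed number of attempts is used up.
# usage: retry_until_ok MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
retry_until_ok() {
    max_tries=$1
    delay=$2
    shift 2
    try=1
    while :; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        # Give up once the last allowed attempt has failed.
        [ "$try" -ge "$max_tries" ] && return 1
        try=$((try + 1))
        sleep "$delay"
    done
}

# Illustrative call only (not executed here): try the read-only remount
# up to 10 times, one second apart, which matches the 10-second limit
# mentioned in the report.
#   retry_until_ok 10 1 mount -n -o remount,ro / || echo "root is still busy"
```

Keeping the retry logic separate from the mount call means the same helper could wrap any step that fails transiently while a driver is still loading.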
https://bugzilla.novell.com/show_bug.cgi?id=242047

------- Comment #1 from jimc@math.ucla.edu 2007-02-04 14:38 MST -------
Created an attachment (id=117265)
 --> (https://bugzilla.novell.com/attachment.cgi?id=117265&action=view)
Patch to retry remount -ro
https://bugzilla.novell.com/show_bug.cgi?id=242047

chrubis@novell.com changed:

           What    |Removed                                    |Added
----------------------------------------------------------------------------
         AssignedTo|bnc-team-screening@forge.provo.novell.com  |ro@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=242047

ro@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |werner@novell.com

------- Comment #2 from ro@novell.com 2007-03-01 10:28 MST -------
werner: some of this was your code, could you review please?
https://bugzilla.novell.com/show_bug.cgi?id=242047

jimc@math.ucla.edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
      Info Provider|werner@novell.com           |

------- Comment #3 from jimc@math.ucla.edu 2007-03-01 22:22 MST -------
Yes, the diffs in the patch are my code and have not been corrupted in transit. I think Novell/OpenSuSE need to decide which parts to take into the mainline boot scripts. Of course jimc thinks all the changes are useful -- I very much like being able to power up a desktop machine and tell it "forcefsck" on the kernel command line, rather than booting it, creating the /forcefsck file, and then rebooting it.

Of particular interest is using fuser to report which processes prevented remounting after 10 secs. The information is very useful, but fuser is fragile and on a flaky machine it potentially could hang and prevent booting.
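For readers unfamiliar with the idea, the "forcefsck" kernel parameter check could be sketched roughly as below. This is an assumption-laden illustration, not the code from the attached patch: the function names cmdline_has_forcefsck and want_forced_fsck are hypothetical, and the optional arguments exist only so the logic can be exercised without touching the real / or /proc/cmdline.

```shell
#!/bin/sh
# Hypothetical helper: succeed if "forcefsck" appears as a separate,
# space-delimited word in the given kernel command line string.
cmdline_has_forcefsck() {
    case " $1 " in
        *" forcefsck "*) return 0 ;;
    esac
    return 1
}

# Hypothetical helper: succeed if a forced check was requested, either
# through the flag file or through the kernel command line.
# usage: want_forced_fsck [ROOT_DIR] [CMDLINE_FILE]
want_forced_fsck() {
    root_dir=${1:-/}
    cmdline_file=${2:-/proc/cmdline}
    # The flag file takes effect regardless of the command line.
    [ -e "$root_dir/forcefsck" ] && return 0
    # Otherwise look for the boot parameter.
    [ -r "$cmdline_file" ] && cmdline_has_forcefsck "$(cat "$cmdline_file")"
}

# Illustrative use only (not executed here):
#   want_forced_fsck && FSCK_FORCE="-f"
```

Matching on whole words avoids treating an unrelated parameter that merely contains the string (for example a hypothetical "noforcefsck") as a request for a forced check.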
https://bugzilla.novell.com/show_bug.cgi?id=242047

werner@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hare@novell.com
             Status|NEW                         |NEEDINFO
      Info Provider|                            |jimc@math.ucla.edu

------- Comment #4 from werner@novell.com 2007-03-02 03:01 MST -------
Hannes? IMHO the root file system is checked within initrd. In other words, the forced root file system check should be done there, with the help of the keyword "forcefsck" on the kernel's command line. As long as there are processes which access the root file system during boot, like udev, and as long as there exist root file systems which should not be mounted, even read-only, during the file system check, the stuff has IMHO to be handled in initrd.

@James: one problem with fuser from psmisc-22.3 is the stat(2) calls, which may hang on file systems that are stalled, e.g. due to remote file system servers not responding (NFS, iSCSI, ...).
https://bugzilla.novell.com/show_bug.cgi?id=242047

jimc@math.ucla.edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
      Info Provider|jimc@math.ucla.edu          |

------- Comment #5 from jimc@math.ucla.edu 2007-03-06 12:12 MST -------
@Werner: As the author of a lot of the boot scripts you speak from experience here. Starting in SuSE 9.3 or 10.0 fsck appeared in the initrd (and I put my forcefsck hack in as well). But unfortunately the initrd can't know whether the /forcefsck file exists until after it has run fsck and then mounted the root, so the root wouldn't get a full fsck until its timeout or mount count expires, except on my system where I can use my command line arg. Some sysops think it's important to forcefsck on a regular schedule, to avoid a random long delay in booting, which of course would only happen when a customer particularly wants service fast.

Here's a compromise: initrd does a nonforced fsck and mounts the root. Then it checks for /forcefsck [jimc says: also checks for commandline forcefsck arg] and if found, it UNmounts the root, runs fsck -f, and remounts.

However, other sysops either use only builtin drivers in the stock kernel or they compile their own kernel with their favorite drivers hardwired; this is out of security paranoia, for the ones whose postings I've read. Then no initrd would be necessary or desired, and that means /etc/init.d/boot.rootfsck still has to exist (and be sensitive to /forcefsck). It's kind of a fragile kludge to make a rule that when boot.rootfsck runs, the root has to be remountable readonly, but not too much occurs ahead of it, and I don't see any better solution.

Of course the *real* solution for this problem is if /sbin/udevsettle actually worked. It exits when there are no queued hotplug events waiting for udev to execute them; it *should* exit when the last event has been completely finished, not when it is started.

Re. your comment on fuser: on my work systems (125 boxes running SuSE 10.1) we often have the problem of fuser getting stuck when there's a stale NFS mount. So we do a timeout thing: do the fuser step in a subshell in the background in parallel with a "sleep", and whichever finishes first, murder the other. (This is in admin cleanup scripts, not for booting.)
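The "timeout thing" from comment #5 (a background job racing against a sleep) could be sketched along these lines. This is only an illustration under stated assumptions: the function name run_with_timeout is hypothetical, the actual admin scripts are not shown in this thread, and a process blocked in uninterruptible I/O may not die promptly even on SIGKILL, so the sketch offers no guarantee on a truly stalled mount.

```shell
#!/bin/sh
# Hypothetical helper: run COMMAND in the background and kill it if it is
# still running after LIMIT seconds. Returns the command's exit status,
# which is non-zero if the command was killed.
# usage: run_with_timeout LIMIT_SECONDS COMMAND [ARGS...]
run_with_timeout() {
    limit=$1
    shift
    "$@" &
    cmd_pid=$!
    # Watchdog: after LIMIT seconds, kill the command if it still exists.
    # Its output is discarded so it never holds the caller's pipes open.
    ( sleep "$limit"; kill -KILL "$cmd_pid" ) >/dev/null 2>&1 &
    watchdog_pid=$!
    status=0
    wait "$cmd_pid" || status=$?
    # The command finished (or was killed): stop the watchdog subshell.
    # The watchdog's own sleep may linger until it expires; it is harmless.
    kill "$watchdog_pid" 2>/dev/null || true
    wait "$watchdog_pid" 2>/dev/null || true
    return "$status"
}

# Illustrative use only (not executed here): report users of the root
# filesystem, but give up after 10 seconds.
#   run_with_timeout 10 fuser -v -m / || echo "fuser failed or timed out"
```

Whichever side finishes first ends the race: a fast command causes the watchdog to be stopped, and a slow command is killed by the watchdog.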
https://bugzilla.novell.com/show_bug.cgi?id=242047

ro@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
https://bugzilla.novell.com/show_bug.cgi?id=242047

User werner@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=242047#c6

Dr. Werner Fink <werner@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |werner@novell.com
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |DUPLICATE

--- Comment #6 from Dr. Werner Fink <werner@novell.com> 2008-09-02 04:31:42 MDT ---
*** This bug has been marked as a duplicate of bug 379597 ***
https://bugzilla.novell.com/show_bug.cgi?id=379597