[Bug 851722] New: grub2's os-prober reports EMERGENCY corrupt xfs filesystem on extended partition
https://bugzilla.novell.com/show_bug.cgi?id=851722
https://bugzilla.novell.com/show_bug.cgi?id=851722#c0

           Summary: grub2's os-prober reports EMERGENCY corrupt xfs filesystem on extended partition
    Classification: openSUSE
           Product: openSUSE 13.1
           Version: Final
          Platform: x86-64
        OS/Version: openSUSE 13.1
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Bootloader
        AssignedTo: jsrain@suse.com
        ReportedBy: jimc@math.ucla.edu
         QAContact: jsrain@suse.com
          Found By: ---
           Blocker: ---

This is for os-prober-1.61-3.1.2.x86_64 required by grub2-2.00-39.1.3.x86_64.

I upgraded from OpenSuSE-12.3 to 13.1 final. Shortly after reboot, and again 25 minutes later, something unknown execed os-prober which reports in syslog probing all partitions, mostly uneventfully except for syslog reports of negative findings (I have a priority=debug log).

/dev/sda4 is the extended partition on this disc. When 50mounted-tests tries to mount it (without specifying the filesystem type), most filesystem modules are not too verbose when they fail to mount /dev/sda4. However, when mounting is attempted as xfs, it spews out a full call trace and a lot more, plus a report to all logged-in users, presumably at priority = emergency.

I haven't seen this before: maybe os-prober is new in 13.1-final (I didn't see the behavior in 13.1-RC1), or maybe this particular extended partition sets off the bug.

Sooner or later, someone is going to have a partition containing garbage which looks like a real filesystem -- perhaps even a real filesystem that got trashed, which you're trying to do forensics on. If the filesystem module fails to reject a corrupt filesystem, or even one infested with some kind of virus (think Windows), you could have a really bad situation.

What I would like the developers to do: I know I'm not going to get everything I'm asking for here, so I've put the more practical items first.

* Desist with os-prober. It's too dangerous and too noisy.
* Whatever ran it twice some time after reboot should not be doing that kind of thing autonomously. Identify it and kill it.
* At least make its operation configurable. I don't see any relevant unit files that could be disabled.
* Add a sysconfig parameter, probably in /etc/sysconfig/bootloader, which tells grub2-install to run os-prober just once, to detect dual-booting.
* Alter the logic of os-prober to exclude partitions that are already mounted, and those with implausible filesystem types like 0x0f (extended). But you might mount your Windows root partition in Linux; excluding such things should be configurable.
* os-prober should use the "file -s" command and should require it to definitively identify the filesystem type; unknown or implausible types (like swap) should be excluded.
* It should check the filesystem (readonly) and only then should it attempt to mount it (specifying the type) and then try to recognize a root partition of an alien operating system.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=851722
https://bugzilla.novell.com/show_bug.cgi?id=851722#c

Jiri Srain <jsrain@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|jsrain@suse.com             |mchang@suse.com
https://bugzilla.novell.com/show_bug.cgi?id=851722
https://bugzilla.novell.com/show_bug.cgi?id=851722#c1

--- Comment #1 from James Carter <jimc@math.ucla.edu> 2013-12-05 04:58:43 UTC ---
Every machine upgraded to OpenSuSE-13.1 (final) gets one or two XFS corruption reports every time it boots.

I found where to configure os-prober: in /etc/default/grub, set GRUB_DISABLE_OS_PROBER=true. That takes care of it. But I never found what is autonomously rebuilding /boot/grub2/grub.cfg.
https://bugzilla.novell.com/show_bug.cgi?id=851722
https://bugzilla.novell.com/show_bug.cgi?id=851722#c2

Michael Chang <mchang@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #2 from Michael Chang <mchang@suse.com> 2013-12-05 06:23:09 UTC ---
(In reply to comment #0)
> However, when mounting is attempted as xfs, it spews out a full call trace and a lot more, plus a report to all logged-in users, presumably at priority = emergency. I haven't seen this before: maybe os-prober is new in 13.1-final (I didn't see the behavior in 13.1-RC1), or maybe this particular extended partition sets off the bug.
Did you get the error described in this thread?

http://oss.sgi.com/archives/xfs/2013-05/msg00148.html
https://bugzilla.novell.com/show_bug.cgi?id=851722
https://bugzilla.novell.com/show_bug.cgi?id=851722#c3

--- Comment #3 from Michael Chang <mchang@suse.com> 2013-12-05 06:26:48 UTC ---
(In reply to comment #1)
> But I never found what is autonomously rebuilding /boot/grub2/grub.cfg .
In yast2 bootloader there's a checkbox that lets you disable os-prober. Alternatively, if you edit /etc/default/grub by hand, run either "update-bootloader --refresh" or "grub2-mkconfig -o /boot/grub2/grub.cfg" to make your changes effective.
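A minimal sketch of the manual workaround discussed above, assuming a POSIX shell. The helper name `disable_os_prober` is hypothetical and is not part of grub2 or os-prober; it only sets GRUB_DISABLE_OS_PROBER=true in the file it is given, so it can be tried on a copy first. The commented commands at the end are the ones named in this thread for regenerating grub.cfg.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with grub2): force
# GRUB_DISABLE_OS_PROBER=true in the given grub defaults file.
disable_os_prober() {
    conf="$1"
    tmp="$conf.tmp.$$"
    if grep -q '^GRUB_DISABLE_OS_PROBER=' "$conf"; then
        # An assignment exists: rewrite it, then copy the result back
        # so the original file keeps its owner and permissions.
        sed 's/^GRUB_DISABLE_OS_PROBER=.*/GRUB_DISABLE_OS_PROBER=true/' "$conf" > "$tmp" &&
            cat "$tmp" > "$conf"
        rm -f "$tmp"
    else
        # No assignment yet: append one.
        printf '%s\n' 'GRUB_DISABLE_OS_PROBER=true' >> "$conf"
    fi
}

# On a real system (as root), followed by one of the refresh commands:
#   disable_os_prober /etc/default/grub
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```

The function takes the file path as an argument rather than hard-coding /etc/default/grub, so the edit can be checked on a scratch copy before touching the live configuration.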
https://bugzilla.novell.com/show_bug.cgi?id=851722
https://bugzilla.novell.com/show_bug.cgi?id=851722#c4

--- Comment #4 from Michael Chang <mchang@suse.com> 2013-12-05 10:38:39 UTC ---
Created an attachment (id=570368)
 --> (http://bugzilla.novell.com/attachment.cgi?id=570368)
not trying to detect partition without fs-uuid

Return a false fs type result if the partition has no file system UUID on it. This could stop os-prober from probing such a partition with test mounts of various file system types, which is not only a waste of time but could also trigger errors in certain file system modules.
https://bugzilla.novell.com/show_bug.cgi?id=851722
https://bugzilla.novell.com/show_bug.cgi?id=851722#c5

Michael Chang <mchang@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
       InfoProvider|                            |jimc@math.ucla.edu

--- Comment #5 from Michael Chang <mchang@suse.com> 2013-12-05 10:44:07 UTC ---
Hi James,

Could you test the package at this URL?

http://download.opensuse.org/repositories/home:/michael-chang:/12.3:/bnc:/85...

FYI, the commands could be as follows (I haven't verified them):

$ zypper ar --repo http://download.opensuse.org/repositories/home:/michael-chang:/12.3:/bnc:/85...
$ zypper ref
$ zypper dup -r home_michael-chang_12.3_bnc_851722
https://bugzilla.novell.com/show_bug.cgi?id=851722
https://bugzilla.novell.com/show_bug.cgi?id=851722#c6

--- Comment #6 from Michael Chang <mchang@suse.com> 2013-12-06 09:38:06 UTC ---
I have a new fix. It's at the same URL, so please follow the steps above to test it.
https://bugzilla.novell.com/show_bug.cgi?id=851722
https://bugzilla.novell.com/show_bug.cgi?id=851722#c7

--- Comment #7 from Michael Chang <mchang@suse.com> 2013-12-06 09:41:16 UTC ---
Created an attachment (id=570566)
 --> (http://bugzilla.novell.com/attachment.cgi?id=570566)
don't modprobe all file system modules and don't test mount on unknown partitions

* Don't modprobe the long list of kernel file system modules, as Linux mount can load them automatically.
* Don't test-mount partitions on which no known file system is detected.
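The idea of skipping test mounts on partitions with no detected file system can be sketched roughly as below. This is not the attached patch, only an illustration under stated assumptions: the function names are hypothetical, `blkid -o value -s TYPE` (from util-linux) stands in for whatever detection os-prober really uses, and the list of excluded types is an example rather than a complete policy.

```shell
#!/bin/sh
# Illustrative sketch only; function names are hypothetical.

# Print the file system type blkid detects on a device,
# or nothing if no known signature is found.
detect_fs_type() {
    blkid -o value -s TYPE "$1" 2>/dev/null
}

# Return 0 if a detected type is worth a test mount, 1 otherwise.
# An empty type (e.g. an extended partition) is always rejected.
is_probe_candidate() {
    case "$1" in
        ''|swap|LVM2_member|linux_raid_member|crypto_LUKS) return 1 ;;
        *) return 0 ;;
    esac
}

# Mount read-only with an explicit type, so that mount never
# tries one file system driver after another on unknown data.
probe_partition() {
    dev="$1"
    mnt="$2"
    fstype=$(detect_fs_type "$dev")
    if ! is_probe_candidate "$fstype"; then
        echo "skipping $dev: no usable file system detected" >&2
        return 1
    fi
    mount -o ro -t "$fstype" "$dev" "$mnt"
}
```

With this shape, a partition such as the extended /dev/sda4 from the report would be rejected before any file system module, XFS included, gets to look at it.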
https://bugzilla.novell.com/show_bug.cgi?id=851722
https://bugzilla.novell.com/show_bug.cgi?id=851722#c8

James Carter <jimc@math.ucla.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
       InfoProvider|jimc@math.ucla.edu          |

--- Comment #8 from James Carter <jimc@math.ucla.edu> 2013-12-07 05:46:45 UTC ---
Starting with the standard os-prober-1.61-3.1.2.x86_64 and /etc/default/grub saying GRUB_DISABLE_OS_PROBER=false: if I run grub2-mkconfig -o /boot/grub2/grub.cfg, syslog shows the volcanic eruption from XFS. I attached the complete performance, which I should have done the first time around. It's the same as in the oss.sgi.com bug report, except for slight differences in the call trace because they have kernel 3.7.x while we have 3.11.6. If GRUB_DISABLE_OS_PROBER=true, nothing at all is seen in syslog, as expected.

I installed os-prober-1.61-52.1.x86_64.rpm from your repo (and btrfsprogs, snapper, and its dependencies). With GRUB_DISABLE_OS_PROBER=false I ran grub2-mkconfig -o /boot/grub2/grub.cfg, and the debug messages indicated a much more sane logic in os-prober: in particular, no XFS involvement at all, and /dev/sda4 (the extended partition) is rejected out of hand.

Thank you for this improvement; I'll install it on my other machines. I look forward to seeing it in an official update. Someone with a known corrupt filesystem, or with other reasons not to mount it, should be sure to set GRUB_DISABLE_OS_PROBER=true.

I still haven't figured out what would have execed grub2-mkconfig so long after booting. Maybe purge-kernels.service, if it does purge a kernel. It has IOSchedulingClass=idle, which could have made it start late.

It would be nice to only install btrfsprogs and snapper-zypp-plugin if the machine actually had (or gained in the future) a BTRFS partition, but I can't imagine how to do a dynamic dependency of this kind.
https://bugzilla.novell.com/show_bug.cgi?id=851722
https://bugzilla.novell.com/show_bug.cgi?id=851722#c9

--- Comment #9 from James Carter <jimc@math.ucla.edu> 2013-12-07 05:49:27 UTC ---
Created an attachment (id=570713)
 --> (http://bugzilla.novell.com/attachment.cgi?id=570713)
Syslog when XFS mounts a totally non-XFS partition
https://bugzilla.novell.com/show_bug.cgi?id=851722
https://bugzilla.novell.com/show_bug.cgi?id=851722#c10

--- Comment #10 from Michael Chang <mchang@suse.com> 2013-12-09 07:32:05 UTC ---
(In reply to comment #8)
> I still haven't figured out what would have execed grub2-mkconfig so long after booting. Maybe purge-kernels.service, if it does purge a kernel. It has IOSchedulingClass=idle which could have made it start late.
The kernel package's %post (or %postun) scriptlet calls "update-bootloader --refresh" to update the bootloader config with new kernel entries. With grub2, that calls grub2-mkconfig, which in turn calls os-prober, so that is probably what happened.
> It would be nice to only install btrfsprogs and snapper-zypp-plugin if the machine actually had (or gained in the future) a BTRFS partition, but I can't imagine how to do a dynamic dependency of this kind.
os-prober should only require btrfsprogs, with no dependency on snapper(-zypp-plugin); at least, the output of `rpm -q --requires os-prober` doesn't seem to list snapper.

btrfsprogs is used by os-prober to detect distributions on Btrfs subvolumes (not necessarily in "/", as any subvolume could be set as the default root tree).

And yes, there really is no such dynamic dependency that could be driven by whether foreign partitions use Btrfs or not.
https://bugzilla.novell.com/show_bug.cgi?id=851722
https://bugzilla.novell.com/show_bug.cgi?id=851722#c11

--- Comment #11 from Michael Chang <mchang@suse.com> 2013-12-10 07:14:30 UTC ---
Submitted as SRID#210120.
https://bugzilla.novell.com/show_bug.cgi?id=851722
https://bugzilla.novell.com/show_bug.cgi?id=851722#c12

--- Comment #12 from Bernhard Wiedemann <bwiedemann@suse.com> 2013-12-10 11:00:55 CET ---
This is an autogenerated message for OBS integration:
This bug (851722) was mentioned in
https://build.opensuse.org/request/show/210325 Factory / os-prober