[Bug 656536] New: booting on some sw-RAID setups does not assemble root device in initrd
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c0

           Summary: booting on some sw-RAID setups does not assemble root device in initrd
    Classification: openSUSE
           Product: openSUSE 11.4
           Version: Factory
          Platform: All
        OS/Version: SuSE Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Basesystem
        AssignedTo: mmarek@novell.com
        ReportedBy: bwiedemann@novell.com
         QAContact: qa@suse.de
          Found By: ---
           Blocker: ---

My automated testing scripts for openSUSE can set up and boot software RAID. This works for LiveCD installs such as
http://www3.zq1.de/opensuse/video/openSUSE-KDE-LiveCD-x86_64-Build0681-RAID....
and
http://openqa.opensuse.org/opensuse/video/openSUSE-KDE-LiveCD-x86_64-Build09...
but failed for 11.3-DVD and current Factory DVD/NET installs in
http://openqa.opensuse.org/opensuse/video/openSUSE-DVD-x86_64-Build0909-RAID...

At 01:11 you can see the intended RAID setup, with / being /dev/md0 and /dev/vd[abcd]2 as the underlying block devices. At 02:14, however, you can see that in the initrd only one /dev/md0 is assembled, from /dev/vd[abcd]3. It cannot be used to boot the system, because it is the swap.

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c1

Bernhard Wiedemann <bwiedemann@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
       InfoProvider|                            |nfbrown@novell.com

--- Comment #1 from Bernhard Wiedemann <bwiedemann@novell.com> 2011-01-19 11:56:31 CET ---
Neil, do you have an idea what could be going wrong with this RAID install?

It is Reproducible: Always

It is possible to work around this by running the following in the initrd:

    mdadm --stop /dev/md0
    mdadm --assemble /dev/md0 /dev/vda2 /dev/vdb2 /dev/vdc2 /dev/vdd2
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c2

Neil Brown <nfbrown@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nfbrown@novell.com
       InfoProvider|nfbrown@novell.com          |hare@novell.com

--- Comment #2 from Neil Brown <nfbrown@novell.com> 2011-02-07 01:53:02 UTC ---
I'm sorry, but I have no idea what could be going wrong here.

Presumably mdadm.conf is getting the wrong content, and presumably that is created by mkinitrd-setup.sh in the mdadm package. But I don't know much about initrd setup.

Is there any way to get tracing info from mkinitrd?
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c3

Hannes Reinecke <hare@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
                 CC|                            |hare@novell.com
       InfoProvider|hare@novell.com             |

--- Comment #3 from Hannes Reinecke <hare@novell.com> 2011-02-09 09:12:15 UTC ---
Yes, just add 'linuxrc=trace' to the kernel command line.
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c

Andreas Jaeger <aj@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aj@novell.com
           Severity|Normal                      |Major
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c4

--- Comment #4 from Bernhard Wiedemann <bwiedemann@novell.com> 2011-03-01 12:13:55 CET ---
Created an attachment (id=416748)
 --> (http://bugzilla.novell.com/attachment.cgi?id=416748)
yast2 logs from RAID10 install
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c5

Andreas Jaeger <aj@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mmarek@novell.com
         AssignedTo|mmarek@novell.com           |kernel-maintainers@forge.provo.novell.com

--- Comment #5 from Andreas Jaeger <aj@novell.com> 2011-03-01 11:55:31 UTC ---
Michal is on FTO. Can anybody else look at this?
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c6

--- Comment #6 from Bernhard Wiedemann <bwiedemann@novell.com> 2011-03-01 13:23:05 CET ---
Created an attachment (id=416775)
 --> (http://bugzilla.novell.com/attachment.cgi?id=416775)
MD and other system info typescript
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c7

Bernhard Wiedemann <bwiedemann@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kasievers@novell.com

--- Comment #7 from Bernhard Wiedemann <bwiedemann@novell.com> 2011-03-01 19:17:39 CET ---
I verified that mdadm.conf has the correct content both in the initrd and on the rootfs.

http://openqa.opensuse.org/opensuse/permanent/bug/bnc656536-1.png
shows udev starting, then md0 being started with the (wrong) 4 devices, and then mdadm complaining that /dev/md0 is already in use. The complaint probably comes when it tries to assemble the correct RAID for the rootfs.

There are some udev rules about MD. Could those cause trouble here?
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c8

Neil Brown <nfbrown@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
       InfoProvider|                            |bwiedemann@novell.com

--- Comment #8 from Neil Brown <nfbrown@novell.com> 2011-03-01 20:26:39 UTC ---
The mdadm boot script in the initrd tries to assemble 'resumedev' and then 'rootdev'. So the message:

    mdadm: /dev/md0 has been started with 4 drives.

should be for resumedev. It appears that it is picking the right drives, but the wrong name (/dev/md0 rather than /dev/md1).

That means the command

    $mdadm -A $mdconf --uuid=$uuid "$dev"

must have the correct $uuid for the resumedev, but the wrong $dev.

The only way I can see it getting /dev/md0 for $dev is if the immediately preceding:

    if test -z "$dev"; then
        # fallback
        dev=/dev/md0
    fi

set dev, so:

    dev=$(get_md_name "$uuid")

must leave dev empty.

I can only see that happening if /etc/mdadm.conf doesn't exist in the initrd. Yet you say that it does, and the setup script certainly seems to create it. However, it will definitely look different from what is on the root filesystem.

Can you please report exactly the /etc/mdadm.conf from the initrd?

Thanks.
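The failure path reasoned out in comment #8 can be reproduced outside an initrd. The following is a minimal sketch, not the actual mkinitrd boot script: lookup_md_name is a hypothetical stand-in for get_md_name, assumed here to search an mdadm.conf-style file for a UUID and print the matching array's device name, and both UUIDs are invented placeholders.

```shell
#!/bin/sh
# Hypothetical stand-in for get_md_name: print the device name from the
# ARRAY line in config file $1 whose UUID equals $2, or nothing at all.
lookup_md_name() {
    [ -r "$1" ] || return 0
    awk -v want="UUID=$2" '
        $1 == "ARRAY" {
            for (i = 3; i <= NF; i++)
                if ($i == want) { print $2; exit }
        }' "$1"
}

# Mirror of the fallback quoted in comment #8: an empty lookup result
# silently becomes /dev/md0.
pick_device() {
    dev=$(lookup_md_name "$1" "$2")
    if test -z "$dev"; then
        dev=/dev/md0   # fallback
    fi
    echo "$dev"
}

# Sample config listing a single array; the UUIDs are made-up placeholders.
conf=$(mktemp)
echo 'ARRAY /dev/md0 metadata=1.0 name=linux:0 UUID=aaaaaaaa:aaaaaaaa:aaaaaaaa:aaaaaaaa' > "$conf"

root_dev=$(pick_device "$conf" aaaaaaaa:aaaaaaaa:aaaaaaaa:aaaaaaaa)
resume_dev=$(pick_device "$conf" bbbbbbbb:bbbbbbbb:bbbbbbbb:bbbbbbbb)
echo "root array   -> $root_dev"
echo "resume array -> $resume_dev"
rm -f "$conf"
```

Under these assumptions both arrays resolve to /dev/md0, so whichever is assembled first takes the name and the second attempt finds the device already in use.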
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c9

Bernhard Wiedemann <bwiedemann@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
       InfoProvider|bwiedemann@novell.com       |

--- Comment #9 from Bernhard Wiedemann <bwiedemann@novell.com> 2011-03-01 22:15:52 CET ---
The initrd's /etc/mdadm.conf just has

    ARRAY /dev/md0 metadata=1.0 name=linux:0 UUID=38bc237c:784e1716:c9203720:4f48dd1f

which is the entry for md0, the device containing the rootfs.

Would this mean that get_md_name would not find the swap's ID and set dev="", which would make it default to /dev/md0, blocking the real root dev from being activated as md0?

I get working RAID10 installs from the LiveCD and found that the initrd's mdadm.conf there contains nearly the same:

    ARRAY /dev/md0 metadata=1.0 name=linux.site:0 UUID=...

Just the hostname string differs. And /proc/cmdline has

    root=/dev/md0 resume=/dev/md2

there. I think that is what makes it work on LiveCD installs.

This means that this bug always happens on systems installed from DVD/NET that have both swap and rootfs on software RAID, which is common if you do RAID.
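To compare installs on this point, the root= and resume= values can be pulled out of a kernel command line. This is only a sketch: cmdline_value is a hypothetical helper name, and the sample string is the LiveCD command line quoted in comment #9.

```shell
#!/bin/sh
# Print the value of the first "key=value" word whose key matches $1,
# searching the remaining arguments; print nothing if the key is absent.
cmdline_value() {
    key=$1
    shift
    for word in "$@"; do
        case $word in
            "$key"=*) echo "${word#*=}"; return 0 ;;
        esac
    done
}

# Command line as reported for the working LiveCD install.
cmdline='root=/dev/md0 resume=/dev/md2'

# $cmdline is left unquoted on purpose so that it splits into words.
echo "root:   $(cmdline_value root $cmdline)"
echo "resume: $(cmdline_value resume $cmdline)"
```

On a running system the same helper could be fed from the real file, for example cmdline_value resume $(cat /proc/cmdline).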
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c10

Neil Brown <nfbrown@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
       InfoProvider|                            |hare@novell.com

--- Comment #10 from Neil Brown <nfbrown@novell.com> 2011-03-01 22:13:00 UTC ---
Thanks.

There seem to be (at least) two bugs here.

1/ When run from the installer, mkinitrd doesn't know about a resumedev and so doesn't tell the various scripts about one. You can confirm this by looking in y2logmkinitrd in the YaST logs. It mentions "Root device:" but not "Resume device:".

2/ When given a resume device like /dev/disk/by-id/md-uuid-xxxxxxx, the mdadm boot script behaves poorly if that uuid is not found in mdadm.conf and assumes /dev/md0. It really doesn't need to assume anything at all.

The important difference between the LiveCD and the DVD/NET installs is that the LiveCD sets

    resume=/dev/md2

while the DVD/NET install sets

    resume=/dev/disk/by-id/md-uuid-xxxxxxxx

I can fix '2' by changing:

    if test -z "$dev"; then
        # fallback
        dev=/dev/md0
    fi
    $mdadm -A $mdconf --uuid=$uuid "$dev"

to

    if test -z "$dev"; then
        $mdadm -A $mdconf --uuid=$uuid
    else
        $mdadm -A $mdconf --uuid=$uuid "$dev"
    fi

The other needs to be handled by someone who understands mkinitrd and installers...

Hannes: can you make any comment on '1'?
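The effect of the script change proposed in comment #10 can be seen in a dry run. In this sketch the mdadm variable is replaced by an echo stub so that nothing is actually assembled, the value of mdconf is an assumption (the real boot script sets it itself), the assemble wrapper is a hypothetical name, and the UUIDs are invented placeholders.

```shell
#!/bin/sh
# Stub: print the command line instead of running the real mdadm,
# so the sketch is safe to run on any machine.
mdadm='echo mdadm'
# Assumed value for illustration only; the real script defines $mdconf.
mdconf='--config=/etc/mdadm.conf'

# Hypothetical wrapper around the branch proposed in comment #10.
assemble() {
    uuid=$1
    dev=$2
    if test -z "$dev"; then
        # Name unknown: omit it rather than forcing /dev/md0.
        $mdadm -A $mdconf --uuid=$uuid
    else
        $mdadm -A $mdconf --uuid=$uuid "$dev"
    fi
}

# Resume array whose name was not found in mdadm.conf (placeholder UUID).
assemble bbbbbbbb:bbbbbbbb:bbbbbbbb:bbbbbbbb ""
# Root array whose name is known (placeholder UUID).
assemble aaaaaaaa:aaaaaaaa:aaaaaaaa:aaaaaaaa /dev/md0
```

With the stub in place, the first call prints a command with no device name and the second keeps /dev/md0, so the resume array no longer claims the root array's name.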
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c11

--- Comment #11 from Neil Brown <nfbrown@novell.com> 2011-03-01 22:18:17 UTC ---
mdadm change submitted to Factory (id 63188) and openSUSE:11.4 (id 63189)
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c12

--- Comment #12 from Bernhard Wiedemann <bwiedemann@novell.com> 2011-03-04 12:18:40 CET ---
The mdadm fix for bug #2 made 11.4 work (on a fresh install in openQA tests).

Fixing bug #1 would still be good, though.
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c13

Hannes Reinecke <hare@novell.com> changed:

           What    |Removed                                  |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                                 |NEW
       InfoProvider|hare@novell.com                          |
         AssignedTo|kernel-maintainers@forge.provo.novell.com|mmarek@novell.com

--- Comment #13 from Hannes Reinecke <hare@novell.com> 2011-03-11 08:53:09 UTC ---
Indeed. Passing on to the mkinitrd maintainer.
https://bugzilla.novell.com/show_bug.cgi?id=656536

https://bugzilla.novell.com/show_bug.cgi?id=656536#c14

Michal Marek <mmarek@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

--- Comment #14 from Michal Marek <mmarek@suse.com> 2013-09-30 08:09:27 UTC ---
openSUSE <= 12.1 is no longer active. If you still can reproduce the problem with openSUSE 12.3 or Factory, please reopen the bug and change the product field accordingly. Sorry that I did not have time to address this bug.