[Bug 752869] New: md raid1 doesn't boot after removing 1 disk, when the server is turned off
https://bugzilla.novell.com/show_bug.cgi?id=752869#c0

           Summary: md raid1 doesn't boot after removing 1 disk, when the server is turned off
    Classification: openSUSE
           Product: openSUSE 12.1
           Version: Final
          Platform: x86-64
        OS/Version: openSUSE 12.1
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: Other
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: wvvelzen@gmail.com
         QAContact: qa-bugs@suse.de

Created an attachment (id=482002)
 --> (http://bugzilla.novell.com/attachment.cgi?id=482002)
State after booting with 1 raid disk removed

I was testing an md raid1 setup for different failure scenarios, and came across the following problem. Given the following raid setup:

# grep '/dev/md' /etc/fstab
/dev/md0  swap   swap  defaults                                 0 0
/dev/md2  /      ext4  noatime,data=writeback,noacl,user_xattr  1 1
/dev/md1  /boot  ext4  noatime,data=writeback,noacl,user_xattr  1 2
/dev/md3  /home  ext4  noatime,data=writeback,noacl,user_xattr  1 2

# cat /proc/mdstat
Personalities : [raid1] [raid0] [raid10] [raid6] [raid5] [raid4]
md3 : active raid1 sdb5[2] sda5[0]
      248460152 blocks super 1.0 [2/2] [UU]
      bitmap: 0/2 pages [0KB], 65536KB chunk

md1 : active raid1 sdb2[2] sda2[0]
      522228 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md0 : active raid1 sdb1[2] sda1[0]
      2095092 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md2 : active raid1 sdb3[2] sda3[0]
      41946040 blocks super 1.0 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>

When I turn off the server and remove one disk (the second one in this case), the server doesn't boot properly and ends up in the emergency console. I will attach a screen photograph of this situation. Switching between systemd and SysV init with F5 on the boot screen doesn't make a difference. Starting in 'failsafe' mode doesn't help either.
Adding an empty disk, or a pre-partitioned disk, in place of the removed disk doesn't help either. On previous versions of openSUSE (10.3) this test case worked just fine.

However, when I hot-remove one disk of the raid1 array on a running server, so that md knows the array is degraded and can save this state to the md superblock, I can reboot just fine without any problems.

Reproducible: Always

Expected Results:
The raid1 in this state should have just booted the OS. This is the purpose of having a raid1 in the first place!

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=752869#c1
--- Comment #1 from kk zhang
https://bugzilla.novell.com/show_bug.cgi?id=752869#c
Petr Uzel
https://bugzilla.novell.com/show_bug.cgi?id=752869#c2
--- Comment #2 from Wilfred van Velzen
mdadm --stop /dev/md0
mdadm --stop /dev/md2

mdadm --assemble /dev/md0 /dev/sda1 --run
mdadm --add /dev/md0 /dev/sdb1
[repeated for all the arrays]

And wait for the synchronization to finish before rebooting, checking this with:

cat /proc/mdstat
So it isn't an unrecoverable state, but still rather inconvenient. Especially if the machine is at a co-location and you have to drive for X amount of time to get to the location, find out what happened, and fix it...
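The check above can be scripted. A minimal sketch (my own, not from the bug) that flags degraded arrays in mdstat-formatted output; a sample string stands in for the real /proc/mdstat so it runs without a live md setup:

```shell
#!/bin/sh
# Sketch: detect degraded raid1 arrays in mdstat-style output.
# The sample below is illustrative; on a real system you would read
# /proc/mdstat instead.
mdstat_sample='md2 : active raid1 sda3[0]
      41946040 blocks super 1.0 [2/1] [U_]
md1 : active raid1 sda2[0] sdb2[2]
      522228 blocks super 1.0 [2/2] [UU]'

degraded=$(printf '%s\n' "$mdstat_sample" | awk '
    /^md/ { array = $1 }            # remember which array this stanza is for
    /\[U*_+U*\]/ { print array }    # a "_" in the status means a missing member
')
echo "degraded arrays: $degraded"   # prints: degraded arrays: md2
```

Replacing the sample with `$(cat /proc/mdstat)` turns this into a quick post-resync sanity check.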
https://bugzilla.novell.com/show_bug.cgi?id=752869#c3
--- Comment #3 from Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=752869#c4
--- Comment #4 from Wilfred van Velzen
It sounds like you don't have an mdadm.conf in your initrd. Did you convert this from non-RAID to RAID without rerunning mkinitrd ??
The raid setup was created during installation of openSUSE...
Could you:

cd /tmp
zcat /boot/initrd | cpio -idv
and see if "/tmp/etc/mdadm.conf" gets created? If it does, please attach it.
It is:

AUTO -all
ARRAY /dev/md2 metadata=1.0 name=linux:2 UUID=bf722ec4:89570cf1:4dd162d9:8c34c87b
ARRAY /dev/md0 metadata=1.0 name=linux:0 UUID=fd8ef37a:564a9254:72876573:b839300c

It's different from the one in /etc though:

DEVICE containers partitions
ARRAY /dev/md0 UUID=fd8ef37a:564a9254:72876573:b839300c
ARRAY /dev/md1 UUID=ada7fa3a:9242009a:cd9f63b7:a3d1b3d3
ARRAY /dev/md2 UUID=bf722ec4:89570cf1:4dd162d9:8c34c87b
ARRAY /dev/md3 UUID=a938e1fc:3a39681c:8f8d2c1b:789e6370
If it doesn't, please run mkinitrd, then run the above commands again and see if an mdadm.conf is there.
I did this anyway, but the contents of /tmp/etc/mdadm.conf remain the same.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
https://bugzilla.novell.com/show_bug.cgi?id=752869#c5
--- Comment #5 from Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=752869#c6
--- Comment #6 from Wilfred van Velzen
Thanks.
I think /lib/udev/rules.d/64-md-raid.rules in the initrd might be causing the problem.
This file doesn't exist on the system:

nnn:/etc/udev/rules.d # ls -l
total 252
-rw-r--r-- 1 root root    571 Oct 29 19:37 51-lirc.rules
-rw-r--r-- 1 root root  13890 Oct 30 07:23 55-hpmud.rules
-rw-r--r-- 1 root root 161732 Mar 22 18:11 55-libsane.rules
-rw-r--r-- 1 root root    692 Oct 30 07:23 56-hpmud_support.rules
-rw-r--r-- 1 root root  46551 Mar 22 18:11 56-sane-backends-autoconfig.rules
-rw-r--r-- 1 root root   2016 Oct 29 20:13 70-kpartx.rules
-rw-r--r-- 1 root root    532 Dec 15 15:15 70-persistent-cd.rules
-rw-r--r-- 1 root root    948 Mar 28 11:09 70-persistent-net.rules
-rw-r--r-- 1 root root    182 Oct 29 20:13 71-multipath.rules
-rw-r--r-- 1 root root     93 Oct 29 20:09 99-iwlwifi-led.rules
To confirm this I'd like you to create an initrd without this file and see what happens.
I think the easiest might be to edit /lib/mkinitrd/scripts/setup-udev.sh (after taking a copy just to be safe) and remove the line:
64-md-raid.rules \
and then run mkinitrd again.
Could you try that and let me know the result?
So this file isn't included by mkinitrd anyway...?

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
https://bugzilla.novell.com/show_bug.cgi?id=752869#c7
--- Comment #7 from Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=752869#c8
--- Comment #8 from Wilfred van Velzen
Look in /lib/udev/rules.d, not /etc/udev/rules.d
Ah, sorry, I should have looked more carefully...
(confusing, isn't it... don't look in /run/udev/rules.d whatever you do :-)
It's empty?

Anyway... I edited /lib/mkinitrd/scripts/setup-udev.sh and ran mkinitrd, and checked with 'zcat /boot/initrd | cpio -idv' what was in /tmp/etc/mdadm.conf. It still contains just the two ARRAY lines for /dev/md2 and /dev/md0, as shown before.

Regardless, I tried the cold-remove-one-raid-disk test. But I still get to the emergency console, like before, when I start the server in this state. After this, when I reboot with the two disks, it boots but the array is in degraded mode:

:/proc # cat mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid1 sda2[0] sdb2[2]
      522228 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md3 : active raid1 sda5[0] sdb5[2]
      248460152 blocks super 1.0 [2/2] [UU]
      bitmap: 1/2 pages [4KB], 65536KB chunk

md2 : active raid1 sda3[0]
      41946040 blocks super 1.0 [2/1] [U_]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md0 : active (auto-read-only) raid1 sda1[0] sdb1[2]
      2095092 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

unused devices: <none>

I don't think this happened before, but this is easily fixed with some mdadm commands...

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
https://bugzilla.novell.com/show_bug.cgi?id=752869#c9
--- Comment #9 from Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=752869#c10
--- Comment #10 from Wilfred van Velzen
And you checked that "64-md-raid.rules" wasn't in the initrd in any directory?
Yep.
Very strange.
Would it be possible for me to get a copy of your initrd? It might be too big to attach, in which case you need to find somewhere to upload it that I could grab it from.
http://www.sercom.nl/tmp/initrd.zip
https://bugzilla.novell.com/show_bug.cgi?id=752869#c11
--- Comment #11 from Wilfred van Velzen
https://bugzilla.novell.com/show_bug.cgi?id=752869#c12
--- Comment #12 from Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=752869#c13
--- Comment #13 from Wilfred van Velzen
https://bugzilla.novell.com/show_bug.cgi?id=752869#c14
--- Comment #14 from Wilfred van Velzen
Could you try again with one disk missing and get another photo of the emergency console screen?
I did a 'zypper up', rebooted, and made sure the state of the array was OK. It was:

/proc # cat mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md3 : active raid1 sda5[0] sdb5[2]
      248460152 blocks super 1.0 [2/2] [UU]
      bitmap: 0/2 pages [0KB], 65536KB chunk

md1 : active raid1 sda2[0] sdb2[2]
      522228 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md2 : active raid1 sda3[0] sdb3[2]
      41946040 blocks super 1.0 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md0 : active (auto-read-only) raid1 sda1[0] sdb1[2]
      2095092 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

unused devices: <none>

(Except for the auto-read-only state of the raid that contains the swap partition. But I think this is a different issue.)

After this I did the test with 1 disk removed. This got me to the emergency mode again (see the new attachment). After adding the disk and rebooting, the state of the array was degraded again, like before, for the root partition:

/proc # cat mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid1 sda2[0] sdb2[2]
      522228 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md3 : active raid1 sda5[0] sdb5[2]
      248460152 blocks super 1.0 [2/2] [UU]
      bitmap: 0/2 pages [0KB], 65536KB chunk

md2 : active raid1 sda3[0]
      41946040 blocks super 1.0 [2/1] [U_]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md0 : active (auto-read-only) raid1 sda1[0] sdb1[2]
      2095092 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
https://bugzilla.novell.com/show_bug.cgi?id=752869#c15
--- Comment #15 from Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=752869#c16
--- Comment #16 from Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=752869#c17
--- Comment #17 from Wilfred van Velzen
Could you please install the appropriate rpm from here:
Could you provide a (more) direct download link? The "Go to download repository" link on that page gives a 404 error. And the regular openSUSE package search doesn't show your repository either...?

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
https://bugzilla.novell.com/show_bug.cgi?id=752869#c18
--- Comment #18 from Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=752869#c19
--- Comment #19 from Wilfred van Velzen
https://bugzilla.novell.com/show_bug.cgi?id=752869#c20
--- Comment #20 from Christian Boltz
Sorry, I thought it was public... I wonder how I make it public.
AFAIK, home:*:branches:* projects have the publish flag disabled by default. Enable it in the project config - or just use 'osc getbinaries' to download the RPMs.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
https://bugzilla.novell.com/show_bug.cgi?id=752869#c21
--- Comment #21 from Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=752869#c22
--- Comment #22 from Wilfred van Velzen
Thanks for testing and the screen shot. I think we are getting there.
I suspect that if you switched back to sysvinit now it would work.
I didn't try that.
The remaining problem is an interaction with systemd.
[...]
If this works as expected I'll ask for a maintenance update with these fixes.
This worked as expected. The server now booted fine with the one disk removed!
Thanks for your help in isolating this (and sorry that it has taken 2 months!).
No problem, I'm used to much worse response times in bugzilla. :-/ This wasn't even a bug you would notice during normal operation. And I'm happy to help improve openSUSE!

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
https://bugzilla.novell.com/show_bug.cgi?id=752869#c23
--- Comment #23 from Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=752869#c24
--- Comment #24 from Bernhard Wiedemann
https://bugzilla.novell.com/show_bug.cgi?id=752869#c25
--- Comment #25 from Marcus Meissner
https://bugzilla.novell.com/show_bug.cgi?id=752869#c26
--- Comment #26 from Neil Brown
https://bugzilla.novell.com/show_bug.cgi?id=752869#c27
--- Comment #27 from Swamp Workflow Management