[Bug 230733] New: raid md devices may get different minor numbers under 10.2
https://bugzilla.novell.com/show_bug.cgi?id=230733

           Summary: raid md devices may get different minor numbers under 10.2
           Product: openSUSE 10.2
           Version: Final
          Platform: i586
        OS/Version: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Basesystem
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: walter.haidinger@gmx.at
         QAContact: qa@suse.de

Under 10.2 (rescue system too), some of my md devices were assigned different minor numbers than under 10.1 and earlier, despite being configured in /etc/mdadm.conf.

It seems that the minor number stored in the md superblock is preferred now (use mdadm --misc --detail /dev/mdX to list it).

The fix is to stop the array and assemble it manually once, updating the superblock with '--update=super-minor'. See mdadm(8) for details.

E.g., to update a raid-1 mirror /dev/md5 of hd[ac]5 and set the preferred minor to 5:

# mdadm --manage --stop /dev/md5
# mdadm --assemble /dev/md5 --update=super-minor /dev/hda5 /dev/hdc5

Yes, this may not really be a bug and is perhaps not a SuSE bug either (maybe it is caused by the new kernel 2.6.18?), but I think it is worth a note, and this entry may help someone resolve the problem.

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
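The stop-and-reassemble fix from the initial report can be sketched as a small dry-run helper. This is only an illustration: the function name print_super_minor_fix is made up here, and it prints the two mdadm commands quoted in the report instead of running them, so the output can be reviewed before anything touches an array.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not part of mdadm or openSUSE).
# Prints, but does not execute, the two commands from the report:
# stop the array, then reassemble it with --update=super-minor.
# Usage: print_super_minor_fix /dev/md5 /dev/hda5 /dev/hdc5
print_super_minor_fix() {
    array=$1
    if [ -z "$array" ] || [ "$#" -lt 2 ]; then
        echo "usage: print_super_minor_fix ARRAY MEMBER..." >&2
        return 1
    fi
    shift
    # "$*" joins the remaining member devices with single spaces.
    echo "mdadm --manage --stop $array"
    echo "mdadm --assemble $array --update=super-minor $*"
}
```

Called as print_super_minor_fix /dev/md5 /dev/hda5 /dev/hdc5, it prints the same two commands as in the report, without the root prompt.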
https://bugzilla.novell.com/show_bug.cgi?id=230733

chrubis@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|bnc-team-                   |mmarek@novell.com
                   |screening@forge.provo.novell|
                   |.com                        |
https://bugzilla.novell.com/show_bug.cgi?id=230733

mmarek@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
https://bugzilla.novell.com/show_bug.cgi?id=230733

mmarek@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |walter.haidinger@gmx.at

------- Comment #1 from mmarek@novell.com 2007-01-03 07:35 MST -------
What are the contents of /etc/mdadm.conf on 10.2, how are the md devices supposed to be numbered, and how does/did the actual numbering differ? I don't see any differences among 10.0 - 10.2 with identical /etc/mdadm.conf.
https://bugzilla.novell.com/show_bug.cgi?id=230733

------- Comment #2 from walter.haidinger@gmx.at 2007-01-03 14:30 MST -------
I first noticed the change when booting into the rescue system without /etc/mdadm.conf. The md devices were assembled differently than under 10.1.

I did _not_ change my mdadm.conf during the upgrade from 10.1 to 10.2. My raid config is as follows:

md0 : active raid1 hde1[1] hda1[0]
md1 : active raid1 hdg1[1] hdc1[0]
md10 : active raid5 hdg5[3] hde5[2] hdc5[0] hda5[1]
md11 : active raid5 hdg6[3] hde6[2] hdc6[0] hda6[1]
md12 : active raid5 hdg7[3] hde7[2] hdc7[0] hda7[1]
md13 : active raid5 hdg8[3] hde8[2] hdc8[0] hda8[1]
md14 : active raid5 hdg9[3] hde9[2] hdc9[1] hda9[0]

/etc/mdadm.conf

DEVICE /dev/hd[aceg]*
ARRAY /dev/md0 level=raid1 num-devices=2 devices=/dev/hd[ae]1 UUID=e6679ec5:2441c872:ed53428c:c96ac811
ARRAY /dev/md1 level=raid1 num-devices=2 devices=/dev/hd[cg]1 UUID=170960f6:4f175a32:7fc98f9c:70889186
ARRAY /dev/md10 level=raid5 num-devices=4 devices=/dev/hd[aceg]5 UUID=c00eb0ba:b16fc743:89896896:fd26ad33
ARRAY /dev/md11 level=raid5 num-devices=4 devices=/dev/hd[aceg]6 UUID=9fdbdb21:d670f738:38578622:6eb972a6
ARRAY /dev/md12 level=raid5 num-devices=4 devices=/dev/hd[aceg]7 UUID=a6be8d41:ac89a245:c5933584:a34c710e
ARRAY /dev/md13 level=raid5 num-devices=4 devices=/dev/hd[aceg]8 UUID=e3a4fab0:ef909a59:982fe73c:7e41cd84
ARRAY /dev/md14 level=raid5 num-devices=4 devices=/dev/hd[aceg]9 UUID=e6449904:51f246db:e896c64c:c692f28c

After upgrading to 10.2, md1 was assembled as md6, md10 as md3, md11 as md4 and md12 as md5. md13 and md14 did not change.

The changes indicate that, despite having an mdadm.conf with UUID entries, the arrays were assembled according to an earlier configuration, i.e. the one from before I extended the setup with partitions 8 and 9 because of bigger drives. Back then, when I added md13 and md14, I changed all raid5 md devices to start from minor 10.

The settings in mdadm.conf were obviously ignored. Instead, the md minor number stored in the md superblock at the initial md creation was probably used. Please note that the md devices were always (i.e. reproducibly) assembled with the same "wrong" (stored?) minor number.

I had to stop the array and update the super-minor as described in my initial report to resolve the issue.

I'm sorry, but I don't have any logs because I was working in single user mode only to resolve the problem.

Btw, LVM did manage to assemble all LVs (with some warnings, of course) even though some PVs lived on the "wrong" md device. :-)

Finally, the steps to reproduce the problem are probably:

* Create md devices with, I don't know, 10.1, 10.0, 9.3 or earlier? I can't recall when I moved to raid, maybe 3-4 years ago.
* Change the md minors in /etc/mdadm.conf. This should work in < 10.2.
* Boot 10.2, and the md devices should be assembled with the minors from the initial creation.

Please tell me if you can reproduce the problem with the steps above.

I've had the problem on two completely different machines: one at home (the setup above, after upgrading 10.1 to 10.2) and one at work with just raid1 mirrors (running 10.1; booting the 10.2 rescue system gave different md minors, which broke the system backup scripts).
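To check which minor each array is supposed to get, the ARRAY lines of an mdadm.conf like the one in comment #2 can be reduced to UUID/minor pairs. The following is a minimal sketch, assuming ARRAY lines of the form shown in this thread (device name /dev/mdN as the second word and a UUID= keyword somewhere on the line); the function name conf_minors is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read an mdadm.conf on stdin and print
# "<uuid> <configured minor>" for every ARRAY line that has a UUID.
# Assumes the array device is written as /dev/mdN, as in this thread.
conf_minors() {
    awk '$1 == "ARRAY" {
        minor = $2
        sub(/^\/dev\/md/, "", minor)          # /dev/md10 -> 10
        uuid = ""
        for (i = 3; i <= NF; i++)
            if (toupper(substr($i, 1, 5)) == "UUID=")
                uuid = substr($i, 6)          # text after "UUID="
        if (uuid != "")
            print uuid, minor
    }'
}
```

Running conf_minors < /etc/mdadm.conf gives one line per array, which can then be compared by hand with the Preferred Minor reported by mdadm --misc --detail /dev/mdX.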
https://bugzilla.novell.com/show_bug.cgi?id=230733

------- Comment #3 from mmarek@novell.com 2007-01-08 06:55 MST -------
I still can't reproduce it :-( E.g. on 10.2 I have (on a test machine with just one disk):

# mdadm -E /dev/hda{10..13} | grep Preferred
Preferred Minor : 7
Preferred Minor : 7
Preferred Minor : 6
Preferred Minor : 6

but the arrays are still correctly assembled according to /etc/mdadm.conf:

# cat /etc/mdadm.conf
DEVICE /dev/hda[6789] /dev/hda1*
ARRAY /dev/md0 level=raid1 UUID=8ffc6916:6b12842c:be2775f4:33af64ec devices=/dev/hda[67]
ARRAY /dev/md1 level=raid1 UUID=c624765c:ede72818:1a618fac:cf0fd3ca devices=/dev/hda[89]
ARRAY /dev/md20 level=raid1 UUID=e3dc171e:952e6323:6d298914:5c15425c devices=/dev/hda1[23]
ARRAY /dev/md21 level=raid1 UUID=a3f26537:b070ec51:51a34dd9:67206e36 devices=/dev/hda1[01]

# cat /proc/mdstat
Personalities : [raid1]
md21 : active raid1 hda10[0] hda11[1]
      513984 blocks [2/2] [UU]
md20 : active raid1 hda12[0] hda13[1]
      513984 blocks [2/2] [UU]
md1 : active raid1 hda8[0] hda9[1]
      513984 blocks [2/2] [UU]
md0 : active raid1 hda6[0] hda7[1]
      513984 blocks [2/2] [UU]
unused devices: <none>

Neil, perhaps you have an idea what could be wrong here?
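The grep in comment #3 drops the device names, so it is not obvious which partition carries which preferred minor. A minimal sketch that keeps the pairing follows. It assumes that mdadm -E starts each device's report with a line such as "/dev/hda10:" before the "Preferred Minor" field; that layout is not shown in this thread and should be checked against the real output. The function name examine_minors is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "mdadm -E DEVICE..." output on stdin and print
# "<device> <preferred minor>" pairs.
# Assumption: each per-device report begins with a "/dev/...:" header line.
examine_minors() {
    awk '
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }   # remember current device
        /Preferred Minor/ { print dev, $NF }          # last field is the minor
    '
}
```

A possible invocation would be mdadm -E /dev/hda10 /dev/hda11 | examine_minors.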
https://bugzilla.novell.com/show_bug.cgi?id=230733

------- Comment #4 from walter.haidinger@gmx.at 2007-01-08 15:00 MST -------
Please create the array under 10.1 and then see how 10.2 assembles it. I'll verify the machine at work tomorrow and post the results.
https://bugzilla.novell.com/show_bug.cgi?id=230733

------- Comment #5 from nfbrown@novell.com 2007-01-08 17:09 MST -------
My guess is that the initrd has some old information in it.

In 2.6, if you assemble an array as a different device (i.e. with a different minor number), the new minor number gets written to the superblock. So if you have ever assembled with the new device names, the old number will have been overwritten.

Also, listing 'devices=' as well as 'uuid=' in mdadm.conf is normally not a good idea: if the devices change names, the array would not get assembled properly. However, I don't think that is causing the current problem.
https://bugzilla.novell.com/show_bug.cgi?id=230733

------- Comment #6 from walter.haidinger@gmx.at 2007-01-09 02:08 MST -------
initrd is unlikely. At home I don't use an initrd at all (I added all drivers required for booting to the kernel config), and the machine at work got different minors when I booted with the 10.2 rescue(!) system. I'll try to reproduce the latter.

I'm listing both devices= and uuid= deliberately because I want mdadm.conf to match my _current_ raid configuration, i.e. disabling autoconfig to some extent. That is, if my raid config changes and I forget to update mdadm.conf, the arrays will _not_ be assembled properly, which I will in all likelihood notice. ;-)
https://bugzilla.novell.com/show_bug.cgi?id=230733

walter.haidinger@gmx.at changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|walter.haidinger@gmx.at     |

------- Comment #7 from walter.haidinger@gmx.at 2007-01-09 05:58 MST -------
Just verified: The md minors _are_ assigned differently!

Setup running SuSE 10.1:

# uname -a
Linux vfs3a 2.6.16.27-0.6-smp #1 SMP Wed Dec 13 09:34:50 UTC 2006 i686 i686 i386 GNU/Linux

# cat /proc/mdstat
Personalities : [raid1] [raid0] [raid5] [raid4] [linear]
md8 : active raid1 hda8[0] hdc8[1]
      53978752 blocks [2/2] [UU]
md7 : active raid1 hda7[0] hdc7[1]
      19531392 blocks [2/2] [UU]
md5 : active raid1 hda5[0] hdc5[1]
      977152 blocks [2/2] [UU]
md6 : active raid1 hda6[0] hdc6[1]
      3418496 blocks [2/2] [UU]
md1 : active raid1 hda1[0] hdc1[1]
      244288 blocks [2/2] [UU]
unused devices: <none>

# cat /etc/mdadm.conf
MAILADDR root@localhost
DEVICE /dev/hd[ac]*
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=b24674aa:96deba97:142cf4ed:5198e303
ARRAY /dev/md5 level=raid1 num-devices=2 UUID=8b84dadc:f6b5b1f4:8d60fa0f:75f43b8b
ARRAY /dev/md6 level=raid1 num-devices=2 UUID=67c543ff:e6943545:9a7444f8:6da0e118
ARRAY /dev/md7 level=raid1 num-devices=2 UUID=79dbb520:5c91fb25:e097a699:2c618cbb
ARRAY /dev/md8 level=raid1 num-devices=2 UUID=897852f3:01c08a37:7993e66d:fc888ad9

# mdadm --misc --detail /dev/md{1,5,6,7,8} | grep Preferred
Preferred Minor : 1
Preferred Minor : 5
Preferred Minor : 6
Preferred Minor : 7
Preferred Minor : 8

Just installed the kernel update with a newly created initrd.

However, here is the configuration when booting with the 10.2 rescue system:

# uname -a
Linux Rescue 2.6.18.2-34-default #1 SMP Mon Nov 27 11:46:27 UTC 2006 i686 i686 i386 GNU/Linux

mdadm.conf:
Personalities : [raid1]
md4 : active raid1 hda8[0] hdc8[1]
      53978752 blocks [2/2] [UU]
md3 : active raid1 hda7[0] hdc7[1]
      19531392 blocks [2/2] [UU]
md2 : active raid1 hda6[0] hdc6[1]
      3418496 blocks [2/2] [UU]
md1 : active raid1 hda5[0] hdc5[1]
      977152 blocks [2/2] [UU]
md0 : active raid1 hda1[0] hdc1[1]
      244288 blocks [2/2] [UU]
unused devices: <none>

# mdadm --misc --detail /dev/md* 2> /dev/null | grep Preferred
Preferred Minor : 0
Preferred Minor : 1
Preferred Minor : 2
Preferred Minor : 3
Preferred Minor : 4

Obviously, the minors from the running 10.1 are not honored and new minors are assigned.

FYI: all partitions have id 0xfd.
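The renumbering in comment #7 can be made explicit by matching arrays across two saved copies of /proc/mdstat by their member devices. The sketch below is illustrative only: the function name mdstat_renames and the idea of saving both listings to files are assumptions, and it relies on the members being listed in the same order in both files, as they are in the two listings quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: compare two saved /proc/mdstat listings and print
# "<old name> -> <new name> (<members>)" for every array whose member set
# shows up under a different md name in the second file.
# Assumptions: both files are non-empty and list members in the same order.
# Usage: mdstat_renames mdstat.old mdstat.new
mdstat_renames() {
    awk '
    $2 == ":" && $3 == "active" {
        key = ""
        for (i = 5; i <= NF; i++) {
            dev = $i
            sub(/\[[0-9]+\].*$/, "", dev)     # hda8[0] -> hda8
            key = key " " dev
        }
        if (FNR == NR)                        # still reading the first file
            old[key] = $1
        else if ((key in old) && old[key] != $1)
            print old[key] " -> " $1 " (" substr(key, 2) ")"
    }' "$1" "$2"
}
```

With the 10.1 listing as the first file and the rescue-system listing as the second, each renamed mirror is reported on its own line, for example the hda8/hdc8 pair moving from md8 to md4.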
https://bugzilla.novell.com/show_bug.cgi?id=230733

------- Comment #8 from mmarek@novell.com 2007-01-09 07:27 MST -------
OK, with the 10.2 rescue system (that is, w/o any mdadm.conf) it seems like I can reproduce it -> boot.md runs /sbin/mdrun, and this script somehow doesn't get it right... I'll look into it further...
https://bugzilla.novell.com/show_bug.cgi?id=230733

------- Comment #9 from walter.haidinger@gmx.at 2007-01-09 09:01 MST -------
Is boot.md of the 10.2 rescue system identical to the one in the 10.2 basesystem?

If so, that would explain the different minors on my home system, because I do call boot.md there too (also in my single-user maintenance script, which mounts / rw, starts LVM, etc.).
https://bugzilla.novell.com/show_bug.cgi?id=230733

------- Comment #10 from mmarek@novell.com 2007-01-09 09:26 MST -------
Created an attachment (id=112035)
 --> (https://bugzilla.novell.com/attachment.cgi?id=112035&action=view)
boot.md

(In reply to comment #9)
> Is boot.md of the 10.2 rescue system identical to the one in the 10.2 basesystem?
Yes, they are the same. Could you try the attached boot.md script? It tries 'raidautorun' (as it was done in 10.0 and older) before 'mdrun', so it will let the kernel autodetect the arrays, which should honour the preferred minor field in the devices' superblocks.
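The ordering described in comment #10 could look roughly like the sketch below. This is not the attached boot.md: the function name assemble_with_fallback is invented, both commands are passed in as arguments because their install paths may differ, and the rule "fall back to the second command only if no array is active afterwards" is an assumption made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of "try kernel autodetection first, then scan".
# $1 = autodetect command (e.g. raidautorun)
# $2 = fallback command   (e.g. /sbin/mdrun)
# $3 = mdstat file to inspect (defaults to /proc/mdstat)
assemble_with_fallback() {
    autodetect=$1
    fallback=$2
    mdstat=${3:-/proc/mdstat}

    # Let the kernel autodetect arrays; ignore a failing exit status here.
    "$autodetect" || true

    # Assumption: if any array is active now, autodetection was enough.
    if grep -q '^md[0-9]* : active' "$mdstat" 2>/dev/null; then
        return 0
    fi

    # Otherwise fall back to the scanning script.
    "$fallback"
}
```

In a boot script this might be invoked as assemble_with_fallback raidautorun /sbin/mdrun, provided both commands exist on the system.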
https://bugzilla.novell.com/show_bug.cgi?id=230733

mmarek@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |walter.haidinger@gmx.at
https://bugzilla.novell.com/show_bug.cgi?id=230733

------- Comment #11 from walter.haidinger@gmx.at 2007-01-09 09:45 MST -------
I will. Running raidautorun in advance should fix the problem, though (I think).
https://bugzilla.novell.com/show_bug.cgi?id=230733

walter.haidinger@gmx.at changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|walter.haidinger@gmx.at     |

------- Comment #12 from walter.haidinger@gmx.at 2007-01-09 13:12 MST -------
The new script works. Booting with the 10.2 rescue system gave me md0 to md6. After manually stopping the arrays, I called the script from comment #10, and it correctly assembled md0, md1 and md10 to md14.
https://bugzilla.novell.com/show_bug.cgi?id=230733

mmarek@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

------- Comment #13 from mmarek@novell.com 2007-01-10 04:09 MST -------
Ok, thanks. I submitted the fixed boot.md to Factory, so I'm marking this bug as FIXED. If you still find some of the issues mentioned in this bug in Factory or 10.3 alphas, then please REOPEN.
https://bugzilla.novell.com/show_bug.cgi?id=230733

------- Comment #14 from lmb@novell.com 2007-01-10 05:10 MST -------
I thought raidautorun was deprecated?
https://bugzilla.novell.com/show_bug.cgi?id=230733

------- Comment #15 from mmarek@novell.com 2007-01-10 05:20 MST -------
(In reply to comment #14)
> I thought raidautorun was deprecated?
It was taken from the deprecated and dropped raidtools, yes, but it's only a few lines of C code that call one ioctl. What are the objections against it? Or is the kernel part behind the RAID_AUTORUN ioctl deprecated?
https://bugzilla.novell.com/show_bug.cgi?id=230733

------- Comment #16 from mmarek@novell.com 2007-01-17 08:39 MST -------
Lars?