[opensuse] mdadm - strange problem: After being moved, the system won't boot with std OpenSuse 10.2 kernel
After copying an openSUSE 10.2 system with software RAID (a disk mirror on two IDE hard disks) to a different set of hard disks, Linux can no longer boot from the new disks when I use the standard openSUSE 10.2 kernel.

The error message comes from MD, which says it cannot find any devices for my three RAID partitions, /dev/md0 (swap), /dev/md1 (/) and /dev/md2 (/home):

md: No device found for /dev/md0...

and the boot sequence halts.

The strange thing is that if I boot the system with a standard Linux 2.6.19.1 kernel, also on the hard disk (no module support), or if I connect the hard disk to MS Virtual PC, it boots just fine. The problem apparently has something to do with the standard openSUSE 10.2 kernel. If I boot the rescue system from the DVD, everything also looks fine (cat /proc/mdstat).

On the hard disk the system was originally copied from, it works perfectly.

Any suggestions?

Thanks in advance
/Bo
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
On Sunday 21 January 2007 09:45, Bo Jacobsen wrote:
<snip>
Sounds like grub doesn't know the new system. Have you stepped through the grub config while booted into the rescue system? Your previous grub setup may not work on the new disks. You may just need to have grub re-read the disk setup and install that.

Stan
I have done a:

  grub
  root (hd0,1)
  setup (hd0)

on the hard disk, so it should be OK. Besides, the 2.6.19.1 kernel on the system boots without any problems. Both kernels are defined in /boot/grub/menu.lst:

..
title Linux 2.6.19.1
# This always works, no matter the hardware.
  root (hd0,1)
  kernel /boot/boot.2.19 root=/dev/md1

title Default OpenSuSE 10.2
# Does not work, but if the hard disk is attached to MS Virtual PC then it works!?
  root (hd0,1)
  kernel /boot/vmlinuz-2.6.18.2-34-default root=/dev/md1
  initrd /boot/initrd-2.6.18.2-34-default

When booting the default SuSE kernel, the following is written to the screen:

Loading linear
md: linear personality registered for level1
Loading JBD
Loading mbcache
Loading ext3
md: MD1: No device found for /dev/md1
Waiting for device /dev/md1 to appear: OK
/dev/md1: Unknown volume type
Invalid boot file system - exiting to /bin/sh.
..

And then the boot stops.

The same hard disk attached to MS Virtual PC and booted with the same grub boot line boots as expected, and produces the following lines in boot.msg:

..
Loading linear
Loading mbcache
Loading ext3
mdadm: /dev/md0 has been started with 1 drive (out of 2).
mdadm: /dev/md1 has been started with 1 drive (out of 2).
...

/Bo
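One visible difference between the two menu entries above is that only the OpenSuSE entry loads an initrd, so with that kernel the RAID setup presumably depends on what the initrd contains. A minimal sketch for checking whether an initrd carries a given kernel module follows. It assumes the image is a gzip-compressed cpio archive, which may not hold for every image, and the helper name has_module is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not from the thread: succeeds if the file listing
# on stdin contains a kernel module with the given name (e.g. raid1).
has_module() {
    grep -q "/$1\.ko\$"
}

# Usage (ASSUMPTION: the initrd is a gzip-compressed cpio archive):
#   gzip -dc /boot/initrd-2.6.18.2-34-default | cpio -it | has_module raid1 \
#       && echo "raid1 module is in the initrd"
```

Reading the listing from stdin keeps the helper independent of how the image is unpacked, so a different unpacking command can be swapped in if the image turns out to use another format.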
On Sunday 21 January 2007 17:02, Bo Jacobsen wrote:
<snip>
Since the data is still intact and you have two different ways of booting this RAID array, I think it's something very basic that you are overlooking. Been there, done that myself. Are you sure of the grub parameters on the RAID array? You're showing us selected lines of /boot/grub/menu.lst and not the whole thing, so we can't spot any potential errors or differences. Is there an initrd line for the boot-2.19 kernel? What other parameters are we not seeing?

One nit to pick also. Why put /swap on the RAID array? I'd advocate having two separate non-RAID /swap files on the bare metal. But then, that's just me. /swap is slow enough compared to main memory, and on a mirror you also write that transient data a second time to another drive and slow the rest of the system down again. Sure, modern systems are fast, but I look at it as unneeded wear and tear.

Stan
S Glasoe wrote: ...
One nit to pick also. Why put /swap on the RAID array? I'd advocate having two separate non-RAID /swap files on the bare metal. But then, that's just me. /swap is slow enough compared to main memory but then you also want to write that transient data again to another drive and slow the rest of the system down, again? Sure modern systems are fast but I look at it as unneeded wear and tear.
Well... on a side note and it doesn't help the OP but... it does make
sense to put swap on RAID.
Say a disk dies. Yay, keeps on running because you have the 2nd disk in
the array (let's suppose RAID1).
But if you have something in swap your system will crash when it loads
the swapped-out pages into memory again, because the swap isn't on RAID
(supposing the "dead disk" is also full of bad blocks).
Having swap on RAID means that the system actually keeps on running when
a disk dies as the swapped out pages will be read from the clean disk in
the degraded array.
cheers
--
Pascal Bleser http://linux01.gwdg.de/~pbleser/
On Sunday 21 January 2007 19:25, Pascal Bleser wrote:
<snip>
Excellent point. Thanks,
Stan
greetz

this is what I've been doing with grub + mdadm:

- install the system with the normal YaST installation, creating the RAID partitions etc.
- after it has finished installing, reboot with Knoppix
- in Knoppix, mount the first partition (/boot) to /mnt and cd /mnt
- there, reconfigure with grub, entering
    root (hd0,0)
    setup (hd0)
    root (hd1,0)
    setup (hd1)
- after that, unmount /mnt
- reboot the system; SUSE boots fine after that without problems

the problem you are most probably facing is that the device names have changed, so mdadm and grub no longer recognize the partitions to boot from. so start with that: boot with Knoppix and see which device names the drives show up under, then make sure that the mdadm arrays and grub match; if they don't, correct them.

good luck
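The device-name check suggested above can be sketched as a small shell helper that lists which partitions each md array is currently built from, so they can be compared with what grub and the mdadm configuration expect. This is only a sketch: it assumes /proc/mdstat has its usual layout, and the function name md_members is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not from the thread: print each md array followed by
# its member partitions, read from an mdstat-style file.
# ASSUMPTION: array lines look like "md1 : active raid1 hda2[0] hdb2[1]".
md_members() {
    awk '/^md[0-9]+ :/ {
        printf "%s:", $1
        for (i = 3; i <= NF; i++) {
            if ($i ~ /\[[0-9]+\]/) {      # members carry a slot number, e.g. hda2[0]
                sub(/\[.*/, "", $i)       # strip the slot number and any flags
                printf " %s", $i
            }
        }
        print ""
    }' "${1:-/proc/mdstat}"
}

# Usage from a live system such as Knoppix:
#   md_members                  # members as the running kernel sees them
#   mdadm --examine --scan      # arrays as recorded in the RAID superblocks
```

If the member names printed here differ from the ones the installed system was set up with, that points to the renamed-device situation described above.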
On 1/22/07, S Glasoe wrote:
<snip>
I spent a lot of time on this problem, and after a while a pattern emerged. The problem with booting the default SuSE kernel on RAID partitions seemed to happen whenever the hard disk was moved to a machine that saw the disk's CHS geometry with different values.

As the problem happened before the real kernel was booted, it could not be the SuSE kernel itself, but rather the initrd. I generated a new initrd, and the SuSE default kernel now starts without any problems at all.

Thanks for your input.
/Bo
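The thread does not say how the new initrd was generated. On openSUSE the usual tool is mkinitrd, so one possible way, sketched here under assumptions, is to chroot into the installed system from the rescue environment and run it there. The sketch assumes the root array (/dev/md1 in this thread) is already assembled, as the rescue system reportedly showed, and that /mnt is free to use. The function name rebuild_initrd_cmds is hypothetical, and it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper, not from the thread: print the commands that would
# rebuild the initrd of an installed system from a rescue environment.
# ASSUMPTIONS: the root array is already assembled, /boot lives on the root
# filesystem, and mkinitrd is the distribution's initrd tool.
rebuild_initrd_cmds() {
    root_dev=$1   # root array of the installed system, e.g. /dev/md1
    target=$2     # mount point in the rescue system, e.g. /mnt
    cat <<EOF
mount $root_dev $target
mount --bind /dev $target/dev
mount -t proc proc $target/proc
mount -t sysfs sysfs $target/sys
chroot $target mkinitrd
EOF
}

# Usage: review the output first, then run it as root if it looks right:
#   rebuild_initrd_cmds /dev/md1 /mnt
#   rebuild_initrd_cmds /dev/md1 /mnt | sh -e
```

Printing instead of executing is a deliberate choice here: the device names are exactly what changed between machines in this thread, so they are worth checking before anything is mounted.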
participants (4): BashLogic Linux, Bo Jacobsen, Pascal Bleser, S Glasoe