[opensuse] boot after installation problem
Hi Folks,

I'm having a problem installing 10.3 x86_64 on a box with many raid disk controllers and disks. It has two disks mounted in the back that are supposed to be the system disks, configured as a raid1 mirror. Alas, this disk shows up after all the other disks, in this particular case as /dev/sde instead of /dev/sda.

A full install goes well until the first boot. The boot fails saying it can't find /dev/sde3 (the root partition). It eventually drops into a limited sh shell running out of ram.

This box worked ok when the system disk appeared as /dev/sdc. I wonder if there is some bug about booting from disks too far removed from /dev/sda?

I've got a call in to the manufacturer to see if there's a way to have the system disks show up as /dev/sda, but no word from them yet.

Anyone seen this behavior? The controllers are 3-Ware, the disks SATA, if that matters.

Thanks,
Lew Wolfgang

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
Lew Wolfgang wrote:
Hi Folks,
I'm having a problem installing 10.3 x86_64 on a box with many raid disk controllers and disks. It has two disks mounted in the back that are supposed to be the system disks, configured as a raid1 mirror. Alas, this disk shows up after all the other disks, in this particular case as /dev/sde instead of /dev/sda.
A full install goes well until the first boot. The boot fails saying it can't find /dev/sde3 (the root partition). It eventually drops into a limited sh shell running out of ram.
This box worked ok when the system disk appeared as /dev/sdc. I wonder if there is some bug about booting from disks too far removed from /dev/sda?
I've got a call in to the manufacturer to see if there's a way to have the system disks show up as /dev/sda, but no word from them yet.
Anyone seen this behavior? The controllers are 3-Ware, the disks SATA, if that matters.
I don't have an answer for you, nor could I find anything except that apparently the company has been bought out by a non-Linux-friendly company (AMCC) and there are performance complaints now showing up. Here's a reference: http://lxer.com/module/forums/t/26605/

Fred

--
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
Fred A. Miller wrote: [snip]
Anyone seen this behavior? The controllers are 3-Ware, the disks SATA, if that matters.
I don't have an answer for you, nor could I find anything except that apparently the company has been bought out by a non-Linux-friendly company (AMCC) and there are performance complaints now showing up. Here's a reference: http://lxer.com/module/forums/t/26605/
'Don't know if this is any help or not, but visit: http://www.howtoforge.com/software-raid1-grub-boot-debian-etch-p3

There might be some assistance....hope so.

Fred
On 2/6/08 11:58 PM, "Fred A. Miller" wrote:
Lew Wolfgang wrote:
Hi Folks,
I'm having a problem installing 10.3 x86_64 on a box with many raid disk controllers and disks. It has two disks mounted in the back that are supposed to be the system disks, configured as a raid1 mirror. Alas, this disk shows up after all the other disks, in this particular case as /dev/sde instead of /dev/sda.
A full install goes well until the first boot. The boot fails saying it can't find /dev/sde3 (the root partition). It eventually drops into a limited sh shell running out of ram.
This box worked ok when the system disk appeared as /dev/sdc. I wonder if there is some bug about booting from disks too far removed from /dev/sda?
I've got a call in to the manufacturer to see if there's a way to have the system disks show up as /dev/sda, but no word from them yet.
Anyone seen this behavior? The controllers are 3-Ware, the disks SATA, if that matters.
I don't have an answer for you, nor could I find anything except that apparently the company has been bought out by a non-Linux-friendly company (AMCC) and there are performance complaints now showing up. Here's a reference: http://lxer.com/module/forums/t/26605/
I'm running several dozen servers with 3ware cards, all of them running opensuse 10.1 to opensuse 10.3. The kernel modules for those cards have been updated consistently for their latest hardware, backwards compatibility for new cards has always worked, and they've made a point to keep all of their management tools (3dm2) up to date on Linux as well as Windows and other platforms. As far as vendors in the Linux world go, they're about as close to the top of the heap as anyone. Whatever Sander Marechal's complaints, they don't have any relation to my real-world experience.

That said: I've had problems like yours in the past.

At root, it has to do with something seemingly ridiculous: alphabetical order. When the installation system is probing for hardware, it does so by loading each kernel module and seeing whether any new hardware appears. (Or something else, the effect of which is identical to this.) Because the 3ware modules are loaded very early -- 3w-9xxx.ko is, after all, very early in alphabetical order -- 3ware disks almost always show up as /dev/sda.

However, this order does not hold when the initrd for the installed system is created. There's an explicit order that's created in that case. Within the initrd filesystem, at the root level, there's a file called "init". It's basically a shell script, and is created dynamically when the initrd is created at the end of the installation process. It includes a section that loads the various modules. Somehow the 3w_9xxx kernel module always ends up being loaded after others, especially the modules for onboard SATA controllers.

The only fix that I've found is to rebuild the initrd. Fortunately, it's pretty easy:
- Boot from a rescue disk;
- chroot to the installed system;
- edit /etc/sysconfig/kernel; for the parameter "INITRD_MODULES", make sure that "3w_9xxx" is listed before any other scsi/sata/ata/etc. modules;
- run "mkinitrd".

I do almost all of my installs via autoyast, and have had this problem off and on with builds from suse 9.1 through 10.3. At one time, I actually had a chroot-script (run after the installation of the system but before the first reboot, chrooted within the installed system) that would do this. For opensuse 10.2 I found that I didn't need it, but I'm about to start to set up the 10.3 autobuild for these servers, so who knows...

In my case, my machines almost all have Silicon Image controllers onboard, in addition to the 3ware cards. So the chroot script was something basically like this (I'll see if I can dig the exact script out of the subversion repository where I keep my autoyast configs tomorrow):

#!/bin/sh
## Alter the INITRD_MODULES line appropriately
sed -i "s/sata_sil 3w_9xxx/3w_9xxx sata_sil/g" /etc/sysconfig/kernel
## Rebuild initrd
mkinitrd
exit 0

FWIW, my personal opinion is that the bug lies in initrd/mkinitrd/the initrd creation process, and not the card; that is, the process of creating the initrd should be smarter about disk detection order, and force a load of the module needed for the disk on which the bootloader is installed before any others. You _might_ be able to duplicate the effect of this rebuild by specifying "insmod=3w_9xxx" on the kernel command line, in the grub boot menu, but I won't swear to that.

I know, I know, it's absurd. But those 3ware cards are reliable, very fast, and 3ware/AMCC's service has been outstanding (I've gotten advance-replace cards shipped same day, at no cost, via American Airlines Cargo when FedEx just wasn't fast enough.) I've entirely given up on the idea of using Promise or Adaptec RAID controller cards at this point because I've been so impressed with them. On the downside, documentation could be a lot better (I suspect that a search of my name and the term "3ware" on both this list and the suse-autoinstall list would turn up a few hits), but, well...once you've played with the cards a bit...

- Ian
Ian Marlier pecked at the keyboard and wrote:
On 2/6/08 11:58 PM, "Fred A. Miller" wrote:
Lew Wolfgang wrote:
Hi Folks,
I'm having a problem installing 10.3 x86_64 on a box with many raid disk controllers and disks. It has two disks mounted in the back that are supposed to be the system disks, configured as a raid1 mirror. Alas, this disk shows up after all the other disks, in this particular case as /dev/sde instead of /dev/sda.
A full install goes well until the first boot. The boot fails saying it can't find /dev/sde3 (the root partition). It eventually drops into a limited sh shell running out of ram.
This box worked ok when the system disk appeared as /dev/sdc. I wonder if there is some bug about booting from disks too far removed from /dev/sda?
I've got a call in to the manufacturer to see if there's a way to have the system disks show up as /dev/sda, but no word from them yet.
Anyone seen this behavior? The controllers are 3-Ware, the disks SATA, if that matters.

I don't have an answer for you, nor could I find anything except that apparently the company has been bought out by a non-Linux-friendly company (AMCC) and there are performance complaints now showing up. Here's a reference: http://lxer.com/module/forums/t/26605/
I'm running several dozen servers with 3ware cards, all of them running opensuse 10.1 to opensuse 10.3. The kernel modules for those cards have been updated consistently for their latest hardware, backwards compatibility for new cards has always worked, and they've made a point to keep all of their management tools (3dm2) up to date on Linux as well as Windows and other platforms.
As far as vendors in the Linux world go, they're about as close to the top of the heap as anyone. Whatever Sander Marechal's complaints, they don't have any relation to my real-world experience.
That said: I've had problems like yours in the past.
At root, it has to do with something seemingly ridiculous: alphabetical order. When the installation system is probing for hardware, it does so by loading each kernel module and seeing whether any new hardware appears. (Or something else, the effect of which is identical to this.) Because the 3ware modules are loaded very early -- 3w-9xxx.ko is, after all, very early in alphabetical order -- 3ware disks almost always show up as /dev/sda.
However, this order does not hold when the initrd for the installed system is created. There's an explicit order that's created in that case.
Within the initrd filesystem, at the root level, there's a file called "init". It's basically a shell script, and is created dynamically when the initrd is created at the end of the installation process. It includes a section that loads the various modules. Somehow the 3w_9xxx kernel module always ends up being loaded after others, especially the modules for onboard SATA controllers.
The only fix that I've found is to rebuild the initrd. Fortunately, it's pretty easy:
- Boot from a rescue disk;
- chroot to the installed system;
- edit /etc/sysconfig/kernel; for the parameter "INITRD_MODULES", make sure that "3w_9xxx" is listed before any other scsi/sata/ata/etc. modules;
- run "mkinitrd".
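The INITRD_MODULES edit in the third step can also be scripted. A minimal sketch of the reordering, run here against a scratch copy rather than the real /etc/sysconfig/kernel (the module list below is an example, not taken from any actual system):

```shell
# Work on a scratch copy; on the installed system the file is
# /etc/sysconfig/kernel (edit it from within the chroot).
cfg=$(mktemp)
echo 'INITRD_MODULES="sata_sil 3w_9xxx processor thermal fan jbd ext3"' > "$cfg"

# Move 3w_9xxx ahead of the onboard SATA module so it loads first:
sed -i 's/sata_sil 3w_9xxx/3w_9xxx sata_sil/' "$cfg"

cat "$cfg"   # INITRD_MODULES="3w_9xxx sata_sil processor thermal fan jbd ext3"
rm -f "$cfg"
```

After the edit, running mkinitrd inside the chroot rebuilds the initrd with the new module order.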
I do almost all of my installs via autoyast, and have had this problem off and on with builds from suse 9.1 through 10.3. At one time, I actually had a chroot-script (run after the installation of the system but before the first reboot, chrooted within the installed system) that would do this. For opensuse 10.2 I found that I didn't need it, but I'm about to start to set up the 10.3 autobuild for these servers, so who knows...
In my case, my machines almost all have Silicon Image controllers onboard, in addition to the 3ware cards. So the chroot script was something basically like this (I'll see if I can dig the exact script out of the subversion repository where I keep my autoyast configs tomorrow):
#!/bin/sh
## Alter the INITRD_MODULES line appropriately
sed -i "s/sata_sil 3w_9xxx/3w_9xxx sata_sil/g" /etc/sysconfig/kernel
## Rebuild initrd
mkinitrd
exit 0
FWIW, my personal opinion is that the bug lies in initrd/mkinitrd/the initrd creation process, and not the card; that is, the process of creating the initrd should be smarter about disk detection order, and force a load of the module needed for the disk on which the bootloader is installed before any others.
Have you filed a bug report with this info? Sounds very similar to problems many people are having: install order of disks is different than boot order of disks.

--
Ken Schneider
SuSE since Version 5.2, June 1998
Lew Wolfgang wrote:
Hi Folks,
I'm having a problem installing 10.3 x86_64 on a box with many raid disk controllers and disks. It has two disks mounted in the back that are supposed to be the system disks, configured as a raid1 mirror. Alas, this disk shows up after all the other disks, in this particular case as /dev/sde instead of /dev/sda.
In general, if you want the system disks mirrored in hardware RAID, then you should get a motherboard which has a RAID controller on it.
A full install goes well until the first boot. The boot fails saying it can't find /dev/sde3 (the root partition). It eventually drops into a limited sh shell running out of ram.
Not surprising. If the module for the RAID card isn't in the initrd, then you're not going to boot successfully.
This box worked ok when the system disk appeared as /dev/sdc. I wonder if there is some bug about booting from disks too far removed from /dev/sda?
Probably.
I've got a call in to the manufacturer to see if there's a way to have the system disks show up as /dev/sda, but no word from them yet.
At this point, I would say, re-arrange your disks, and use software RAID until you can find a motherboard which has built-in hardware raid for /dev/sda and /dev/sdb. It will eliminate a lot of headaches.
Anyone seen this behavior? The controllers are 3-Ware, the disks SATA, if that matters.
Lew Wolfgang wrote:
I'm having a problem installing 10.3 x86_64 on a box with many raid disk controllers and disks. It has two disks mounted in the back that are supposed to be the system disks, configured as a raid1 mirror. Alas, this disk shows up after all the other disks, in this particular case as /dev/sde instead of /dev/sda.
A full install goes well until the first boot. The boot fails saying it can't find /dev/sde3 (the root partition). It eventually drops into a limited sh shell running out of ram.
This box worked ok when the system disk appeared as /dev/sdc. I wonder if there is some bug about booting from disks too far removed from /dev/sda?
Hi Folks,

I've got some more to report on my first-boot failure on a fresh 10.3 install.

Someone (sorry, I couldn't find your email) suggested booting with "acpi=off irqpoll". This had no effect.

Then, I tried Ian Marlier's suggestion of editing /etc/sysconfig/kernel from a rescue boot. The mkinitrd invocation failed (yes, I chrooted). Adding 3w_xxxx and such to the kernel boot line also had no effect.

Soooo, recall that this is the first boot during the install procedure on the 10.3 DVD, right after all the packages have been loaded. There isn't too much I could have done up to this point to have fouled things up, so this really smells like a bug.

I've got three 8-port 3-ware controllers (24 SATA disks total) and one 2-port 3-ware controller for the system disk. The current configuration has five raid units, which are reported as sda through sde in the install expert partition screen. The 2-disk system raid appears as sde to the partitioner, and is configured with sde1 (boot), sde2 (swap), sde3 (/) and sde4 (/export/home). There doesn't seem to be any way for me to configure things so that the 2-disk raid shows up as sda, where it should be.

Now, here's some further info: after the failed boot it's possible to escape to a very limited shell. At this point the boot failed because it couldn't find sde3, the system root partition. From this limited shell I looked at /dev/sd* and found that sda1-4 exist, but only one partition on sde (sde1). So it looks like my system partition is now showing up as sda!!! WTFO?

/etc/fstab had the system partitions on sde, so I booted the rescue system again and edited fstab, changing sde to sda. Alas, this also didn't work.

Not being a grub expert, I'm now at a loss. Any suggestions?

Thanks,
Lew Wolfgang
Hi Folks,

(forgive the top-post, I think it appropriate in this case)

I figured out an acceptable solution for the wandering disk designations. The "expert" partition manager in the installation process has an option that I haven't used before that allows mounting of partitions by label or uuid. I've used labels in the past for this kind of server, but only after it was up and running. The partition manager allows configuration during install time.

The system disk device still changed from sde to sda, but it didn't matter. I created boot, swap, root and home partition labels, and the system automatically mounted the other raid arrays by uuid. I'm happy and up again.

Thanks for the suggestions.

Regards,
Lew Wolfgang

Lew Wolfgang wrote:
Lew Wolfgang wrote:
I'm having a problem installing 10.3 x86_64 on a box with many raid disk controllers and disks. It has two disks mounted in the back that are supposed to be the system disks, configured as a raid1 mirror. Alas, this disk shows up after all the other disks, in this particular case as /dev/sde instead of /dev/sda.
A full install goes well until the first boot. The boot fails saying it can't find /dev/sde3 (the root partition). It eventually drops into a limited sh shell running out of ram.
This box worked ok when the system disk appeared as /dev/sdc. I wonder if there is some bug about booting from disks too far removed from /dev/sda?
Hi Folks,
I've got some more to report on my first-boot failure on a fresh 10.3 install.
Someone (sorry, I couldn't find your email) suggested booting with "acpi=off irqpoll". This had no effect.
Then, I tried Ian Marlier's suggestion of editing /etc/sysconfig/kernel from a rescue boot. The mkinitrd invocation failed (yes, I chrooted). Adding 3w_xxxx and such to the kernel boot line also had no effect.
Soooo, recall that this is the first boot during the install procedure on the 10.3 DVD, right after all the packages have been loaded. There isn't too much I could have done up to this point to have fouled things up, so this really smells like a bug.
I've got three 8-port 3-ware controllers (24 SATA disks total) and one 2-port 3-ware controller for the system disk. The current configuration has five raid units, which are reported as sda through sde in the install expert partition screen. The 2-disk system raid appears as sde to the partitioner, and is configured with sde1 (boot), sde2 (swap), sde3 (/) and sde4 (/export/home). There doesn't seem to be any way for me to configure things so that the 2-disk raid shows up as sda, where it should be.
Now, here's some further info: after the failed boot it's possible to escape to a very limited shell. At this point the boot failed because it couldn't find sde3, the system root partition. From this limited shell I looked at /dev/sd* and found that sda1-4 exist, but only one partition on sde (sde1). So it looks like my system partition is now showing up as sda!!! WTFO? /etc/fstab had the system partitions on sde, so I booted the rescue system again and edited fstab, changing sde to sda. Alas, this also didn't work.
Not being a grub expert, I'm now at a loss. Any suggestions?
Thanks, Lew Wolfgang
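For anyone hitting the same problem, mounting by label in /etc/fstab looks roughly like the sketch below. The labels would be set beforehand (e.g. with e2label for ext2/ext3), and blkid reports filesystem UUIDs. All labels, device names, and mount points here are illustrative, not taken from Lew's actual setup:

```
# Set labels on the filesystems first (example commands, run once):
#   e2label /dev/sde1 BOOT
#   e2label /dev/sde3 ROOT
# Then /etc/fstab can refer to labels instead of kernel device names,
# which survive any reordering of /dev/sdX assignments:
LABEL=BOOT   /boot          ext3  defaults  1 2
LABEL=ROOT   /              ext3  defaults  1 1
LABEL=HOME   /export/home   ext3  defaults  1 2
```

The same idea works with UUID= entries for the data arrays, using the values blkid prints.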
On 02/08/2008 05:49 AM, Lew Wolfgang wrote:
I've got three three 8-port 3-ware controllers (24-SATA disks total) and one 2-port 3-ware controller for the system disk. The current configuration has five raid units, and are reported as sda through sde in the install expert partition screen. The 2-disk system raid appears as sde to the partitioner, and is configured with sde1 (boot) sde2 (swap) sde3 (/) and sde4 (/export/home). There doesn't seem to be any way for me to configure things so that the 2-disk raid shows up as sda, where it should be.
Now, here's some further info: After the failed boot it's possible to escape to a very limited shell. At this point the boot failed because it couldn't find sde3, the system root partition. From this limited shell I looked at /dev/sd* and found that sda1-4 exists, but only one partition on sde1. So it looks like my system partition is now showing up as sda!!! WTFO? /etc/fstab had the system partitions on sde, so I booted the rescue system again and edited fstab, changing sde to sda. Alas, this also didn't work.
Not being a grub expert, I'm now at a loss. Any suggestions?
Edit /boot/grub/menu.lst, and correct the root= parameter to point at the correct disk.

--
Joe Morris
Registered Linux user 231871 running openSUSE 10.3 x86_64
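For reference, a menu.lst boot entry looks roughly like this sketch (title, paths, and device names are examples); the root= on the kernel line is the parameter that has to match the real root partition:

```
# root (hdX,Y) is grub's own view of the boot disk/partition;
# root=/dev/... is what the kernel actually mounts as /.
# All values below are illustrative.
title openSUSE 10.3
    root (hd0,0)
    kernel /vmlinuz root=/dev/sda3 splash=silent showopts
    initrd /initrd
```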
participants (6)
- Aaron Kulkis
- Fred A. Miller
- Ian Marlier
- Joe Morris
- Ken Schneider
- Lew Wolfgang