Any ideas on where and how I should go about to fix a boot problem? When starting up, the only thing I get in a fresh SuSE 9.2 install is "GRUB stage 2", and there it hangs. Every time. Anders.
On Friday 03 December 2004 1:33 pm, Anders Norrbring wrote:
Any ideas on where and how I should go about to fix a boot problem? When starting up, the only thing I get in a fresh SuSE 9.2 install is "GRUB stage 2", and there it hangs. Every time.
Anders.
One option is to boot from the DVD or CD1 and start the rescue system. You can redo grub from there. IIRC grub stage 2 is when it's reading the /boot and /boot/grub directories in the main partition of the device mentioned in the grub stage 1 in the MBR. The device map (which stage 1 reads to figure out where stage 2 is located) for grub may be wrong in that it points to a disc/partition that is no longer valid...
HTH or gives a hint.
Stan
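For anyone wondering what "redo grub" could look like from the rescue system, here is a minimal sketch, not a tested recipe. The helper name grub_reinstall_script is made up for illustration, and both arguments are placeholders you have to fill in for your own disk layout. It only prints a batch script using the GRUB (legacy) shell commands "root", "setup" and "quit"; it does not touch the disk.

```shell
#!/bin/sh
# Hypothetical helper: print a GRUB (legacy) batch script that would
# reinstall stage 1 into an MBR. Nothing is written to disk here.
#   $1 = GRUB name of the partition holding /boot/grub (placeholder)
#   $2 = GRUB name of the drive whose MBR should get stage 1 (placeholder)
grub_reinstall_script() {
    printf 'root %s\nsetup %s\nquit\n' "$1" "$2"
}
```

After checking the printed commands, they could be fed to the GRUB shell with something like: grub_reinstall_script '(hd0,1)' '(hd0)' | grub --batch (assuming the grub binary is available in the rescue environment, and that those two names are right for the machine).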
Anders Norrbring wrote:
Any ideas on where and how I should go about to fix a boot problem? When starting up, the only thing I get in a fresh SuSE 9.2 install is "GRUB stage 2", and there it hangs. Every time.
That message comes from /boot/grub/stage2, so grub seems to be properly installed (unless the stage2 file is corrupted, that is). Grub stage 2 is responsible for loading the ramdisk (/boot/initrd) and kernel image, then passing control to the kernel to start the actual boot process, so unless /boot/grub/stage2 is corrupt, there is something wrong with the initrd or the kernel image you are trying to use.
First step is to see how grub was configured during the install. For this, you'll need to boot a rescue system and mount the root of the installed system at some convenient place; /mnt is always a good choice. (If /mnt does not exist in the rescue system, run "mkdir /mnt" first.) So, if you installed the system on hda1, you'll need to run "mount /dev/hda1 /mnt". If you installed /boot on its own partition (say /dev/hda2), you'll need to mount that also: "mount /dev/hda2 /mnt/boot". (If you installed to a SCSI disk, use /dev/sda instead of /dev/hda.) If you installed /usr onto -its- own partition, please mount that as well at /mnt/usr.
Now please run the following commands and post the results here:
ls -l /mnt/boot
ls -l /mnt/boot/grub
ls -l /mnt/usr/lib/grub
cat /mnt/boot/grub/device.map
cat /mnt/boot/grub/menu.lst
cat /mnt/etc/fstab
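If typing six commands in a rescue shell is a nuisance, the same information can be gathered in one go. This is only a sketch: the function name collect_boot_info is invented here, and it assumes the installed system is already mounted as described above (default mount point /mnt).

```shell
#!/bin/sh
# Hypothetical helper: print the listings and files requested above,
# each preceded by a "###" header naming the command that produced it.
#   $1 = mount point of the installed root filesystem (default: /mnt)
collect_boot_info() {
    root=${1:-/mnt}
    for dir in boot boot/grub usr/lib/grub; do
        echo "### ls -l $root/$dir"
        ls -l "$root/$dir" 2>&1 || echo "(could not list $root/$dir)"
    done
    for file in boot/grub/device.map boot/grub/menu.lst etc/fstab; do
        echo "### cat $root/$file"
        cat "$root/$file" 2>&1 || echo "(could not read $root/$file)"
    done
}
```

Running "collect_boot_info /mnt > /tmp/bootinfo.txt" would leave everything in one file that can be copied off the machine and pasted into a reply.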
(I know a few folks are going to say this is just a bunch of overkill. Maybe it will be, but it might also turn out that we will need all that information to resolve this. Best to have it all right away. Anyway, I would like to learn exactly why this crapped out on you, if I can.)
I was going to suggest a few things that you might try at this point, but when I think about it, it's probably useless right now to try to speculate. On my own systems, I can speculate all I want, but here, let's just get you up and running with a minimum of effort.
Thanks Darryl,
Here is everything you asked for, in the order you listed above. I myself suspect the initrd: I have a Compaq Smart Array 2DH installed, but no drives attached to it (yet), and the system is installed onto the IDE drive. It's just a feeling; I'll pull the Compaq controller and reinstall to see what happens.
Anyway, here's the listing:

-rw-r--r-- 1 root root 739535 Oct 6 12:17 System.map-2.6.8-24-default
-rw-r--r-- 1 root root 472390 Oct 6 11:29 System.map-2.6.8-24-um
-rw-r--r-- 1 root root 512 Dec 2 20:54 backup_mbr
lrwxrwxrwx 1 root root 1 Dec 2 20:26 boot -> .
-rw-r--r-- 1 root root 57527 Oct 6 12:30 config-2.6.8-24-default
-rw-r--r-- 1 root root 18071 Oct 6 11:31 config-2.6.8-24-um
drwxr-xr-x 2 root root 480 Dec 2 20:54 grub
lrwxrwxrwx 1 root root 23 Dec 2 20:54 initrd -> initrd-2.6.8-24-default
-rw-r--r-- 1 root root 1224085 Dec 2 20:54 initrd-2.6.8-24-default
-rwxr-xr-x 1 root root 2954844 Oct 6 11:31 linux-2.6.8-24-um
-rw-r--r-- 1 root root 67648 Oct 2 01:20 memtest.bin
-rw-r--r-- 1 root root 94720 Dec 2 20:54 message
-rw-r--r-- 1 root root 79149 Oct 6 12:31 symvers-2.6.8-24-i386-default.gz
-rw-r--r-- 1 root root 20 Oct 6 11:31 symvers-2.6.8-24-um-um.gz
-rw-r--r-- 1 root root 1855692 Oct 6 12:30 vmlinux-2.6.8-24-default.gz
lrwxrwxrwx 1 root root 24 Dec 2 20:39 vmlinuz -> vmlinuz-2.6.8-24-default
-rw-r--r-- 1 root root 1556001 Oct 6 12:17 vmlinuz-2.6.8-24-default

-rw-r--r-- 1 root root 30 Dec 2 20:54 device.map
-rw-r--r-- 1 root root 8052 Oct 6 10:11 e2fs_stage1_5
-rw-r--r-- 1 root root 7876 Oct 6 10:11 fat_stage1_5
-rw-r--r-- 1 root root 7124 Oct 6 10:11 ffs_stage1_5
-rw-r--r-- 1 root root 7188 Oct 6 10:11 iso9660_stage1_5
-rw-r--r-- 1 root root 8576 Oct 6 10:11 jfs_stage1_5
-rw------- 1 root root 951 Dec 2 20:54 menu.lst
-rw-r--r-- 1 root root 7316 Oct 6 10:11 minix_stage1_5
-rw-r--r-- 1 root root 9588 Oct 6 10:11 reiserfs_stage1_5
-rw-r--r-- 1 root root 512 Oct 6 10:11 stage1
-rw-r--r-- 1 root root 106994 Dec 2 20:55 stage2
-rw-r--r-- 1 root root 7400 Oct 6 10:11 ufs2_stage1_5
-rw-r--r-- 1 root root 6740 Oct 6 10:11 vstafs_stage1_5
-rw-r--r-- 1 root root 9404 Oct 6 10:11 xfs_stage1_5

-rw-r--r-- 1 root root 8052 Oct 6 10:11 e2fs_stage1_5
-rw-r--r-- 1 root root 7876 Oct 6 10:11 fat_stage1_5
-rw-r--r-- 1 root root 7124 Oct 6 10:11 ffs_stage1_5
-rw-r--r-- 1 root root 7188 Oct 6 10:11 iso9660_stage1_5
-rw-r--r-- 1 root root 8576 Oct 6 10:11 jfs_stage1_5
-rw-r--r-- 1 root root 7316 Oct 6 10:11 minix_stage1_5
-rw-r--r-- 1 root root 189700 Oct 6 10:11 nbgrub
-rw-r--r-- 1 root root 190724 Oct 6 10:11 pxegrub
-rw-r--r-- 1 root root 9588 Oct 6 10:11 reiserfs_stage1_5
-rw-r--r-- 1 root root 512 Oct 6 10:11 stage1
-rw-r--r-- 1 root root 106994 Oct 6 10:11 stage2
-rw-r--r-- 1 root root 189700 Oct 6 10:11 stage2.netboot
-rw-r--r-- 1 root root 106994 Oct 6 10:11 stage2_eltorito
-rw-r--r-- 1 root root 7400 Oct 6 10:11 ufs2_stage1_5
-rw-r--r-- 1 root root 6740 Oct 6 10:11 vstafs_stage1_5
-rw-r--r-- 1 root root 9404 Oct 6 10:11 xfs_stage1_5

(hd0) /dev/hda
(fd0) /dev/fd0

# Modified by YaST2. Last modification on Thu Dec 2 20:54:58 2004

color white/blue black/light-gray
default 0
timeout 8
gfxmenu (hd0,1)/boot/message

###Don't change this comment - YaST2 identifier: Original name: linux###
title SUSE LINUX 9.2
    kernel (hd0,1)/boot/vmlinuz root=/dev/hda2 vga=0x31a selinux=0 splash=silent resume=/dev/hda1 desktop elevator=as showopts
    initrd (hd0,1)/boot/initrd

###Don't change this comment - YaST2 identifier: Original name: floppy###
title Floppy
    root (fd0)
    chainloader +1

###Don't change this comment - YaST2 identifier: Original name: failsafe###
title Failsafe -- SUSE LINUX 9.2
    kernel (hd0,1)/boot/vmlinuz root=/dev/hda2 showopts ide=nodma apm=off acpi=off vga=normal noresume selinux=0 barrier=off nosmp noapic maxcpus=0 3
    initrd (hd0,1)/boot/initrd

###Don't change this comment - YaST2 identifier: Original name: memtest86###
title Memory Test
    kernel (hd0,1)/boot/memtest.bin

/dev/hda2 / reiserfs acl,user_xattr 1 1
/dev/hda1 swap swap pri=42 0 0
devpts /dev/pts devpts mode=0620,gid=5 0 0
proc /proc proc defaults 0 0
usbfs /proc/bus/usb usbfs noauto 0 0
sysfs /sys sysfs noauto 0 0
/dev/cdrom /media/cdrom subfs fs=cdfss,ro,procuid,nosuid,nodev,exec,iocharset=utf8 0 0
/dev/fd0 /media/floppy subfs fs=floppyfss,procuid,nodev,nosuid,sync 0 0
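A note for anyone cross-checking the menu.lst against the fstab: GRUB (legacy) counts partitions from zero, so (hd0,1) combined with the device.map entry for (hd0) should name the same partition as root=/dev/hda2. The sketch below does that translation. The function name grub_to_linux is invented for illustration, and it assumes plain IDE/SCSI style device names where the partition number is simply appended to the disk name.

```shell
#!/bin/sh
# Hypothetical helper: translate a GRUB (legacy) partition name into the
# Linux device name, using a device.map file.
#   $1 = GRUB partition, e.g. "(hd0,1)"
#   $2 = path to device.map
grub_to_linux() {
    # "(hd0,1)" -> "(hd0)"
    drive=$(echo "$1" | sed 's/^(\([a-z]*[0-9]*\),[0-9]*)$/(\1)/')
    # "(hd0,1)" -> "1"
    part=$(echo "$1" | sed 's/^([a-z]*[0-9]*,\([0-9]*\))$/\1/')
    # Look up the Linux disk that device.map assigns to the GRUB drive.
    disk=$(awk -v d="$drive" '$1 == d { print $2 }' "$2")
    # GRUB partitions start at 0, Linux partition numbers at 1.
    echo "$disk$((part + 1))"
}
```

With the device.map posted above, grub_to_linux '(hd0,1)' /mnt/boot/grub/device.map should give /dev/hda2, which agrees with both the root= parameter and the fstab entry for /.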
[8<]
Thanks Darryl,
Here is everything you asked for, in the order you listed above. I myself suspect the initrd: I have a Compaq Smart Array 2DH installed, but no drives attached to it (yet), and the system is installed onto the IDE drive. It's just a feeling; I'll pull the Compaq controller and reinstall to see what happens.
Which didn't do any good at all. Same problem. I ran PowerQuest's PartitionMagic on the disk; it complains that it can't read the partition table, so perhaps the drive is beyond life itself. I'll try another.
Anders Norrbring wrote:
[8<]
Thanks Darryl,
Here is everything you asked for, in the order you listed above. I myself suspect the initrd: I have a Compaq Smart Array 2DH installed, but no drives attached to it (yet), and the system is installed onto the IDE drive. It's just a feeling; I'll pull the Compaq controller and reinstall to see what happens.
Which didn't do any good at all. Same problem. I ran PowerQuest's PartitionMagic on the disk; it complains that it can't read the partition table, so perhaps the drive is beyond life itself. I'll try another.
No need to cc: me on your posts; I have a filter set up to send email with a subject containing [SLE] into my folder for this list, so I just get two copies of your emails there :)
Boot into the rescue system on the install media and run: "fdisk -l /dev/hda". If it really is a partition table problem, you'll get the same error. I will be very surprised if you do, because you have already mounted /dev/hda2 and read data from it, when you got the information I requested.
All that information does look OK, but that is based on just a quick look at it. I'll look at it again later today; meanwhile, you'll probably get a few suggestions of things to try. "mkinitrd" is certainly going to be one of them.
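As a complement to fdisk (this is an extra check, not something suggested above): a valid MBR ends with the two signature bytes 0x55 0xAA at offsets 510 and 511 of the first sector. The sketch below reads those two bytes from a device or from a saved copy of the sector, such as the 512-byte backup_mbr file in the /boot listing. The function names are made up for illustration, and a correct signature only says the sector looks like an MBR, not that the partition entries in it are sane.

```shell
#!/bin/sh
# Hypothetical helpers: inspect the boot signature of an MBR.
#   $1 = block device (e.g. /dev/hda) or a file holding the first sector

# Print bytes 510 and 511 of the first sector as hex, e.g. "55aa".
mbr_signature() {
    dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# Succeed (exit status 0) only if the signature is the expected 55 aa.
has_mbr_signature() {
    [ "$(mbr_signature "$1")" = "55aa" ]
}
```

From the rescue system this could be used as: has_mbr_signature /dev/hda && echo "signature present".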
Anders Norrbring wrote:
can't read the partition table, so perhaps the drive is beyond life itself, I'll try another.
You can boot to a rescue system, and run this:
badblocks -svnb 512 /dev/hda
This will perform a non-destructive read/write test with random data on the drive, one sector at a time. If there is anything wrong with the MBR or partition table, you should see IO errors in the first 63 sectors of the drive. Also run the same command, specifying /dev/hda2 (your root partition) as the target device, to see if there are any defects in it. When performing the tests, you should ensure as few other tasks as possible are using system resources.
If you want a very rigorous test of the drive, you can also specify "-t 0xaa -t 0x55 -t 0xff -t 0x00" on the command line. That will perform the test 4 times, once for each of these test patterns. Using all of these test patterns is supposed to be the only really sure way of finding marginal sectors (or so I have been told), but be warned that, on a very large drive, just running the test once will take quite some time. Ctrl-C whenever you tire of letting it run :)
If you want a written record of any bad sectors found, use the "-o output-file" option.
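To avoid typos when adding the extra test patterns, the command line can be assembled by a small helper. This is just a sketch: badblocks_cmd is an invented name, and it only prints the command so it can be checked before being run by hand. The options used (-s, -v, -n, -b, -t) are the ones described above.

```shell
#!/bin/sh
# Hypothetical helper: print a non-destructive badblocks command line.
#   $1        = target device, e.g. /dev/hda or /dev/hda2
#   $2 ...    = optional test patterns, each passed with its own -t
badblocks_cmd() {
    dev=$1
    shift
    cmd="badblocks -svn -b 512"
    for pattern in "$@"; do
        cmd="$cmd -t $pattern"
    done
    echo "$cmd $dev"
}
```

For the rigorous variant, "badblocks_cmd /dev/hda 0xaa 0x55 0xff 0x00" prints the full command with all four patterns; nothing touches the disk until that printed line is actually executed.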
Anders Norrbring wrote:
Any ideas on where and how I should go about to fix a boot problem? When starting up, the only thing I get in a fresh SuSE 9.2 install is "GRUB stage 2", and there it hangs. Every time.
Anders, would that drive be > 120GB? If it is, are you using a 40 or 80 wire IDE ribbon with it? (You must use the 80 wire ribbon for drives > 120 GB)
Anders Norrbring wrote:
Any ideas on where and how I should go about to fix a boot problem? When starting up, the only thing I get in a fresh SuSE 9.2 install is "GRUB stage 2", and there it hangs. Every time.
Anders, would that drive be > 120GB? If it is, are you using a 40 or 80 wire IDE ribbon with it? (You must use the 80 wire ribbon for drives > 120 GB)
*smile* Technically speaking, the 40/80 wire ribbon doesn't relate to drive capacity but to bus speed. For ATA/33 you can run 40 wires, but for ATA/66 and faster you need the 80-wire.
And no, that drive is an old ATA/33 drive. The problem doesn't exist on another motherboard/CPU combination, so I guess there's something wrong with the ATA controllers or BIOS on the first machine. Every other piece of hardware is the same: video, SCSI Smart Array, disks, floppy, CD/DVD and cables.
I'll just toss the faulty stuff into the dumpster and go on with my life...
Thanks for caring!
Anders.
On Monday 06 December 2004 10:23, Anders Norrbring wrote:
And no, that drive is an old ATA/33 drive. The problem doesn't exist on another motherboard/CPU combination, so I guess there's something wrong with the ATA controllers or BIOS on the first machine. Every other piece of hardware is the same, video, SCSI Smart Array, disks, floppy, CD/DVD and cables.
Anders, just a thought, you mentioned it being an old drive - is it an old machine too? I've had some mighty weird stuff happen with IDE drives on the Pentium-I class SiS motherboard chipsets.
--
Kind regards
Hans du Plooy
Newington Consulting Services
hansdp at newingtoncs dot co dot za
On Monday 06 December 2004 10:23, Anders Norrbring wrote:
And no, that drive is an old ATA/33 drive. The problem doesn't exist on another motherboard/CPU combination, so I guess there's something wrong with the ATA controllers or BIOS on the first machine. Every other piece of hardware is the same, video, SCSI Smart Array, disks, floppy, CD/DVD and cables.
Anders, just a thought, you mentioned it being an old drive - is it an old machine too? I've had some mighty weird stuff happen with IDE drives on the Pentium-I class SiS motherboard chipsets.
As a matter of fact, Hans, it is.. It's an old Compaq Presario with Intel 440BX chipset and a Pentium-III 500MHz CPU, 512MB memory. The weird thing is that I used this same machine while beta testing 9.2 and never saw this problem. I don't know how SuSE compiles their betas, but a factor could be optimized code in the released version... Anyway, it's just a lab machine, I have others... :) Anders.
On Monday 06 December 2004 10:40, Anders Norrbring wrote:
As a matter of fact, Hans, it is.. It's an old Compaq Presario with Intel 440BX chipset and a Pentium-III 500MHz CPU, 512MB memory.
Your idea of "old" and mine are worlds apart! :-) My workstation is something like this - "test" machines tend to be early P-IIs. A company close by sells "refurbished ex-corporate" desktops. They're mostly Pentium II, but good machines - compaq desk pros and similar level HP machines. Often they have dual graphics cards, sometimes dual cpus, and mostly SCSI discs.
I find another problem from time to time. My work machine has two SCSI discs and one IDE. SUSE is installed on the two SCSIs, the IDE is my /home. In 9.1 I had to remove the IDE disc to be able to install. In 9.2 I was able to install, but after doing the kernel update in YOU, grub can't find anything. It just says something to the effect of "can't find (hd0,0)/boot/messages".
--
Kind regards
Hans du Plooy
Newington Consulting Services
hansdp at newingtoncs dot co dot za
On Monday 06 December 2004 10:40, Anders Norrbring wrote:
As a matter of fact, Hans, it is.. It's an old Compaq Presario with Intel 440BX chipset and a Pentium-III 500MHz CPU, 512MB memory.
Your idea of "old" and mine are worlds apart! :-) My workstation is something like this - "test" machines tend to be early P-IIs. A company close by sells "refurbished ex-corporate" desktops. They're mostly Pentium II, but good machines - compaq desk pros and similar level HP machines. Often they have dual graphics cards, sometimes dual cpus, and mostly SCSI discs.
My "oldest" production platform is an Athlon XP 2000+, older than that are classified as test boxes.. ;)
I find another problem from time to time. My work machine has two SCSI discs and one IDE. SUSE is installed on the two SCSIs, the IDE is my /home. In 9.1 I had to remove the IDE disc to be able to install. In 9.2 I was able to install, but after doing the kernel update in YOU, grub can't find anything. It just says something to the effect of "can't find (hd0,0)/boot/messages".
I've had similar issues from time to time. Installing on SCSI and then adding IDE renders the system either unstable or unbootable. After drinking several gallons of single malt whiskey I came up with a solution that's worked for me: it's usually the system BIOS boot order you need to look at.
Most systems will allow you to set, for example, 1-Floppy, 2-CDROM, 3-Hard drive. Not enough! You'll also need to go into the BIOS section where the "Hard drives" are specified. Every default BIOS setting I've seen lists all IDE hard disks first, and then the "secondary stuff" like "Bootable add-in card" (yep, that's your SCSI). I've simply changed the internal hard drive order so that SCSI (Bootable add-in card) is the first hard drive in the boot order, followed by the IDE drives.
Worked for me...
Anders.
On Monday 06 December 2004 16:22, Anders Norrbring wrote:
YOU, grub can't find anything. It just says something to the effect of "can't find (hd0,0)/boot/messages".
Most systems will allow you to set for example 1-Floppy, 2-CDROM, 3-Hard drive.
Yeah, mine is on SCSI. It works peachy, until *after* doing Yast updates. This happened to me on a Dell PowerEdge server with SATA raid-5 in it too. Worked great until I tried updating the kernel - then everything broke.
--
Kind regards
Hans du Plooy
Newington Consulting Services
hansdp at newingtoncs dot co dot za
Hans du Plooy wrote:
On Monday 06 December 2004 10:40, Anders Norrbring wrote:
As a matter of fact, Hans, it is.. It's an old Compaq Presario with Intel 440BX chipset and a Pentium-III 500MHz CPU, 512MB memory.
Your idea of "old" and mine are worlds apart! :-) My workstation is something like this - "test" machines tend to be early P-IIs.
Mine too -- I have a K6-2/400, and that's it. :) Switched from a Pentium 100 when I installed 9.0; same main board, though, of the same vintage as Anders's Presario :).
install, but after doing the kernel update in YOU, grub can't find anything. It just says something to the effect of "can't find (hd0,0)/boot/messages".
I read in grub.info that sometimes grub doesn't get the drive mappings right. The particular situation that is mentioned is yours, where the drive order is remapped in the BIOS to allow booting from a SCSI disk. So, you might try mounting the SCSI disk you boot from in the rescue system, and checking <mountpoint>/boot/grub/device.map to see if the boot SCSI disk is still listed as being mapped to (hd0). If not, just edit device.map so that it is; I imagine (but please do not quote me :) ) that the second SCSI *should* be (hd1), and the IDE (hd2).
The grub stuff in the MBR should not have been changed when you updated the kernel, but if the device map is the problem, I would suggest doing a grub-install with the corrected device.map anyway. Easiest route seems to be "chroot <mountpoint>" followed by a simple "grub-install". If /boot is installed on its own partition, I do believe that you need to use the "root-directory" option, eg if you did a chroot, then "grub-install --root-directory=/boot (hd0)".
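The device.map check described above can be scripted roughly like this. It is a sketch only: check_boot_drive is an invented name, the expected device (for example /dev/sda for the first SCSI disk) is something you supply yourself, and the script only reports; it does not edit device.map or run grub-install.

```shell
#!/bin/sh
# Hypothetical helper: verify which Linux disk device.map assigns to (hd0).
#   $1 = path to device.map, e.g. <mountpoint>/boot/grub/device.map
#   $2 = disk you expect to boot from (an assumption supplied by the caller)
check_boot_drive() {
    actual=$(awk '$1 == "(hd0)" { print $2 }' "$1")
    if [ "$actual" = "$2" ]; then
        echo "ok: (hd0) is $2"
    else
        echo "mismatch: (hd0) is ${actual:-unset}, expected $2"
        return 1
    fi
}
```

If it reports a mismatch, that would be the point to correct device.map by hand and then redo the grub-install as described above.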
participants (4): Anders Norrbring, Darryl Gregorash, Hans du Plooy, Stan Glasoe