ReiserFS, moving hard drives, LILO and the meaning of life
Background situation: Running SuSE 9.0. SuSE is installed on /dev/hdb1, formatted ReiserFS. Booting via LILO installed on the MBR of /dev/hda. /dev/hda1 was formerly Mandrake. Due to a number of issues which are irrelevant to this discussion, I decided to switch from Mandrake to SuSE. I added the second hard drive, installed SuSE, got it booting and functioning, then copied all of my email archives and other documents from the Mandrake drive to SuSE. That was several months ago and went without a hitch.
Friday, I did an online update of SuSE. One of the updates was to the kernel. (Not sure when the update came out, as I hadn't done an update in a while.) The kernel package failed to install. The error details were singularly unhelpful: something along the lines of "Package failed." Shortly after that, I rebooted my computer and got 99 99 99 99.... I'm assuming that it was the failed kernel update that caused it but I'm not positive, of course.
Since I was going to have to work on the system anyway, I decided to go ahead and pull /dev/hda out of the system (as I wasn't using the space and could use it in another system) and shift /dev/hdb to /dev/hda. After removing the hard drive, I first tried to boot from a grub floppy. After booting into grub, I did root (hd0) and got an error "filesystem type unknown, using whole disk." That didn't bode well but I tried booting via the kernel command, and got an error "cannot mount selected partition." I assumed grub doesn't support ReiserFS and broke out the SuSE boot CD.
Booted into rescue. Mounted /dev/hda1 under /mnt, edited /mnt/etc/lilo.conf and changed all references to /dev/hdb to /dev/hda, then did chroot /mnt /sbin/lilo. Lilo happily replied that it was adding Linux and failsafe. I rebooted and the system started through the boot process, then bombed to the emergency repair command line complaining that fsck on the drive failed.
I tried running reiserfsck --check /dev/hda1. The program asked if I wanted to run it and I said "Yes." Capital 'Y', lowercase 'e', lowercase 's'. I got no error, just an instant return to the command prompt, as if I'd said No. Tried again with --rebuild-sb and even --rebuild-tree. Same thing.
Tried rebooting again and got the same thing. After a bit of mucking about, I threw up my hands, shut down the system and put the old hard drive back in as /dev/hda and moved the SuSE drive back to /dev/hdb. Rebooted into rescue mode from the CD, restored lilo.conf and did a chrooted lilo again. Rebooted and the system started back up normally.
OK, at least I have my system back (although all this took much longer to do than it's taken to tell). But I don't like giving up and I don't see a need to keep an extra hard drive in my system just for the MBR. hda is a 13 gig drive, and hdb is a 27 gig drive. My entire Linux setup uses less than 7 gig. So why not transfer everything over to hda, and free up the 27 gig drive?
With the ReiserFS weirdness, however, I decided to switch to ext3. I created a new ext3 filesystem on hda. Now, how do I get my system over? You can't use a simple cp because hda would be mounted as part of the file tree. Copying the entire file tree to a disk that's mounted as part of the same tree takes a LONG time. (Recursion, anyone?) So I made sure hda was not mounted, unmounted the network drives, and tarred * from the root. When that was done, I mounted hda under /mnt, moved the tar ball over and untarred it. I wasn't sure if that would work, but what the heck, that's how you learn stuff.
I edited /mnt/etc/lilo.conf to change all references from hdb to hda, and chrooted lilo to /mnt. Lilo happily did its thing and I rebooted. The system came up. Dropped to a command line and issued mount. Mount told me that /dev/hdb1 was mounted on / as ReiserFS. Huh? The system should have booted with /dev/hda as root. On a hunch, I did a df. /dev/hdb1 size 13G, used 6G. hdb is a 27 gig drive; hda is 13G.
Tried to mount /dev/hda1 to /mnt. Nope, device is busy or already mounted. OK, mount /dev/hdb1 on /mnt. No problem. Mount then tells me I have /dev/hdb1 mounted as the root fs and on /mnt, both as ReiserFS. I go into each one. Each seems to be functioning fine. I cat a couple of files, then touch a new file. No problems and the files on one system do not affect the files on the other. lilo.conf in /mnt/etc is still configured for hdb, so I did yet another chrooted lilo and rebooted. Got my normal system running on hdb. Mounted /dev/hda1 and it mounts fine. As ext3.
I know this is rather long but it's actually a compressed version of what I've spent my weekend doing. It seems to me that ReiserFS somehow knows which device a drive is mounted as, and doesn't particularly care for it if you shift the drive about. However, that doesn't explain why configuring the system to boot /dev/hda as root fs ends up with a schizophrenic system that mounts an ext3 formatted /dev/hda1 as a ReiserFS formatted /dev/hdb1. If anyone can shed any light on what the heck is going on, or has any ideas how I can get this system down to one drive, I'd like to hear it.
About the only idea I have left is to leave the drive as hdb, tell lilo to install to it, and configure the bios to boot from hd1 instead of hd0. I think that would work but it seems too much like admitting defeat. A lone drive should be /dev/hda not hdb! Anyway, if you made it through all of this, thanks for reading and thanks in advance for any thoughts.
The Monday 2004-04-05 at 15:46 -0400, Lists wrote:
Tried rebooting again and got the same thing. After a bit of mucking about, I threw up my hands, shut down the system and put the old hard drive back in as /dev/hda and moved the SuSE drive back to /dev/hdb. Rebooted into rescue mode from the CD, restored lilo.conf and did a chrooted lilo again. Rebooted and the system started back up normally.
That's what you should have done first thing; what failed after the update is that you did not run lilo after it.
With the ReiserFS weirdness, however, I decided to switch to ext3. I created a new ext3 filesystem on hda. Now, how do I get my system over? You can't use a simple cp because hda would be mounted as part of the file tree. Copying the entire file tree to a disk that's mounted as part of the same tree takes a LONG time. (Recursion, anyone?) So I made sure hda was not mounted, unmounted the network drives, and tarred * from the root. When that was done, I mounted hda under /mnt, moved the tar ball over and untarred it. I wasn't sure if that would work, but what the heck, that's how you learn stuff.
Read /usr/share/doc/howto/en/mini/Hard-Disk-Upgrade.gz
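For readers following along, one common way to copy a running system onto a freshly formatted partition without recursing into the target is to tell tar to stay on a single filesystem. This is only a minimal sketch and not necessarily the procedure the mini-HOWTO describes; the function name copy_tree is a hypothetical helper, and --one-file-system is assumed to be available (it is a GNU tar option).

```shell
#!/bin/sh
# copy_tree SRC DST (hypothetical helper, for illustration only):
# copy the directory tree at SRC into DST, preserving permissions,
# without descending into other filesystems mounted below SRC
# (for example the target itself, /proc, or network mounts).
copy_tree() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    # First tar: archive SRC to stdout, staying on SRC's filesystem.
    # Second tar: unpack the stream inside DST, keeping permissions (-p).
    tar -C "$src" --one-file-system -cf - . | tar -C "$dst" -xpf -
}
```

With the new partition mounted on /mnt, the call would be something like copy_tree / /mnt, run as root. Mount points such as /proc and /mnt are copied as empty directories, which is what the new root needs.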
I edited /mnt/etc/lilo.conf to change all references from hdb to hda, and chrooted lilo to /mnt. Lilo happily did its thing and I rebooted. The system came up. Dropped to a command line and issued mount. Mount told me that /dev/hdb1 was mounted on / as ReiserFS. Huh? The system should have booted with /dev/hda as root. On a hunch, I did a df.
You forgot to edit /etc/fstab. When changing disks, you have to edit both lilo.conf (or grub) and fstab. That error you report is consistent with a mismatching fstab file, because that's the place the system is looking for partition types. -- Cheers, Carlos Robinson
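The edit to both files amounts to replacing the old device name wherever it appears. A minimal sketch, assuming the only change needed is the device prefix; retarget_disk is a hypothetical helper name, and the results should be checked by hand before rerunning lilo.

```shell
#!/bin/sh
# retarget_disk OLD NEW FILE... (hypothetical helper, for illustration only):
# rewrite every /dev/OLD to /dev/NEW in each FILE, keeping the
# untouched original next to it as FILE.orig.
retarget_disk() {
    old=$1
    new=$2
    shift 2
    for f in "$@"; do
        cp -p "$f" "$f.orig" || return 1
        # Only the /dev/OLD prefix is replaced, so partition numbers
        # carry over: /dev/hdb1 becomes /dev/hda1.
        sed "s|/dev/$old|/dev/$new|g" "$f.orig" > "$f" || return 1
    done
}
```

For the copied system mounted on /mnt this would be retarget_disk hdb hda /mnt/etc/fstab /mnt/etc/lilo.conf, followed by chroot /mnt /sbin/lilo. The sketch does not touch the filesystem type column in fstab; in this thread that column would also have to change from reiserfs to ext3.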
On Tue, 2004-04-06 at 15:32, Carlos E. R. wrote:
The Monday 2004-04-05 at 15:46 -0400, Lists wrote:
Tried rebooting again and got the same thing. After a bit of mucking about, I threw up my hands, shut down the system and put the old hard drive back in as /dev/hda and moved the SuSE drive back to /dev/hdb. Rebooted into rescue mode from the CD, restored lilo.conf and did a chrooted lilo again. Rebooted and the system started back up normally.
That's what you should have done first thing; what failed after the update is that you did not run lilo after it.
I take it that the update would have run lilo if it had completed normally? The error essentially left the system in an in-between state.
With the ReiserFS weirdness, however, I decided to switch to ext3. I created a new ext3 filesystem on hda. Now, how do I get my system over? You can't use a simple cp because hda would be mounted as part of the file tree. Copying the entire file tree to a disk that's mounted as part of the same tree takes a LONG time. (Recursion, anyone?) So I made sure hda was not mounted, unmounted the network drives, and tarred * from the root. When that was done, I mounted hda under /mnt, moved the tar ball over and untarred it. I wasn't sure if that would work, but what the heck, that's how you learn stuff.
Read /usr/share/doc/howto/en/mini/Hard-Disk-Upgrade.gz
Thanks for the pointer. I'll look that up.
I edited /mnt/etc/lilo.conf to change all references from hdb to hda, and chrooted lilo to /mnt. Lilo happily did its thing and I rebooted. The system came up. Dropped to a command line and issued mount. Mount told me that /dev/hdb1 was mounted on / as ReiserFS. Huh? The system should have booted with /dev/hda as root. On a hunch, I did a df.
You forgot to edit /etc/fstab. When changing disks, you have to edit both lilo.conf (or grub) and fstab. That error you report is consistent with a mismatching fstab file, because that's the place the system is looking for partition types.
Now that you point it out, it makes perfect sense. It never even occurred to me at the time I was pulling my hair out. Every time I start getting cocky about my l33t 5ki1z, something comes up to thoroughly humble me. Thanks for helping me figure this out. Now, I have something to play with this weekend. <G>
The Wednesday 2004-04-07 at 19:50 -0400, Lists wrote:
On Tue, 2004-04-06 at 15:32, Carlos E. R. wrote:
The Monday 2004-04-05 at 15:46 -0400, Lists wrote:
That's what you should have done first thing; what failed after the update is that you did not run lilo after it.
I take it that the update would have run lilo if it had completed normally? The error essentially left the system in an in-between state.
No, it should have put up a pop-up message telling you to run lilo in case you use lilo as boot manager, or nothing if you use grub. It seems there is no sure way of knowing automatically which one we use, so they just put a reminder. Some patches ago the message was unclear, it said something like "remember to run lilo"; thus people using grub ran lilo, and pretty weird things happened - it was commented on the list :-)
You forgot to edit /etc/fstab. When changing disks, you have to edit both lilo.conf (or grub) and fstab. That error you report is consistent with a mismatching fstab file, because that's the place the system is looking for partition types.
Now that you point it out, it makes perfect sense. It never even occurred to me at the time I was pulling my hair out. Every time I start getting cocky about my l33t 5ki1z, something comes up to thoroughly humble me. Thanks for helping me figure this out. Now, I have something to play with this weekend. <G>
Welcome :-) Experience is just the sum of our errors - I just happened to walk that path first ;-) -- Cheers, Carlos Robinson