[opensuse-factory] Grub broken after update
Everyone,

I have a bit of a problem after the 20180206 snapshot on a Tumbleweed server installation. First, I haven't been able to upgrade to any of the 4.14 kernels for some reason (I've had text color corruption on my tty and very frequent kernel panics; this is a server installation, so there is no display manager). So I can't say that it was this specific snapshot that caused the problem. I'm just looking for help to get my system back to a working state.

Since I skipped the entire set of updates with a 4.14 kernel, it has been a couple of months since I last updated. I did a zypper dup to 20180206 since the kernel was now 4.15.1 and I wanted to see if the new kernel had fixed my problem. After the zypper dup, I ran rpmconfigcheck to see which config files needed attention. A number of them seemed innocuous (samba and the like), but there was also an entry for "/etc/default/grub.rpmnew".

I looked briefly at the diff between /etc/default/grub.rpmnew and the old grub file. I hadn't really customized it apart from some kernel parameters set through yast, so I replaced the old grub with grub.rpmnew (renaming the old grub to grub.old), and I used yast to set the kernel parameters back to what I had before.

Then I restarted the computer. The computer boots to the grub menu, but whether I choose to boot the current kernel or the old kernel, I get the following error:

    Loading Linux 4.15.1-1-default
    error: can't find command `linux'.
    Loading initial ramdisk...
    error: can't find command `initrd'.

    Press any key to continue...

Then I'm spit back to the grub menu. Moreover, there is no option to boot a snapshot, so I can't roll back (which I would consider a fine solution to my problem).

Something has gone wrong with grub, and I'm not sure how to fix it. I use BTRFS with the standard filesystem layout (the old one; I've seen that there is a new layout in more recent snapshots). I can make a rescue CD from the most recent snapshot, but I'm not certain exactly how I would manually mount the btrfs filesystem in order to chroot into it correctly. I'm also not certain what I should do after chrooting into the old filesystem.

Any help on this would be greatly appreciated. Also, any insight into what I did wrong in the update would be valuable information for the future.

Thanks in advance!

Michael

--
To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On 2018-02-09 16:36, Michael Albert wrote:
> I looked briefly at the diff between /etc/default/grub.rpmnew and the old grub, and I hadn't really customized it outside of some kernel parameters given through yast, so I replaced the old grub with the grub.rpmnew (renaming the old grub to grub.old), and I used yast to set the kernel parameters back to what I had them as.
This will not help now, but it will next time.

I do a backup of the "new" config files:

    md /root/Upgrades/{VERSION}
    cd /root/Upgrades/{VERSION}
    cat /var/adm/rpmconfigcheck | xargs -I '{}' -n 1 cp --parents '{}' .

Then I use meld on each pair of files:

    meld /etc/default/grub /etc/default/grub.rpmnew

"Meld is a visual diff and merge tool. You can compare two or three files and edit them in place (diffs update dynamically). You can compare two or three folders and launch file comparisons. You can browse and view a working copy from popular version control systems such as CVS, Subversion, Bazaar-ng and Mercurial."

If meld is not installed, install it :-)

It lets me see what is different, what is new that I want to use, and what I'll ignore, and it lets me edit my own things; it is a full editor. In fact, this time I used this new script:

    #!/bin/bash
    while read FILES ; do
        echo "Before:"
        ls -l ${FILES%.*} ${FILES}
        meld ${FILES%.*} ${FILES}
        echo "After:"
        ls -l ${FILES%.*} ${FILES}
        echo Press enter for next or control-C
        read
    done < /var/adm/rpmconfigcheck

(pressing ^C doesn't work)

Finally, I deleted the .rpm files:

    cat /var/adm/rpmconfigcheck | xargs rm
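A likely reason the pause in the loop above does not behave as intended is that the inner `read` shares standard input with the loop, so it consumes the next filename from /var/adm/rpmconfigcheck instead of waiting for the keyboard. The following is a minimal sketch of one way around that, reading the list on a separate file descriptor. The names `review_rpmconfig` and `MERGE_TOOL` are illustrative and not part of the original script.

```shell
#!/bin/bash
# Sketch: review each pending .rpmnew/.rpmsave file against its live counterpart.
# MERGE_TOOL is a hypothetical knob; it defaults to meld as in the thread.
MERGE_TOOL=${MERGE_TOOL:-meld}

review_rpmconfig() {
    local list=$1 new live
    # Read the list on file descriptor 3 so stdin stays attached to the terminal.
    while IFS= read -r new <&3; do
        [ -n "$new" ] || continue      # skip blank lines
        live=${new%.*}                 # strip the .rpmnew / .rpmsave suffix
        echo "Before:"
        ls -l -- "$live" "$new"
        "$MERGE_TOOL" "$live" "$new"
        echo "After:"
        ls -l -- "$live" "$new"
        # Pause only when stdin is a terminal; Ctrl-C at this prompt stops the loop.
        if [ -t 0 ]; then
            read -r -p "Press Enter for next, or Ctrl-C to stop: " _
        fi
    done 3< "$list"
}
```

It would be called as `review_rpmconfig /var/adm/rpmconfigcheck`. Quoting the filenames also keeps paths with spaces intact.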
> Then I restarted the computer. The computer boots to the grub menu, but no matter if I choose to boot the current kernel or the old kernel, I get the following error:
>
>     Loading Linux 4.15.1-1-default
>     error: can't find command `linux'.
>     Loading initial ramdisk...
>     error: can't find command `initrd'.
>
>     Press any key to continue...
>
> Then I'm spit back to the grub menu. Moreover, there is no option to boot a snapshot, so I can't rollback (which I would consider to be a fine solution to my problem).
This is the standard procedure I'll describe, but note that I don't use btrfs, so I don't know if you have to do something special with it.

Boot the machine with any Linux system on CD or USB stick you have, or from some other partition. Mount your failed system in, for instance, /mnt.

Then do:

    mount --bind /dev /mnt/dev
    mount --bind /proc /mnt/proc
    mount --bind /sys /mnt/sys
    chroot /mnt

At this point, you have your "failed" system available in that terminal, and you can run commands. You can now edit /etc/default/grub and repair it. Once done, there is a comment at the top of the file that says which command you have to run to apply the changes.

You need a text-mode editor; I can suggest joe (aka jstar, jmacs, & jpico), or mcedit, or vi - but in the latter case you do not need my editor advice ;-)

You can also run "yast" in text mode in that window and change the bootloader configuration. Trick: to make YaST write to disk even if there are no changes, I simply change the number of seconds Grub is told to wait before booting an entry. Say it is 8, I write 9.

Once done, type "exit" or ^D on the terminal to exit the chroot.

Caveat: I have done all of the above several times, and it works; but never in Tumbleweed, which I don't use on real hardware, and never on btrfs, which I don't use. I see no reason why it would not work, but just take that into consideration :-)

--
Cheers / Saludos,
Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
Thanks for the response. I've done the standard chroot thing before to fix boot issues on other file systems, but the problem with btrfs is that there are a bunch of different subvolumes for different folders that are part of root.

The issue I'm having is that I don't know what needs to be mounted and what doesn't in order to repair grub. Nor am I certain what the right way to repair grub is in this particular situation; this is a brand new kind of problem for me and I don't know how it happened. Is it just something like:

    grub2-mkconfig -o /boot/grub2/grub.cfg; grub2-install /dev/sda;

?

Do I need to do a dracut -f afterwards?

I've got all of the old config files (I save them in a slightly more labor-intensive manner than you do, but I do save them). I can restore the old /etc/default/grub file and then try to repair grub; I just don't know what the right procedure is.

I also use diff to compare changes. It's maybe a little more work than meld, but the idea is the same.

Michael
On 2018-02-09 19:28, Michael Albert wrote:
> Thanks for the response. I've done the standard chroot thing before to fix boot issues on other file systems, but the problem with btrfs is that there are a bunch of different subvolumes for different folders that are a part of root.

I see. :-(

> The issue I'm having is that I don't know what needs to be mounted and what doesn't in order to repair grub. Nor am I certain what the right way to repair grub is in this particular situation; this is a brand new kind of problem for me and I don't know how it happened. Is it just something like:
>
>     grub2-mkconfig -o /boot/grub2/grub.cfg; grub2-install /dev/sda;
>
> ?
>
> Do I need to do a dracut -f afterwards?

I think you only need the first one, but if in doubt, just use the YaST bootloader module and change the timeout by one second to force it to write everything.

> I've got all of the old config files (I save this in a slightly more labor intensive manner than you, so I do save them). I can restore the old /etc/default/grub file and then try to repair grub, I just don't know what the right procedure is.

I would just restore the old one, compare it carefully with the new one to see if there really is something to add, then do the trick with yast.

But listen first to what Andrei suggests, he is the expert ;-)

> I also use diff to compare changes, it's maybe a little more work than meld, but the idea is the same.

Yes, just more visual :-)

--
Cheers / Saludos,
Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
This ended up being easier than I thought. I booted into a rescue CD and mounted the partition with my root filesystem on it directly using "mount /dev/sdb3 /mnt". Since the current snapshot is the default subvolume, the root filesystem mounted. I then used the standard way to bind everything that needed binding. After doing the chroot, I just ran "mount -a"; since I had already done the chroot, mount -a used the fstab and the correct mount points.

I then reverted to the old /etc/default/grub config file and ran grub2-mkconfig and grub2-install. For good measure, I also did a dracut -f, but I'm not sure it was necessary.

I still don't know the root of the issue, but this seems to have resolved things to my satisfaction. I've attached a copy of the /etc/default/grub.rpmnew file that I got after the update. I'm sure the error lies there, but I don't know what it is. Note that the only thing that changed from what I got after the zypper dup is

    GRUB_CMDLINE_LINUX_DEFAULT="resume=/dev/sda2 splash=silent quiet showopts intel_iommu=on iommu=pt"

which I added myself through yast.

If anyone who knows what they are looking at wants to figure out what borked the installation, feel free to let me know. I'm satisfied with just getting things working again.

Thanks to everyone who took the time to respond.

Michael
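The sequence described in this message can be sketched roughly as below. This is only an illustration under stated assumptions, not a verified recovery script: `ROOT_DEV`, `BOOT_DISK` and `TARGET` are placeholder variables filled with the device names mentioned in this thread (they can differ when booted from rescue media), and the backup is assumed to live at /etc/default/grub.old. By default the sketch only prints the commands; it runs them only when `APPLY=1` is set. The `dracut -f` step is left out because the message itself is unsure whether it was needed.

```shell
#!/bin/bash
# Sketch of the rescue steps from this thread. Dry run by default:
# commands are printed, and executed only when APPLY=1 (as root).
# ROOT_DEV, BOOT_DISK and TARGET are placeholders, adjust before use.
ROOT_DEV=${ROOT_DEV:-/dev/sdb3}
BOOT_DISK=${BOOT_DISK:-/dev/sda}
TARGET=${TARGET:-/mnt}

run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

rescue_grub() {
    local d
    # The default btrfs subvolume is the current snapshot, so a plain mount is enough.
    run mount "$ROOT_DEV" "$TARGET" || return
    for d in dev proc sys; do
        run mount --bind "/$d" "$TARGET/$d" || return
    done
    # Inside the chroot, mount -a picks up the remaining subvolumes from fstab.
    run chroot "$TARGET" mount -a || return
    # Assumption: the previous config was saved as /etc/default/grub.old.
    run chroot "$TARGET" cp -a /etc/default/grub.old /etc/default/grub || return
    run chroot "$TARGET" grub2-mkconfig -o /boot/grub2/grub.cfg || return
    run chroot "$TARGET" grub2-install "$BOOT_DISK"
}
```

Calling `rescue_grub` without `APPLY=1` prints the eight commands in order, which makes it easy to review them before running anything on a broken system.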
On 09.02.2018 18:36, Michael Albert wrote:
> Then I restarted the computer. The computer boots to the grub menu, but no matter if I choose to boot the current kernel or the old kernel, I get the following error:
>
>     Loading Linux 4.15.1-1-default
>     error: can't find command `linux'.
>     Loading initial ramdisk...
>     error: can't find command `initrd'.
>
>     Press any key to continue...
It sounds like you have the subvolume structure intended for snapshots, but /etc/default/grub lacks SUSE_BTRFS_SNAPSHOT_BOOTING=true. This variable changes how grub2 works with btrfs, and it probably simply does not find the /boot/grub2/x86_64-xxx/ directory (where "xxx" is "pc" or "efi").

Can you show the current /etc/default/grub and the output of "btrfs sub li /path/to/root"?
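A quick way to test this hypothesis is to check whether the variable is set in a given file, for example in both /etc/default/grub and the saved grub.old. The sketch below assumes the setting appears on its own line, optionally quoted; the function name `snapshot_booting_enabled` is made up for illustration.

```shell
#!/bin/bash
# Sketch: succeed (exit status 0) if FILE sets SUSE_BTRFS_SNAPSHOT_BOOTING to true.
# Accepts true, "true" or 'true'; commented-out lines do not match.
snapshot_booting_enabled() {
    local file=${1:-/etc/default/grub}
    grep -Eq "^[[:space:]]*SUSE_BTRFS_SNAPSHOT_BOOTING=[\"']?true[\"']?[[:space:]]*\$" "$file"
}
```

For example, `snapshot_booting_enabled /etc/default/grub || echo "snapshot booting not enabled"` reports a file that lacks the setting.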
Yeah, I'm using the subvolume structure intended for snapshots, and in the past I've had no problem rolling back snapshots (I kept having to do it, since no 4.14 kernel was stable for me and I kept trying to update). However, I don't have an easy way to show the current /etc/default/grub and the output of "btrfs sub li /path/to/root" because I can't boot the system. I can get the system running with a recovery disk, so I may be able to get the output, but I will have to wait until tonight.

Do you think it's likely to work if I just restore my old /etc/default/grub and then do:

    grub2-mkconfig -o /boot/grub2/grub.cfg; grub2-install /dev/sda;

after properly setting up the root filesystem and then doing a chroot?

I used the standard btrfs subvolume layout for TW when I installed the system (a little over a year ago), so is there any documented way to mount the subvolumes in order to chroot into a properly structured root filesystem?

Michael
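One way to see which subvolumes the installed system expects to have mounted is to read its own /etc/fstab once the root filesystem is mounted from the rescue system. The sketch below simply lists the btrfs entries with their mount options; the function name `list_btrfs_mounts` and the default path /mnt/etc/fstab are assumptions for illustration.

```shell
#!/bin/bash
# Sketch: print "mount point<TAB>options" for every btrfs entry in an fstab file.
# fstab fields are: device, mount point, filesystem type, options, dump, pass.
list_btrfs_mounts() {
    local fstab=${1:-/mnt/etc/fstab}
    # Skip comment lines; blank lines have no third field and never match.
    awk '$1 !~ /^#/ && $3 == "btrfs" { print $2 "\t" $4 }' "$fstab"
}
```

Each line of output corresponds to a mount that "mount -a" would attempt when run inside the chroot.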
I once had a similar situation, and instead of a chroot I used a program (supergrubdisk, if I remember correctly) that allowed me to boot the actual system.
participants (4)
- Andrei Borzenkov
- Carlos E. R.
- Michael Albert
- nicholas cunliffe