[Bug 591700] New: System fails to boot after "zypper up" from 11.3 M2 to M4
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c0

Summary: System fails to boot after "zypper up" from 11.3 M2 to M4
Classification: openSUSE
Product: openSUSE 11.3
Version: Milestone 4
Platform: Other
OS/Version: Other
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Bootloader
AssignedTo: jsrain@novell.com
ReportedBy: kkaempf@novell.com
QAContact: jsrain@novell.com
Found By: Development
Blocker: ---

/boot/grub/menu.lst is wrong after "zypper up" (see attachment)
- It contains multiple entries (2.6.33-5-default, which is not installed)
- The first entries have the wrong disk ( hd(1,0) vs hd(0,0) )
- The second entries have the wrong kernel ( 2.6.33-5 vs 2.6.3-6 )

--
Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c1
--- Comment #1 from Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c2
--- Comment #2 from Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c
Jiri Srain
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c3
--- Comment #3 from Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c4
Josef Reidinger
> It's one of these BIOS-disk-order vs. kernel-disk-order problems.
> The system has a SATA and an IDE controller, both with a disk attached. The kernel sees the IDE disk first (/dev/sda), SATA second (/dev/sdb). However, the SATA drive is the boot drive, so to grub hd(0,0) is the right value.
This should not be a problem, as we use udev links. The problem is that at some point hd0,0 is written instead of the correct entry. I must dig through the logs to find where this happens (sometime between 06.03 and 28.3 - the first refresh which doesn't change anything reads (hd0,0)).
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c5
--- Comment #5 from Klaus Kämpf
> This should not be a problem, as we use udev links.
Indeed. Linux is using by-id links and doesn't care about disk order. But grub has huge problems.
> The problem is that at some point hd0,0 is written instead of the correct entry.
In my case, hd(0,0) is the right entry since hd(0,0) == first disk in BIOS boot order. The bug is that grub's menu.lst contained hd(1,0) since the 'boot' disk shows up as sdb in Linux.
Actually, there are two bugs, since menu.lst still contained the boot entries for the previous (Milestone 2) kernel!
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c6
--- Comment #6 from Josef Reidinger
(In reply to comment #4)
> > This should not be a problem, as we use udev links.
> Indeed. Linux is using by-id links and doesn't care about disk order.
> But grub has huge problems.
> > The problem is that at some point hd0,0 is written instead of the correct entry.
> In my case, hd(0,0) is the right entry since hd(0,0) == first disk in BIOS boot order. The bug is that grub's menu.lst contained hd(1,0) since the 'boot' disk shows up as sdb in Linux.
> Actually, there are two bugs, since menu.lst still contained the boot entries for the previous (Milestone 2) kernel!
The boot order is stored in /boot/grub/device.map, so if you change the boot order in the BIOS you must regenerate the whole boot configuration. Otherwise yast2-bootloader should store the correct order, and I use this order. It is not two bugs, just two effects of one bug :)
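To make the device.map translation discussed here concrete, the following is a minimal sketch of how a kernel device name could be turned into a GRUB legacy device name. It is not the perl-Bootloader code. The function names and the sample map are assumptions for illustration, and a real openSUSE device.map would normally list /dev/disk/by-id/ links rather than plain /dev/sdX names.

```python
import re

def parse_device_map(text):
    """Return {unix_disk: grub_disk}, e.g. {'/dev/sdb': 'hd0'}."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = re.match(r"\((\w+)\)\s+(\S+)$", line)
        if match:
            grub_disk, unix_disk = match.groups()
            mapping[unix_disk] = grub_disk
    return mapping

def unix_to_grub(partition, mapping):
    """Translate '/dev/sdb1' to '(hd0,0)'; GRUB legacy counts partitions from 0."""
    match = re.match(r"(/dev/[a-z]+)(\d+)$", partition)
    if not match:
        raise ValueError(f"not a partition device: {partition}")
    disk, number = match.group(1), int(match.group(2))
    if disk not in mapping:
        raise KeyError(f"{disk} is not listed in device.map")
    return f"({mapping[disk]},{number - 1})"

# Hypothetical map for the reported setup: the SATA boot disk is sdb in
# Linux but the first disk for the BIOS, hence hd0.
SAMPLE_DEVICE_MAP = """\
(hd0) /dev/sdb
(hd1) /dev/sda
"""

if __name__ == "__main__":
    mapping = parse_device_map(SAMPLE_DEVICE_MAP)
    print(unix_to_grub("/dev/sdb1", mapping))
```

With a map like this, the first partition of the boot disk resolves to hd0 even though Linux enumerates that disk second. If the lookup falls back to kernel enumeration order instead, the same partition would come out on hd1.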
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c7
--- Comment #7 from Klaus Kämpf
> The boot order is stored in /boot/grub/device.map
Do you need this file for debugging?
> so if you change the boot order in the BIOS
I didn't change anything in the BIOS. The system booted fine with M2.
> It is not two bugs, just two effects of one bug :)
I see two bugs here:
1. Wrong disk chosen (hd(1,0))
2. Invalid boot entry (pointing to old kernel image)
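The two symptoms listed above can be checked mechanically. The sketch below scans menu.lst text and reports entries whose root differs from the expected GRUB device or whose kernel image is not among the installed ones. The function name, the sample entries, and the kernel file names are illustrative assumptions, not the contents of the attached file.

```python
def check_menu_lst(text, expected_root, installed_kernels):
    """Return a list of (title, problem) tuples for suspicious boot entries."""
    problems = []
    title = None
    for raw in text.splitlines():
        parts = raw.split()
        if not parts or parts[0].startswith("#"):
            continue
        keyword = parts[0]
        if keyword == "title":
            title = " ".join(parts[1:])
        elif title is None or len(parts) < 2:
            # Global options before the first title, or malformed lines.
            continue
        elif keyword == "root":
            if parts[1] != expected_root:
                problems.append(
                    (title, f"root is {parts[1]}, expected {expected_root}"))
        elif keyword == "kernel":
            name = parts[1].rsplit("/", 1)[-1]
            if name not in installed_kernels:
                problems.append(
                    (title, f"kernel image {name} is not installed"))
    return problems

# Hypothetical menu.lst with one stale entry and one good entry.
SAMPLE_MENU_LST = """\
default 0
timeout 8
title old kernel
    root (hd1,0)
    kernel /boot/vmlinuz-2.6.33-5-default root=/dev/sdb1
    initrd /boot/initrd-2.6.33-5-default
title new kernel
    root (hd0,0)
    kernel /boot/vmlinuz-2.6.33-6-default root=/dev/sdb1
    initrd /boot/initrd-2.6.33-6-default
"""

if __name__ == "__main__":
    installed = {"vmlinuz-2.6.33-6-default"}
    for title, problem in check_menu_lst(SAMPLE_MENU_LST, "(hd0,0)", installed):
        print(f"{title}: {problem}")
```

On a real system the set of installed kernels would come from listing /boot, and the expected root from the device.map lookup. Both are passed in here so the check stays a pure function.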
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c8
--- Comment #8 from Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c9
--- Comment #9 from Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c10
Josef Reidinger
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c11
--- Comment #11 from Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c12
--- Comment #12 from Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c13
--- Comment #13 from Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c14
--- Comment #14 from Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c15
Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c16
Josef Reidinger
> Created an attachment (id=355085) --> (http://bugzilla.novell.com/attachment.cgi?id=355085) [details] /var/log/YaST2 (wrong partition)
> This is /var/log/YaST2 of the system with the wrong partition in the bootloader
Hi, in this case it looks like udev is totally broken after the kernel update, and that causes the problem with the partition because the partition information cannot be translated. Please provide the output of these commands so we can see whether the problem happens only after a kernel update or is still present:

    udevadm info -q all -n /dev/dm-0
    udevadm info -q all -n /dev/sda2

Thanks.

Kay - if you know about a similar problem in current Factory, please add a note.
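For readers unfamiliar with the output being requested: "udevadm info -q all" prints one record per line, prefixed with P: (device path), N: (node name), S: (symlink) or E: (property). The sketch below parses such text and collects the symlinks, since a device with no by-id links is what would break a name translation. The sample output and its serial number are invented for illustration.

```python
def parse_udevadm_info(text):
    """Split 'udevadm info -q all' output into path, name, symlinks and env."""
    info = {"path": None, "name": None, "symlinks": [], "env": {}}
    for line in text.splitlines():
        prefix, _, value = line.partition(": ")
        if prefix == "P":
            info["path"] = value
        elif prefix == "N":
            info["name"] = value
        elif prefix == "S":
            info["symlinks"].append(value)
        elif prefix == "E":
            key, _, val = value.partition("=")
            info["env"][key] = val
    return info

# Hypothetical output; the disk serial is a placeholder.
SAMPLE_OUTPUT = """\
P: /devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda2
N: sda2
S: disk/by-id/ata-EXAMPLE_DISK_SERIAL-part2
E: DEVNAME=/dev/sda2
E: DEVTYPE=partition
"""

if __name__ == "__main__":
    info = parse_udevadm_info(SAMPLE_OUTPUT)
    by_id = [s for s in info["symlinks"] if s.startswith("disk/by-id/")]
    print(info["name"], "has by-id links:", by_id or "none")
```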
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c17
--- Comment #17 from Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c18
Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c
Josef Reidinger
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c
Josef Reidinger
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c19
Josef Reidinger
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c20
Kay Sievers
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c21
Josef Reidinger
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c22
--- Comment #22 from Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c23
Michal Marek
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c24
Josef Reidinger
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c25
--- Comment #25 from Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c26
Michal Marek
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c27
Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c28
Josef Reidinger
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c29
--- Comment #29 from Klaus Kaempf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c30
--- Comment #30 from Josef Reidinger
> jfyi, "mkinitrd" shows multiple lines of
> 2010-04-20 09:58:45 WARNING: GRUB::GrubDev2UnixDev: No partition found for /dev/sda with 1.
Yes, pbl prints this as well; it is the result of a failed consistent-name translation.
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c31
Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c32
--- Comment #32 from Josef Reidinger
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c33
Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c
Kay Sievers
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c34
Josef Reidinger
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c35
--- Comment #35 from Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c36
Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c37
Josef Reidinger
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c38
Klaus Kämpf
> Klaus - does zypper dup change device.map?
How should I know?
> It should not change it. Do you have the correct one?
There is no 'correct' or 'wrong' device.map. What's wrong is the detection of which device was used for booting.
> I also cannot find any valid menu.lst; each one I see in the logs has a broken hd(0,0), so was menu.lst fixed before the upgrade?
I manually fixed menu.lst because I cannot boot otherwise.
> (perl-Bootloader during update/upgrade doesn't 'fix' the menu, it just tries to change it in the same way it worked before)
How does perl-Bootloader determine what to use as the 'root' device?
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c39
Josef Reidinger
(In reply to comment #37)
> > Klaus - does zypper dup change device.map?
> How should I know?
> > It should not change it. Do you have the correct one?
> There is no 'correct' or 'wrong' device.map. What's wrong is the detection of which device was used for booting.
device.map reflects the boot order on x86-based machines. BIOS code 0x80 is the first device to boot, and so on. It matters mainly when you write boot code to the MBR.
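The numbering mentioned here follows the usual BIOS convention that hard disks start at drive number 0x80, so GRUB legacy's hdN corresponds to 0x80 + N. A small sketch of that arithmetic, with a hypothetical helper name:

```python
def bios_drive_number(grub_disk):
    """Map a GRUB legacy disk name such as 'hd0' to its BIOS drive number."""
    if not grub_disk.startswith("hd") or not grub_disk[2:].isdigit():
        raise ValueError(f"not a GRUB hard disk name: {grub_disk}")
    # BIOS hard disks are numbered from 0x80; floppies use the range from 0x00.
    return 0x80 + int(grub_disk[2:])

if __name__ == "__main__":
    for name in ("hd0", "hd1"):
        print(name, hex(bios_drive_number(name)))
```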
> > I also cannot find any valid menu.lst; each one I see in the logs has a broken hd(0,0), so was menu.lst fixed before the upgrade?
> I manually fixed menu.lst because I cannot boot otherwise.
Interestingly, I cannot find a correct one in the logs. Could you give the date when you updated? (There are quite a lot of messages, so the time would help me identify it.)
> > (perl-Bootloader during update/upgrade doesn't 'fix' the menu, it just tries to change it in the same way it worked before)
> How does perl-Bootloader determine what to use as the 'root' device?
When you update, the root device is the device which is mounted under /boot (or / if /boot is not separate). So it tries to find whether there is some entry which has as its root device the one mounted as /boot. If none is found, it adds one with default arguments. Of course, if the configuration was broken before the update, it doesn't remove the old entries, as it thinks they belong to another system.
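The selection rule described here (use the device mounted on /boot, otherwise the one mounted on /) can be sketched as follows. This is an illustration against a mount table in /proc/mounts format, not the perl-Bootloader implementation. The function name and sample tables are assumptions.

```python
def boot_device(mounts_text):
    """Return the device mounted on /boot, or on / when /boot is not separate."""
    by_mount_point = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            device, mount_point = fields[0], fields[1]
            # Later lines override earlier ones, as with stacked mounts.
            by_mount_point[mount_point] = device
    for candidate in ("/boot", "/"):
        if candidate in by_mount_point:
            return by_mount_point[candidate]
    raise LookupError("no / or /boot entry in mount table")

# Hypothetical mount tables in /proc/mounts format.
SEPARATE_BOOT = """\
/dev/sdb2 / ext4 rw 0 0
/dev/sdb1 /boot ext3 rw 0 0
"""
NO_SEPARATE_BOOT = """\
/dev/sdb1 / ext4 rw 0 0
proc /proc proc rw 0 0
"""

if __name__ == "__main__":
    print(boot_device(SEPARATE_BOOT))
    print(boot_device(NO_SEPARATE_BOOT))
```

The resulting unix device is what then has to be translated into a GRUB device, which is the step that goes wrong in this report.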
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c40
Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c41
--- Comment #41 from Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c
Klaus Kämpf
http://bugzilla.novell.com/show_bug.cgi?id=591700
http://bugzilla.novell.com/show_bug.cgi?id=591700#c42
--- Comment #42 from Josef Reidinger
> Why does the bootloader try to newly guess grub's root() device during upgrade? It should just leave it alone!
Because some users have more than one image section with different roots, setting root to /boot is always done as part of the workflow (it is part of the pbl abstraction for other bootloaders: you always set root by unix device, and for grub it is translated to a grub device).
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c43
--- Comment #43 from Josef Reidinger
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c44
--- Comment #44 from Felix Miata
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c45
Christian Boltz
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c46
--- Comment #46 from Josef Reidinger
> I had a similar problem after a zypper dup from 11.2 to 11.3 - menu.lst contained hd(0,0) instead of hd(0,1), which (obviously) broke booting. What should I do?
> a) report more details here
> b) open a new bugreport
> c) nothing - if you think it is the same bug as described here?
> BTW: If I get this bugreport right, it affects only zypper dup. This should even be fixable by releasing an update for 11.3 - at least for those people who include the update repo when running zypper dup.
I think that your problem is a little different and is caused by the change of the udev DB format: if udev is upgraded earlier than the kernel, the kernel cannot get the translation between udev links and kernel devices, which leads to a bad GRUB device. More in bug #543076.
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c47
--- Comment #47 from Christian Boltz
> I think that your problem is a little different and is caused by the change of the udev DB format: if udev is upgraded earlier than the kernel, the kernel cannot get the translation between udev links and kernel devices, which leads to a bad GRUB device. More in bug #543076.
Yes, that sounds like a valid explanation, and rpm -qi confirms that udev was installed earlier than the kernel. Maybe I'm thinking too simply, but shouldn't this be solvable by releasing a udev update that requires kernel >= $11.3_original_kernel to give the solver a hint about the installation order? (Precondition is of course that the update repo is added when running zypper dup.)
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c48
--- Comment #48 from Josef Reidinger
(In reply to comment #46)
> > I think that your problem is a little different and is caused by the change of the udev DB format: if udev is upgraded earlier than the kernel, the kernel cannot get the translation between udev links and kernel devices, which leads to a bad GRUB device. More in bug #543076.
> Yes, that sounds like a valid explanation, and rpm -qi confirms that udev was installed earlier than the kernel.
> Maybe I'm thinking too simply, but shouldn't this be solvable by releasing a udev update that requires kernel >= $11.3_original_kernel to give the solver a hint about the installation order? (Precondition is of course that the update repo is added when running zypper dup.)
Hi, please post discussion about a possible solution to the mentioned bug, as it is hard to track across several bug reports. Thanks
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c50
Jean-Daniel Dodin
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c51
Michael Chang
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c52
--- Comment #52 from Klaus Kämpf
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c54
Steffen Winterfeldt
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c55
Klaus Kämpf
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c56
Steffen Winterfeldt
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c57
--- Comment #57 from Klaus Kämpf
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c58
--- Comment #58 from Klaus Kämpf
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c59
Klaus Kämpf
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c60
--- Comment #60 from Klaus Kämpf
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c61
--- Comment #61 from Klaus Kämpf
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c62
--- Comment #62 from Klaus Kämpf
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c63
--- Comment #63 from Klaus Kämpf
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c64
--- Comment #64 from Klaus Kämpf
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c65
Duncan Mac-Vicar
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c66
Steffen Winterfeldt
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c68
Duncan Mac-Vicar
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c
Duncan Mac-Vicar
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c69
Steffen Winterfeldt
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c70
Werner Heisch
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c71
--- Comment #71 from Steffen Winterfeldt
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c
Steffen Winterfeldt
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c72
--- Comment #72 from Werner Heisch
https://bugzilla.novell.com/show_bug.cgi?id=591700
https://bugzilla.novell.com/show_bug.cgi?id=591700#c73
Steffen Winterfeldt