[Bug 1006175] New: zypper dup from leap421 renders system unbootable
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175

Bug ID: 1006175
Summary: zypper dup from leap421 renders system unbootable
Classification: openSUSE
Product: openSUSE Distribution
Version: Leap 42.2
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Installation
Assignee: yast2-maintainers@suse.de
Reporter: per@computer.org
QA Contact: jsrain@suse.com
Found By: ---
Blocker: ---

The dup'ing went fine, but the initrd cannot boot as it seems to be missing the necessary disk controller module(s). Or at least one - "scsi_dh".

--
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
Andrei Borzenkov
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c1
--- Comment #1 from Per Jessen
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c11
--- Comment #11 from Per Jessen
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c13
--- Comment #13 from Per Jessen
So what is the status here?
a) with RC2, a fresh installation is not possible; the hardware isn't recognised.
b) with the patch, I was able to boot an installation system, but I was unable to create an initrd that would look for the right device (/dev/sda instead of /dev/cciss/c0d0).
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c14
--- Comment #14 from Hannes Reinecke
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c28
Per Jessen
Created attachment 702681 [details] DUD for Leap42.2
Please try this DUD. I tested the equivalent DUD for SLES12 SP2, but I couldn't test Leap42.2 because it's not on the PXE server for the lab yet.
The parameter hpsa.hpsa_allow_any is NOT needed (I changed the default, as this driver has been made explicitly for CCISS device support).
Sources in IBS project home:mwilck:bsc1006175.
Hi Martin,

I retrieved the DUD and put it on our tftp server, then I booted an install system with "dud=tftp://server/install/hpsa.dud". The process seems to work fine, but the module didn't load: "invalid module format". I presume this is due to a mismatch in kernel versions: the installer is 4.4.27, the updated hpsa is 4.4.21-69. I tried loading with -f too; that didn't work either. I was going to try to load it into the updated system on boot-up, but I expect I would have the same problem. It's the first time I've tried using a DUD, so I might well have missed something.
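The version mismatch suspected above can be checked before trying to load a module. The following is a minimal sketch, assuming the module's vermagic string (as printed by `modinfo -F vermagic <module>`) begins with the kernel release it was built for. The function names are hypothetical, and the "-default" flavor suffixes and the build flags in the sample values are assumptions for illustration, not taken from the report. The kernel's own check covers more than the release field, so a match here does not guarantee that the module loads.

```python
def vermagic_release(vermagic: str) -> str:
    """Return the first field of a vermagic string (the kernel release)."""
    fields = vermagic.split()
    return fields[0] if fields else ""


def built_for(running_release: str, vermagic: str) -> bool:
    """True if the module's vermagic names exactly the running kernel release."""
    return vermagic_release(vermagic) == running_release


# Illustrative values only (assumed flavor suffix and build flags).
installer_release = "4.4.27-2-default"
dud_vermagic = "4.4.21-69-default SMP mod_unload modversions"

if not built_for(installer_release, dud_vermagic):
    print("module was built for", vermagic_release(dud_vermagic),
          "but the running kernel is", installer_release)
```

On a live system, the running release would come from `uname -r` rather than a hard-coded string.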
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c29
Martin Wilck
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c30
--- Comment #30 from Per Jessen
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c31
--- Comment #31 from Martin Wilck
Okay, I can confirm the updated hpsa works. I'm not sure if the DUD process worked as a whole; I still needed to insert the module manually.
During my own testing, I found that module loading of the updated HPSA driver indeed didn't occur immediately when the DUD had been loaded, but later on when YaST displayed the "detecting hardware" screen. Can you confirm that? This is the first time I'm building a DUD with the SUSE internal toolset, so I may have overlooked something here.
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
Martin Wilck
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c34
--- Comment #34 from Martin Wilck
During my own testing, I found that module loading of the updated HPSA driver indeed didn't occur immediately when the DUD had been loaded, but later on when YaST displayed the "detecting hardware" screen.
I just tested a full SLES12 SP2 installation on host "salieri" in the lab, and it worked just like this. The updated hpsa driver was loaded after YaST started (after the "License Agreement" and "Registration" screens). Installation completed successfully without any command-line actions. Firstboot succeeded.
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c39
--- Comment #39 from Per Jessen
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c40
Per Jessen
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c41
Per Jessen
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c43
--- Comment #43 from Martin Wilck
According to 'lsmod', hpsa was not loaded.
Strange. There must be some difference between SLES12 and Leap 42.2, then.
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c44
Martin Wilck
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c45
Martin Wilck
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c51
Per Jessen
Created attachment 703168 [details] Another dud image
Per, please try this one. Modules should now be autoloaded right after the DUD was unpacked.
(In reply to Martin Wilck from comment #45)
Created attachment 703185 [details] dud with fixes for bsc#1010665
Updated the driver again. Now it also contains Hannes' fix for bug 1010665, "Continuous messages 'hpsa 0000:0b:08.0: addition failed -19, device not added'".
Okay, now it works - and thanks for fixing those messages too.
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
Martin Wilck
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c78
--- Comment #78 from Per Jessen
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c87
--- Comment #87 from Per Jessen
(In reply to Ludwig Nussel from comment #82)
What's left to do here? The kernel seems to be released; what about sg3_utils? Also, a warning in the release notes would still be good.
The sg3_utils changes aren't yet queued for maintenance. Also I'm still waiting for feedback on the latest DUD (cf. comment 80).
I'm sorry, I haven't had the time yet. (The system has actually been waiting at the installer prompt for 6 days.)
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c88
--- Comment #88 from Per Jessen
(In reply to Martin Wilck from comment #85)
cciss.cciss_allow_hpsa=0
Per, have you ever tried that?
To make myself clear: without DUD, and without hpsa.hpsa_allow_any=1.
I'll try it and let you know.
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c89
--- Comment #89 from Per Jessen
(In reply to Martin Wilck from comment #86)
(In reply to Martin Wilck from comment #85)
cciss.cciss_allow_hpsa=0
Per, have you ever tried that?
To make myself clear: without DUD, and without hpsa.hpsa_allow_any=1.
I'll try it and let you know.
I have a fresh installation of Leap421. You mean: run the zypper dup to Leap422, then try to boot:
a) without DUD and
b) without hpsa.hpsa_allow_any=1 and
c) with cciss.cciss_allow_hpsa=0

Isn't the problem still going to be that the cciss module does not recognise the HP Smart Array 6i controller?
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c90
--- Comment #90 from Martin Wilck
I have a fresh installation of Leap421, You mean: run the zypper dup to Leap422, then try to boot :
a) without DUD and b) without hpsa.hpsa_allow_any=1 and c) with cciss.cciss_allow_hpsa=0
Isn't the problem still going to be that the cciss module does not recognise the HP Smart Array 6i controller?
That's what I'd like to clarify. "cciss_allow_hpsa" is set to 1 by default. The meaning of the flag (which is sort of misnamed IMO) is "Prevent cciss driver from accessing hardware known to be supported by the hpsa driver". Thus by setting the flag to 0, the cciss driver might actually detect that hardware again.

... which would basically obsolete all 89 comments made on this bug so far.
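Because the flag's name reads the opposite way from its effect, the logic described above can be written out as a toy model. This is only a sketch of the behaviour as stated in this comment, not driver code; the function name and the `known_to_hpsa` argument are hypothetical.

```python
def cciss_may_claim(known_to_hpsa: bool, cciss_allow_hpsa: int = 1) -> bool:
    """Toy model: may the cciss driver bind to a given controller?

    known_to_hpsa    -- the controller is on the list of hardware known to be
                        supported by the hpsa driver (hypothetical input)
    cciss_allow_hpsa -- the cciss module parameter; 1 is the default
    """
    if known_to_hpsa and cciss_allow_hpsa != 0:
        # Default behaviour: cciss keeps its hands off such hardware.
        return False
    return True


# With the default, cciss skips hardware that hpsa is known to support.
print(cciss_may_claim(known_to_hpsa=True))                      # False
# With cciss.cciss_allow_hpsa=0, cciss may detect that hardware again.
print(cciss_may_claim(known_to_hpsa=True, cciss_allow_hpsa=0))  # True
```

In this model the parameter only matters for controllers on the hpsa list; for anything else cciss behaves the same either way.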
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c91
--- Comment #91 from Per Jessen
(In reply to Per Jessen from comment #89)
I have a fresh installation of Leap421, You mean: run the zypper dup to Leap422, then try to boot :
a) without DUD and b) without hpsa.hpsa_allow_any=1 and c) with cciss.cciss_allow_hpsa=0
Isn't the problem still going to be that the cciss module does not recognise the HP Smart Array 6i controller?
That's what I'd like to clarify.
"cciss_allow_hpsa" is set to 1 by default. The meaning of the flag (which is sort of misnamed IMO) is "Prevent cciss driver from accessing hardware known to be supported by the hpsa driver". Thus by setting the flag to 0, the cciss driver might actually detect that hardware again.
Ah, got it. It ought to be enough if I just install and boot the kernel from Leap422 then?
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c92
--- Comment #92 from Martin Wilck
It ought to be enough if I just install and boot the kernel from Leap422 then?
Yes, with the mentioned parameter.
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c93
Per Jessen
(In reply to Per Jessen from comment #91)
It ought to be enough if I just install and boot the kernel from Leap422 then?
Yes, with the mentioned parameter.
I installed 4.4.27-2-default from Leap422, and amended my lilo config:

    image = /boot/vmlinuz-4.4.27-2-default
    label = openSUSE2
    append = " noresume cciss.cciss_allow_hpsa=0"
    initrd = /boot/initrd-4.4.27-2-default
    root = /dev/cciss/c0d0p1

and rebooted. It worked:

    # uname -a
    Linux test99 4.4.27-2-default #1 SMP Thu Nov 3 14:59:54 UTC 2016 (5c21e7c) x86_64 x86_64 x86_64 GNU/Linux
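After a reboot like the one above, it can be worth confirming that the parameter actually reached the booted kernel. Here is a minimal sketch that parses a kernel command line such as the contents of /proc/cmdline. The helper name and the sample string are hypothetical; the sample is not the command line from this system. The simple whitespace split does not handle quoted values containing spaces.

```python
def kernel_params(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


# Hypothetical command line, shaped like the append line in the lilo config.
sample = "noresume cciss.cciss_allow_hpsa=0"
params = kernel_params(sample)

# The parameter defaults to 1 when it is absent from the command line.
print("cciss_allow_hpsa =", params.get("cciss.cciss_allow_hpsa", "1"))
```

On the running system the input would be read from the file instead, for example `kernel_params(open("/proc/cmdline").read())`.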
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c103
Andreas Stieger
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175
http://bugzilla.opensuse.org/show_bug.cgi?id=1006175#c107
--- Comment #107 from Per Jessen