[Bug 482220] New: IRQ delivery problem (?) for sata_sil24 controller when boot @ kernel-xen
https://bugzilla.novell.com/show_bug.cgi?id=482220

Summary: IRQ delivery problem (?) for sata_sil24 controller when boot @ kernel-xen
Classification: openSUSE
Product: openSUSE 11.1
Version: Final
Platform: x86-64
OS/Version: openSUSE 11.1
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Kernel
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: pgnet.trash@gmail.com
QAContact: qa@suse.de
Found By: ---

Created an attachment (id=277214) --> (https://bugzilla.novell.com/attachment.cgi?id=277214)
console output for failed kernel-xen boot

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0

subsequent to a resolution of,

[Bug 463829] OS 11.0 fails drive mount via Sil 3124 sata card;
https://bugzilla.novell.com/show_bug.cgi?id=463829

which fixed sata_sil24 operation for kernel-default (as of KOTD, kernel-default-2.6.27.19-SLE11_BRANCH_20090304073920_1eb029c9), booting the same system -- with external drives attached via a sata_sil24 card -- to kernel-xen (same KOTD build #) fails to boot, dropping me to a root login @ 'maintenance mode'.

per 'teheo at novell', as adding "iommu=usedac" makes no difference, opening here as a new/separate bug.

@console output from failed boot to kernel-xen here as attachment

Reproducible: Always

Steps to Reproduce:
1.
2.
3.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=482220
Marcus Meissner
https://bugzilla.novell.com/show_bug.cgi?id=482220
User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c1
--- Comment #1 from pgnet _
https://bugzilla.novell.com/show_bug.cgi?id=482220
pgnet _
https://bugzilla.novell.com/show_bug.cgi?id=482220
User gregkh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c2
Greg Kroah-Hartman
https://bugzilla.novell.com/show_bug.cgi?id=482220
Greg Kroah-Hartman
https://bugzilla.novell.com/show_bug.cgi?id=482220
User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c3
pgnet _
It's hard to make something a "blocker" bug when the product has already shipped :)
fine, re: 11.1 <user perspective> .. though, "blocker" does depend a bit on your perspective.

these (rather long-standing, pre 11.1) issues with sil24 on xen _are_ standing squarely in the way of a client migration of several hundred boxes from Win to *Suse (~30% SLED, <10% SLES, rest OS 11.whatever-works, + service contracts). we can neither deploy nor develop for those deployments, and are a bit reticent to cut the legs out from under the "it's more cost effective" argument by suggesting they replace their currently fully-functional (under Win) hardware.

it officially became a 'blocker' with in-hand contractual language that says _demonstrate_ it works _now_. it doesn't -- and, given current understanding of SLE* development sched -- any fixes won't be in a 1st release at this late date ...

as the current @boot state is non-responsive, requiring direct intervention/recovery @ directly-connected console, how 'bout we @ least leave it at the original 'critical' ?
https://bugzilla.novell.com/show_bug.cgi?id=482220
Cyril Hrubis
https://bugzilla.novell.com/show_bug.cgi?id=482220
Clyde Griffin
https://bugzilla.novell.com/show_bug.cgi?id=482220
User cgriffin@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c5
Clyde Griffin
https://bugzilla.novell.com/show_bug.cgi?id=482220
User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c6
--- Comment #6 from pgnet _
Do you get the same behavior with SLES 11 code base?
we have not tried it on SLES 11 pre-release.

we've been seeing RAID related issues since openSUSE 11.0 (e.g., https://bugzilla.novell.com/show_bug.cgi?id=461673), but -- with prodding from @novell -- upgraded to openSUSE 11.1. since then, to be clear, there's been _no_ further testing on this specific issue with anything other than the openSUSE 11.1 code base.

'we' (someone here other than me ...) did speak with @novell some time ago, and were informed that no support -- bugs or otherwise -- would be provided on SLES pre-release. which, to me, seems counterproductive ... :-/

that said, the current 'testing' kernel-xen & kernel-default in this ticket _are_ from the *SLE11_BRANCH* KOTD repo. but, again, everything else is 'just' fully-updated openSUSE 11.1.
https://bugzilla.novell.com/show_bug.cgi?id=482220
User cgriffin@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c7
--- Comment #7 from Clyde Griffin
https://bugzilla.novell.com/show_bug.cgi?id=482220
User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c8
--- Comment #8 from pgnet _
Strange, we are always interested in fixing bugs during the pre-release cycle.
likely depends on which "we" was asked, sales or engineering. just my $.02 ...
Please give SLES 11 a try as soon as you get a chance. This will let us know whether or not we are dealing with an already fixed problem, narrowing things down a bit.
understood & point taken. though, tbh, a tear-down/delay of work on openSUSE 11.1 RELEASE to test pre-release SLES 11, at this point, will be ... er ... 'frowned upon' here. given that significant fixes/updates won't be pushed anymore for OS 11.0, as it's maintenance, and that SLES 10x simply didn't cut it for our needs, mgmt's getting "version weary" quickly.

i'll see what i can do ... but we really need to demonstrate this internally (& @client) as fixed for openSUSE RELEASE 11.1. _and_, eventually, for SLES/SLED 11.

note: teheo@novell had suggested "IRQ Delivery" as the (possible) issue. in an offlist exchange w/ J. Fitzhardinge @ citrix, after an admittedly cursory look at this, he suggested that it does NOT look like IRQ ...
https://bugzilla.novell.com/show_bug.cgi?id=482220
User teheo@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c9
--- Comment #9 from Tejun Heo
https://bugzilla.novell.com/show_bug.cgi?id=482220
User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c10
--- Comment #10 from pgnet _
pgnet, what did Jeremy say? Unfortunately, I'm quite lost when it comes to paravirt. :-)
my ping @him was mainly to ask if his post was at all relevant to this bug. he briefly/politely said no, as _he_ is working on integrating citrix/xen into the upstream kernel (kvm? pvops?), and that "novell does their own work" on xen .. also, that in a _very_ cursory look at this bug, it did not look like IRQ issues to him. seemed clear to me that he didn't see it as his issue ... beyond that, alas, not much.
https://bugzilla.novell.com/show_bug.cgi?id=482220
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c11
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=482220
User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c12
--- Comment #12 from pgnet _
https://bugzilla.novell.com/show_bug.cgi?id=482220
User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c13
--- Comment #13 from pgnet _
https://bugzilla.novell.com/show_bug.cgi?id=482220
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c14
--- Comment #14 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=482220
User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c15
--- Comment #15 from pgnet _
For the time being, applying the equivalent workaround from that bug (mem=4G, but this has to go on the Xen command line here) should get you going.
that simple? ouch ...

adding to menu.lst,

    kernel /xen.gz ... mem=4G

boots, without apparent error, to

    uname -a
    Linux server 2.6.27.19-SLE11_BRANCH_20090304073920_1eb029c9-xen #1 SMP 2009-03-04 08:39:20 +0100 x86_64 x86_64 x86_64 GNU/Linux

the externally-attached SATA drives are correctly seen under -xen, and an overnite disk 'stress-test' (using >9GB data files, ala https://bugzilla.novell.com/show_bug.cgi?id=463829#c79) returned consistent results, and no errors.

it seems the workaround works. iiuc, what remains is for the solution to percolate into release-branch. happy to test/check when it does.

thanks!
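For anyone applying the same workaround to more than one box, here is a minimal sketch of the edit described above: appending mem=4G to the Xen hypervisor line (the GRUB-legacy `kernel` line that loads xen.gz), not to the dom0 kernel `module` line. The helper name `add_xen_option` and the sample menu.lst entry are hypothetical; the thread only shows the line as "kernel /xen.gz ... mem=4G", so the surrounding entry is an assumption about a typical GRUB-legacy Xen stanza.

```python
def add_xen_option(menu_lst: str, option: str = "mem=4G") -> str:
    """Append `option` to every GRUB-legacy `kernel` line that loads xen.gz.

    `module` lines (dom0 kernel, initrd) are left untouched, since the
    workaround has to go on the Xen command line.
    """
    result = []
    for line in menu_lst.splitlines():
        fields = line.split()
        is_xen_line = (
            len(fields) >= 2
            and fields[0] == "kernel"
            and fields[1].endswith("xen.gz")
        )
        # Skip lines that already carry the option so the edit is idempotent.
        if is_xen_line and option not in fields[2:]:
            line = f"{line.rstrip()} {option}"
        result.append(line)
    return "\n".join(result)


# Hypothetical entry for illustration only; paths and root device are assumptions.
SAMPLE = """\
title Xen
    root (hd0,0)
    kernel /xen.gz
    module /vmlinuz-xen root=/dev/sda2
    module /initrd-xen
"""

print(add_xen_option(SAMPLE))
```

Running it twice leaves the file unchanged, which makes it safer to use in a script that touches several machines. Back up menu.lst before writing anything back.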
https://bugzilla.novell.com/show_bug.cgi?id=482220
Jason Douglas
https://bugzilla.novell.com/show_bug.cgi?id=482220
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c16
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=482220
User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c17
--- Comment #17 from pgnet _
Patches committed.
and available as of,

    uname -ri
    2.6.27.21-2-xen x86_64

from,

    .../Kernel:/SL111_BRANCH/openSUSE_11.1/

removing 'mem=4G' from the xen grub config, all's well -- and the sata_sil24-attached drives are up & available.

thanks!
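A rough sketch of how one might check whether a given box is already on a kernel where the mem=4G workaround can be dropped, using the two release strings quoted in this thread. The names `parse_release`, `may_drop_workaround` and `FIRST_KNOWN_FIXED` are hypothetical, and the threshold is an assumption: the thread only confirms that 2.6.27.21-2-xen has the patches and that the 2.6.27.19 KOTD build does not, nothing about builds in between.

```python
# Assumption: treat 2.6.27.21 as the first fixed version, because that is the
# only build the thread reports as working without mem=4G.
FIRST_KNOWN_FIXED = (2, 6, 27, 21)


def parse_release(release: str):
    """Split a `uname -r` string into (version tuple, flavor)."""
    version, _, rest = release.partition("-")
    flavor = rest.rpartition("-")[2]  # text after the last '-', e.g. "xen"
    return tuple(int(part) for part in version.split(".")), flavor


def may_drop_workaround(release: str) -> bool:
    """True if this is a -xen kernel at or above the first known-fixed version."""
    version, flavor = parse_release(release)
    return flavor == "xen" and version >= FIRST_KNOWN_FIXED


for rel in (
    "2.6.27.19-SLE11_BRANCH_20090304073920_1eb029c9-xen",
    "2.6.27.21-2-xen",
):
    print(rel, may_drop_workaround(rel))
```

This only compares version numbers; it does not inspect whether the patches are actually present, so a test boot without mem=4G is still the real check.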