[Bug 482220] New: IRQ delivery problem (?) for sata_sil24 controller when boot @ kernel-xen
https://bugzilla.novell.com/show_bug.cgi?id=482220

Summary: IRQ delivery problem (?) for sata_sil24 controller when boot @ kernel-xen
Classification: openSUSE
Product: openSUSE 11.1
Version: Final
Platform: x86-64
OS/Version: openSUSE 11.1
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Kernel
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: pgnet.trash@gmail.com
QAContact: qa@suse.de
Found By: ---

Created an attachment (id=277214) --> (https://bugzilla.novell.com/attachment.cgi?id=277214)
console output for failed kernel-xen boot

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0

subsequent to a resolution of,

[Bug 463829] OS 11.0 fails drive mount via Sil 3124 sata card; https://bugzilla.novell.com/show_bug.cgi?id=463829

which fixed sata_sil24 operation for kernel-default (as of KOTD, kernel-default-2.6.27.19-SLE11_BRANCH_20090304073920_1eb029c9), booting the same system -- with external drives attached via a sata_sil24 card -- to kernel-xen (same KOTD build #) fails to boot, dropping me to a root login @ 'maintenance mode'.

per 'teheo at novell', as adding "iommu=usedac" makes no difference, opening here as a new/separate bug.

@console output from failed boot to kernel-xen here as attachment

Reproducible: Always

Steps to Reproduce:
1.
2.
3.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=482220

Marcus Meissner <meissner@novell.com> changed:

What       |Removed                                    |Added
----------------------------------------------------------------------------
AssignedTo |bnc-team-screening@forge.provo.novell.com  |kernel-maintainers@forge.provo.novell.com
https://bugzilla.novell.com/show_bug.cgi?id=482220

User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c1

--- Comment #1 from pgnet _ <pgnet.trash@gmail.com> 2009-03-09 10:19:58 MST ---
while reading up on 'irq delivery', xen, sil24, etc, i keep coming across the same few post(s). not at all clear to me if relevant to this issue, but thought i'd simply mention for reference,

http://lkml.org/lkml/2008/11/13/303
http://www.google.com/search?q=IRQ-from-evtchn
https://bugzilla.novell.com/show_bug.cgi?id=482220

pgnet _ <pgnet.trash@gmail.com> changed:

What       |Removed   |Added
----------------------------------------------------------------------------
Severity   |Critical  |Blocker
https://bugzilla.novell.com/show_bug.cgi?id=482220

User gregkh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c2

Greg Kroah-Hartman <gregkh@novell.com> changed:

What       |Removed   |Added
----------------------------------------------------------------------------
Severity   |Blocker   |Major

--- Comment #2 from Greg Kroah-Hartman <gregkh@novell.com> 2009-03-10 14:50:17 MST ---
It's hard to make something a "blocker" bug when the product has already shipped :)
https://bugzilla.novell.com/show_bug.cgi?id=482220

Greg Kroah-Hartman <gregkh@novell.com> changed:

What       |Removed                                    |Added
----------------------------------------------------------------------------
AssignedTo |kernel-maintainers@forge.provo.novell.com  |teheo@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=482220

User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c3

pgnet _ <pgnet.trash@gmail.com> changed:

What       |Removed   |Added
----------------------------------------------------------------------------
Severity   |Major     |Critical

--- Comment #3 from pgnet _ <pgnet.trash@gmail.com> 2009-03-10 15:16:34 MST ---
(In reply to comment #2)
It's hard to make something a "blocker" bug when the product has already shipped :)
fine, re: 11.1

<user perspective>
.. though, "blocker" does depend a bit on your perspective. these (rather long-standing, pre 11.1) issues with sil24 on xen _are_ standing squarely in the way of a client migration of several hundred boxes from Win to *Suse (~30% SLED, <10% SLES, rest OS 11.whatever-works, + service contracts). we can neither deploy nor develop for those deployments, and are a bit reticent to cut the legs out from under the "it's more cost effective" argument by suggesting they replace their currently fully-functional (under Win) hardware. it officially became a 'blocker' with in-hand contractual language that says _demonstrate_ it works _now_. it doesn't -- and, given current understanding of SLE* development sched -- any fixes won't be in a 1st release at this late date ...
</user perspective>

as the current @boot state is non-responsive, requiring direct intervention/recovery @ directly-connected console, how 'bout we @ least leave it at the original 'critical' ?
https://bugzilla.novell.com/show_bug.cgi?id=482220

Cyril Hrubis <chrubis@novell.com> changed:

What       |Removed                                    |Added
----------------------------------------------------------------------------
AssignedTo |bnc-team-screening@forge.provo.novell.com  |cgriffin@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=482220

Clyde Griffin <cgriffin@novell.com> changed:

What       |Removed              |Added
----------------------------------------------------------------------------
AssignedTo |cgriffin@novell.com  |jbeulich@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=482220

User cgriffin@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c5

Clyde Griffin <cgriffin@novell.com> changed:

What       |Removed   |Added
----------------------------------------------------------------------------
CC         |          |cgriffin@novell.com

--- Comment #5 from Clyde Griffin <cgriffin@novell.com> 2009-03-13 13:40:13 MST ---
Do you get the same behavior with SLES 11 code base?
https://bugzilla.novell.com/show_bug.cgi?id=482220

User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c6

--- Comment #6 from pgnet _ <pgnet.trash@gmail.com> 2009-03-13 13:51:40 MST ---
(In reply to comment #5)
Do you get the same behavior with SLES 11 code base?
we have not tried it on SLES 11 pre-release.

we've been seeing RAID related issues since openSUSE 11.0 (e.g., https://bugzilla.novell.com/show_bug.cgi?id=461673), but -- with prodding from @novell -- upgraded to openSUSE 11.1. since then, to be clear, there's been _no_ further testing on this specific issue with anything other than the openSUSE 11.1 code base.

'we' (someone here other than me ...) did speak with @novell some time ago, and were informed that no support -- bugs or otherwise -- would be provided on SLES pre-release. which, to me seems counterproductive ... :-/

that said, the current 'testing' kernel-xen & kernel-default in this ticket _are_ from the *SLE11_BRANCH* KOTD repo. but, again, everything else is 'just' fully-updated openSUSE 11.1.
https://bugzilla.novell.com/show_bug.cgi?id=482220

User cgriffin@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c7

--- Comment #7 from Clyde Griffin <cgriffin@novell.com> 2009-03-13 13:57:43 MST ---
Strange, we are always interested in fixing bugs during the pre-release cycle. Please give SLES 11 a try as soon as you get a chance. This will let us know if we are dealing with an already fixed problem or not, narrowing things down a bit.
https://bugzilla.novell.com/show_bug.cgi?id=482220

User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c8

--- Comment #8 from pgnet _ <pgnet.trash@gmail.com> 2009-03-13 15:22:41 MST ---
(In reply to comment #7)
Strange, we are always interested in fixing bugs during the pre-release cycle.
likely depends on which "we" was asked, sales or engineering. just my $.02 ...
Please give SLES 11 a try as soon as you get a chance. This will let us know if we are dealing with an already fixed problem or not, narrowing things down a bit.
understood & point taken. though, tbh, a tear-down/delay of work on openSUSE 11.1 RELEASE to test pre-release SLES 11, at this point, will be ... er ... 'frowned upon' here. given that significant fixes/updates won't be pushed anymore for OS 11.0, as it's maintenance, and that SLES 10x simply didn't cut it for our needs, mgmt's getting "version weary" quickly.

i'll see what i can do ... but we really need to demonstrate this internally (& @client) as fixed for openSUSE RELEASE 11.1. _and_, eventually, for SLES/SLED 11.

note: teheo@novell had suggested "IRQ Delivery" as the (possible) issue. in an offlist exchange w/ J. Fitzhardinge @ citrix, after an admittedly cursory look at this, he suggested that it does NOT look like IRQ ...
https://bugzilla.novell.com/show_bug.cgi?id=482220

User teheo@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c9

--- Comment #9 from Tejun Heo <teheo@novell.com> 2009-03-13 17:38:24 MST ---
Clyde, it's a device detection problem and pgnet is using SLE11 KOTD, does userland part make much difference (I really don't know)?

pgnet, what did Jeremy say? Unfortunately, I'm quite lost when it comes to paravirt. :-)
https://bugzilla.novell.com/show_bug.cgi?id=482220

User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c10

--- Comment #10 from pgnet _ <pgnet.trash@gmail.com> 2009-03-13 18:10:18 MST ---
(In reply to comment #9)
pgnet, what did Jeremy say? Unfortunately, I'm quite lost when it comes to paravirt. :-)
my ping @him was mainly to ask if his post was at all relevant to this bug. he briefly/politely said no, as _he_ is working on integrating citrix/xen into the upstream kernel (kvm? pvops?), and that "novell does their own work" on xen .. also, that in a _very_ cursory look at this bug, it did not look like IRQ issues to him. seemed clear to me that he didn't see it as his issue ... beyond that, alas, not much.
https://bugzilla.novell.com/show_bug.cgi?id=482220

User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c11

Jan Beulich <jbeulich@novell.com> changed:

What       |Removed   |Added
----------------------------------------------------------------------------
Status     |NEW       |ASSIGNED
Component  |Kernel    |Xen

--- Comment #11 from Jan Beulich <jbeulich@novell.com> 2009-03-16 05:46:07 MST ---
I'm pretty certain the issue here will be resolved as soon as the Xen kernel patch paralleling the one for bug 463829 gets committed (which it wasn't so far, i.e. also not in time for SLE11 GA). For the time being, applying the equivalent workaround from that bug (mem=4G, but this has to go on the Xen command line here) should get you going.
https://bugzilla.novell.com/show_bug.cgi?id=482220

User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c12

--- Comment #12 from pgnet _ <pgnet.trash@gmail.com> 2009-03-16 10:45:03 MST ---
On travel, but will try @ return in ~1 wk & report back asap

Thx!
https://bugzilla.novell.com/show_bug.cgi?id=482220

User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c13

--- Comment #13 from pgnet _ <pgnet.trash@gmail.com> 2009-03-16 10:52:10 MST ---
Jan, isn't the patch you refer to already _in_ the sle11_branch kotd kernel I'm currently using/trying?
https://bugzilla.novell.com/show_bug.cgi?id=482220

User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c14

--- Comment #14 from Jan Beulich <jbeulich@novell.com> 2009-03-16 11:03:26 MST ---
No, only the native kernels' patch is there.
https://bugzilla.novell.com/show_bug.cgi?id=482220

User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c15

--- Comment #15 from pgnet _ <pgnet.trash@gmail.com> 2009-03-22 08:40:19 MST ---
(In reply to comment #11)
For the time being, applying the equivalent workaround from that bug (mem=4G, but this has to go on the Xen command line here) should get you going.
that simple? ouch ...

adding to menu.lst,

kernel /xen.gz ... mem=4G

boots, without apparent error, to

uname -a
Linux server 2.6.27.19-SLE11_BRANCH_20090304073920_1eb029c9-xen #1 SMP 2009-03-04 08:39:20 +0100 x86_64 x86_64 x86_64 GNU/Linux

the externally-attached SATA drives are correctly seen under -xen, and an overnite disk 'stress-test' ( using >9GB data files, ala https://bugzilla.novell.com/show_bug.cgi?id=463829#c79 ) returns consistent results, and no errors.

it seems the workaround works. iiuc, what remains is for the solution to percolate into release-branch. happy to test/check when it does.

thanks!
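A minimal sketch of the placement described in comment #11, assuming a GRUB legacy menu.lst in which the `kernel` line loads the Xen hypervisor (xen.gz) and the `module` lines load the dom0 kernel and initrd. The sample entry, its paths and root device, and the helper name `xen_mem_options` are hypothetical and are not taken from the reporter's system; the helper only reports whether a `mem=` option sits on the hypervisor line rather than on a `module` line.

```python
# Hypothetical GRUB legacy entry: paths and root device are placeholders,
# not the reporter's actual configuration.
SAMPLE_MENU_LST = """\
title Xen (hypothetical entry)
    root (hd0,0)
    kernel /boot/xen.gz mem=4G
    module /boot/vmlinuz-xen root=/dev/sda2
    module /boot/initrd-xen
"""


def xen_mem_options(menu_lst_text):
    """Return the mem= options found on hypervisor (xen*) kernel lines."""
    found = []
    for line in menu_lst_text.splitlines():
        tokens = line.split()
        if len(tokens) < 2 or tokens[0] != "kernel":
            # 'module' lines carry the dom0 kernel and initrd; options
            # placed there are not part of the Xen command line.
            continue
        image_name = tokens[1].rsplit("/", 1)[-1]
        if image_name.startswith("xen"):
            found.extend(t for t in tokens[2:] if t.startswith("mem="))
    return found


if __name__ == "__main__":
    print(xen_mem_options(SAMPLE_MENU_LST))  # ['mem=4G']
```

An empty result for a Xen entry would suggest the option is missing or was appended to a `module` line, where it would be passed to the dom0 kernel instead of the hypervisor.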
https://bugzilla.novell.com/show_bug.cgi?id=482220

Jason Douglas <jdouglas@novell.com> changed:

What       |Removed     |Added
----------------------------------------------------------------------------
CC         |            |jdouglas@novell.com
QAContact  |qa@suse.de  |jdouglas@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=482220

User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c16

Jan Beulich <jbeulich@novell.com> changed:

What       |Removed   |Added
----------------------------------------------------------------------------
Status     |ASSIGNED  |RESOLVED
Resolution |          |FIXED

--- Comment #16 from Jan Beulich <jbeulich@novell.com> 2009-03-24 05:31:38 MST ---
Patches committed.
https://bugzilla.novell.com/show_bug.cgi?id=482220

User pgnet.trash@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=482220#c17

--- Comment #17 from pgnet _ <pgnet.trash@gmail.com> 2009-03-25 09:00:28 MST ---
(In reply to comment #16)
Patches committed.
and available as of,

uname -ri
2.6.27.21-2-xen x86_64

from,

.../Kernel:/SL111_BRANCH/openSUSE_11.1/

removing 'mem=4G' from the xen grub config, all's well -- and the sata_sil24-attached drives are up & available.

thanks!