[Bug 758166] New: [Kernel:HEAD:ACPI] Dell XPS 15z needs "nox2apic" to boot
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c0

           Summary: [Kernel:HEAD:ACPI] Dell XPS 15z needs "nox2apic" to boot
    Classification: openSUSE
           Product: openSUSE 12.1
           Version: Final
          Platform: x86-64
        OS/Version: Other
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: Kernel
        AssignedTo: trenn@novell.com
        ReportedBy: crrodriguez@opensuse.org
         QAContact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

Created an attachment (id=487069)
 --> (http://bugzilla.novell.com/attachment.cgi?id=487069)
This one was obtained with an external USB keyboard attached.

This is a (late) follow-up to this thread on the opensuse-kernel list:
http://lists.opensuse.org/opensuse-kernel/2011-12/msg00039.html

Finally, it turns out that the box only works when booted with nox2apic; otherwise one or more of the following happens:

- it freezes at boot, OR
- the keyboard/mouse controller stops working, and/or
- the Wifi card needs "waiting" (rmmod iwlwifi, then modprobe it again) before it works.

Attached are the dmesg outputs for when it works and when it does not. I hope this helps to track down the issue.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c1
--- Comment #1 from Cristian Rodríguez
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c2
Thomas Renninger
Switched APIC routing to cluster x2apic.
This patch could help:
  git show ea0dcf903e7d76aa5d483d876215fedcfdfe140f
  x86/apic: Use x2apic physical mode based on FADT setting

You can check before compiling or installing a kernel:
  mkdir /tmp/acpi
  cd /tmp/acpi
  acpidump >acpidump
  acpixtract -a acpidump
  iasl -d FACP.dat
  # Sometimes several FACPs (FACP is the FADT) are provided;
  # use the one with the newest revision in the header.
  # Also, on Factory the files are generated with
  # lower-case names.
  less FACP.dsl

Look out for:
  Use APIC Physical Destination Mode (V4) : 0
If it's 1, then the patch will likely help.

Afaik (did not check now, but did some time ago) there is no possibility to override the apic mode via boot param for testing. This is worth a patch (if some extra time is left...).
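The FADT check described in this comment can be scripted. The following is only a minimal sketch, assuming FACP.dsl was produced by "iasl -d" as in the steps above and that the flag lines look like the one quoted there; the function names fadt_flag and check_fadt_physical are hypothetical helpers, not part of acpidump or iasl.

```shell
#!/bin/sh
# Print the value (0 or 1) of one FADT flag from an iasl-disassembled table.
# $1: path to the .dsl file (e.g. FACP.dsl), $2: flag name as iasl prints it.
fadt_flag() {
    # Matches lines such as "Use APIC Physical Destination Mode (V4) : 0".
    sed -n "s/^.*$2 (V4) *: *\([01]\).*\$/\1/p" "$1" | head -n 1
}

# Hypothetical wrapper: say whether the FADT asks for physical destination mode.
check_fadt_physical() {
    value=$(fadt_flag "$1" 'Use APIC Physical Destination Mode')
    case "$value" in
        1) echo "FADT requests physical destination mode" ;;
        0) echo "FADT does not request physical destination mode" ;;
        *) echo "flag not found in $1" ;;
    esac
}
```

Usage, after the acpidump/acpixtract/iasl steps: check_fadt_physical /tmp/acpi/FACP.dsl (or facp.dsl where the files are generated with lower-case names).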
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c
Thomas Renninger
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c3
Cristian Rodríguez
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c4
--- Comment #4 from Cristian Rodríguez
> Afaik (did not check now, but did some time ago) there is no possibility to override the apic mode via boot param for testing.

The option is x2apic_phys:

  "x2apic_phys [X86-64,APIC] Use x2apic physical mode instead of default x2apic cluster mode on platforms supporting x2apic."

I already tested it before, and again in rc6, where the patch you mentioned is already present: same result...
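When testing options such as x2apic_phys or nox2apic, it helps to confirm that the running kernel was really booted with them. A minimal sketch, assuming a Linux system that exposes the boot command line in /proc/cmdline; has_boot_param is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if boot parameter $1 appears as a whole word on the kernel command line.
# $2 (optional) is the file to read; it defaults to /proc/cmdline.
has_boot_param() {
    # One parameter per line, then an exact fixed-string match of the whole line.
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example:
#   has_boot_param x2apic_phys && echo "booted with x2apic_phys"
#   has_boot_param nox2apic    && echo "booted with nox2apic"
```

The whole-word match matters here: a plain substring search for "x2apic" would also match "nox2apic" and "x2apic_phys".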
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c5
--- Comment #5 from Cristian Rodríguez
It says:
  Use APIC Cluster Model (V4) : 0
  Use APIC Physical Destination Mode (V4) : 0
..
Question: does the boot parameter "nox2apic" influence the output of acpidump?
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c6
--- Comment #6 from Thomas Renninger
> does the boot parameter "nox2apic" influence the output of acpidump?

No.

I guess all that can be done then is to blacklist the machine so that it does not use x2apic. Can you attach the dmidecode output, please?

It would be interesting to know whether x2apic is enabled on Windows at all, but I have no idea how to check that.

If possible, you could provide boot messages with the
  apic=debug debug
boot parameters added, for reference. Maybe there is something suspicious, but this all points to bad HW/BIOS (looking for a BIOS update regularly is a good idea).
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c7
--- Comment #7 from Cristian Rodríguez
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c8
--- Comment #8 from Cristian Rodríguez
> > does the boot parameter "nox2apic" influence the output of acpidump?
> No.
> I guess all that can be done then is to blacklist the machine to not use x2apic. Can you attach dmidecode, please.
> Would be interesting whether x2apic is enabled on Windows at all, but I have no idea how to check that.
> If possible you could provide boot messages with: apic=debug debug boot parameters added for reference. Maybe there is something suspicious, but this all points to bad HW/BIOS (looking for a BIOS update regularly is a good idea).

I will try to take a look at what Windows is doing in that respect.
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c9
--- Comment #9 from Thomas Renninger
> Also toggling this kernel option OFF: config IRQ_REMAP ... works around the problem...

Have you compiled it out, or have you just used the boot param intremap=off?

That's weird, or rather a good finding: booting with nox2apic also enables interrupt remapping:
  Enabled IRQ remapping in xapic mode
(compare with the logs/dmesg in your second attachment).

The logs there have some nice debug output. If it's somehow possible to get a failing boot with these
  "apic=debug acpi.debug_layer=0x400000 i8042.debug debug"
as well, we would have something we can compare good vs bad in detail.

This could get a very nasty one... I will try to get some time for it, but as there is a workaround (nox2apic) and my queue is rather full right now, it may take some time for one answer or another.
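For the good-vs-bad comparison mentioned in this comment, the per-line timestamps in dmesg output get in the way of a plain diff. A minimal sketch, assuming both logs were saved to files and use the usual "[    0.123456] " timestamp prefix; strip_dmesg_ts and diff_dmesg are hypothetical helper names.

```shell
#!/bin/sh
# Strip the leading "[   12.345678] " timestamp that dmesg puts on each line.
strip_dmesg_ts() {
    sed 's/^\[ *[0-9][0-9]*\.[0-9][0-9]*] //' "$1"
}

# Show lines that differ between a good and a bad boot log, ignoring timestamps.
# $1: dmesg from the working boot, $2: dmesg from the failing boot.
# Returns the exit status of diff (0: identical, 1: differences found).
diff_dmesg() {
    good=$(mktemp) && bad=$(mktemp) || return 1
    strip_dmesg_ts "$1" > "$good"
    strip_dmesg_ts "$2" > "$bad"
    diff -u "$good" "$bad"
    status=$?
    rm -f "$good" "$bad"
    return "$status"
}
```

Usage: diff_dmesg dmesg-nox2apic.txt dmesg-failing.txt (both file names are placeholders). Lines that merely moved in time no longer show up; lines that appear in only one boot do.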
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c11
--- Comment #11 from Cristian Rodríguez
> > Also toggling this kernel option OFF: config IRQ_REMAP ... works around the problem...
> Have you compiled it out or have you just used the boot param intremap=off ?

Both. Booting with intremap=off says "Enabled IRQ remapping failed" (or something similar; I do not remember the exact message). When compiled without IRQ_REMAP, it says nothing about IRQ remapping.

> That's weird or say a good finding: Booting with nox2apic also enables interrupt remapping: Enabled IRQ remapping in xapic mode (cmp. with your logs/dmesg in your snd attachment).

Yes, it goes into "legacy xapic" mode (according to the somewhat convoluted Intel docs).

> The logs there have some nice debug output. If it's somehow possible to get a failing boot with these "apic=debug acpi.debug_layer=0x400000 i8042.debug debug"

I already tried that, and even built the kernel to panic on soft/hard lockups, oopses, with lockdep debugging, etc. ACPI either does not say a word and "hangs" without a message, OR the boot continues with a non-functional keyboard/touchpad. Once I plug in an external keyboard I can use the system, but without wifi @_@ .. as the Intel ucode for the iwlwifi driver fails to load... (that is extremely weird, because if I remove the driver, wait a few minutes and load it again, it works...)

> as well, we would have something we can compare good vs bad in detail.

> This could get a very nasty one.., I try to get some time for it, but as there is a workaround (nox2apic) and my queue is rather filled right now, it may take some time for the one or other answer.

No hurry. I have not yet seen any negative effect of the workaround, and I am constantly checking for BIOS updates, as I'm more and more convinced this is a BIOS bug.
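To see at a glance which APIC mode a given boot ended up in, the relevant lines can be pulled out of a saved dmesg. A minimal sketch, assuming the log contains messages like the two quoted in this thread ("Switched APIC routing to cluster x2apic." and "Enabled IRQ remapping in xapic mode"); the exact wording may differ between kernel versions, and apic_mode_lines is a hypothetical helper name.

```shell
#!/bin/sh
# Print the log lines that mention x2apic/xapic routing or IRQ remapping.
# Arguments: saved dmesg file(s); with no argument, read standard input.
apic_mode_lines() {
    grep -iE 'x2apic|xapic|IRQ remapping' "$@"
}

# Example:
#   dmesg | apic_mode_lines
#   apic_mode_lines dmesg-nox2apic.txt
```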
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c12
--- Comment #12 from Cristian Rodríguez
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c13
--- Comment #13 from Cristian Rodríguez
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c14
--- Comment #14 from Cristian Rodríguez
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c15
--- Comment #15 from Thomas Renninger
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c16
--- Comment #16 from Cristian Rodríguez
Wow, interesting.
> Not sure, but it could have to do with these dmesg lines:
> drm: registered panic notifier
> [Firmware Bug]: ACPI(PEGP) defines _DOD but not _DOS

This persists.

> APIC error on CPU0: 00(80) : Illegal register address
> APIC error on CPU0: 80(80) : Illegal register address

These messages have gone away in recent kernels.

> acpi device:2e: registered as cooling_device4
> video: probe of LNXVIDEO:00 failed with error -5
> acpi device:32: registered as cooling_device4

> Iirc you already put acpidump somewhere, can you attach it to this bug as well, please.

OK, will upload it later.

> probe the LNXVIDEO:01 and whether the machine is stable then (and the APIC errors do not show up) and backlight still works, etc.

I am using just one of the video adapters, the Intel one, and the backlight works just fine.

> I could imagine there is an access to an APIC register which is not allowed in X2APIC mode

Yeah, I think we are near the root cause here.. ;) However, it is *very* hairy to sort out and get consistent behaviour. I would love to be able to make the kernel panic when things go wrong, but I just end up with early hangs at random points ;(
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c
Ihno Krumreich
https://bugzilla.novell.com/show_bug.cgi?id=758166
https://bugzilla.novell.com/show_bug.cgi?id=758166#c17
Cristian Rodríguez
http://bugzilla.novell.com/show_bug.cgi?id=758166
--- Comment #18 from Cristian Rodríguez
http://bugzilla.novell.com/show_bug.cgi?id=758166
Jiri Slaby
http://bugzilla.novell.com/show_bug.cgi?id=758166
--- Comment #20 from Cristian Rodríguez
> So can we close given 3.18 is now in stable and is going to factory?

Yeah, I am running Linus' tree on the machine now and it works OK (SUSE kernel HEAD is also fine).