http://bugzilla.novell.com/show_bug.cgi?id=599045
http://bugzilla.novell.com/show_bug.cgi?id=599045#c0
Summary: kernel-ppc64 boot on PowerMac G5 crashes during boot
Classification: openSUSE
Product: openSUSE 11.3
Version: Factory
Platform: Other
OS/Version: Other
Status: NEW
Severity: Major
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: meissner@novell.com
QAContact: qa@suse.de
Found By: Development
Blocker: ---
Created an attachment (id=356351) --> (http://bugzilla.novell.com/attachment.cgi?id=356351) oops.jpg
crashes in ext3_statfs during boot, see attached Oops screenshot.
http://bugzilla.novell.com/show_bug.cgi?id=599045#c1
--- Comment #1 from Marcus Meissner meissner@novell.com 2010-04-22 21:16:59 UTC ---
the place of the code is:
c0000000002a66b0: f8 1b 00 28 std r0,40(r27)
c0000000002a66b4: 48 0d de 2d bl c0000000003844e0 <.__percpu_counter_sum>
c0000000002a66b8: 60 00 00 00 nop
c0000000002a66bc: 7c 69 18 f8 not r9,r3
The funny part is that the "nop" appears to be the crashing instruction, which probably means that __percpu_counter_sum does something weird.
http://bugzilla.novell.com/show_bug.cgi?id=599045#c2
--- Comment #2 from Marcus Meissner meissner@novell.com 2010-04-23 09:26:44 UTC ---
Created an attachment (id=356426) --> (http://bugzilla.novell.com/attachment.cgi?id=356426) firstoops.jpg
first oops, crashes in "hid_output_field".
http://bugzilla.novell.com/show_bug.cgi?id=599045#c3
--- Comment #3 from Jeff Mahoney jeffm@novell.com 2010-04-23 13:24:58 UTC ---
Created an attachment (id=356496) --> (http://bugzilla.novell.com/attachment.cgi?id=356496) Second oops, scaled and rotated
http://bugzilla.novell.com/show_bug.cgi?id=599045#c4
Jeff Mahoney jeffm@novell.com changed:
What      |Removed                     |Added
----------------------------------------------------------------------------
Priority  |P5 - None                   |P3 - Medium
CC        |                            |jeffm@novell.com
AssignedTo|kernel-maintainers@forge.pr |jkosina@novell.com
          |ovo.novell.com              |
--- Comment #4 from Jeff Mahoney jeffm@novell.com 2010-04-23 13:27:10 UTC ---
The oops is in hid_output_field+0xec. Sending to Jiri.
http://bugzilla.novell.com/show_bug.cgi?id=599045#c5
--- Comment #5 from Marcus Meissner meissner@novell.com 2010-04-23 13:54:02 UTC ---
NIP is c00000000057db9c: 7d 76 04 28 .long 0x7d760428
My feeling is that this is the faulting instruction, even if the instruction dump shows something else.
It is very curious that this word is not getting disassembled here.
It seems to be in drivers/hid/hid-core.c::implement(), at the line
x = get_unaligned_le64(report);
and signal 4 is SIGILL.
So I guess the compiler is emitting an instruction my PPC970 cannot handle. :(
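For reference, the word at NIP can be decoded by hand. The sketch below splits a 32-bit PowerPC X-form instruction word into its fields; the struct and function names are made up for illustration and do not come from the kernel or binutils. For 0x7d760428 it yields primary opcode 31 with extended opcode 532, which the Power ISA assigns to ldbrx (Load Doubleword Byte-Reverse Indexed), here with RT=11, RA=22, RB=0. A disassembler that does not recognise that opcode for the selected CPU prints the word as .long, which would explain the output above.

```c
#include <stdint.h>

/* Fields of a PowerPC X-form instruction word (hypothetical helper type).
 * Bit numbers follow the Power ISA convention, where bit 0 is the MSB. */
struct xform_fields {
    unsigned primary; /* bits 0-5:   primary opcode  */
    unsigned rt;      /* bits 6-10:  target register */
    unsigned ra;      /* bits 11-15: base register   */
    unsigned rb;      /* bits 16-20: index register  */
    unsigned xo;      /* bits 21-30: extended opcode */
};

/* Split a 32-bit instruction word into its X-form fields. */
static struct xform_fields decode_xform(uint32_t insn)
{
    struct xform_fields f;

    f.primary = (insn >> 26) & 0x3f;
    f.rt      = (insn >> 21) & 0x1f;
    f.ra      = (insn >> 16) & 0x1f;
    f.rb      = (insn >> 11) & 0x1f;
    f.xo      = (insn >> 1)  & 0x3ff;
    return f;
}

/* Returns 1 if the word encodes ldbrx (primary opcode 31, extended opcode 532). */
static int is_ldbrx(uint32_t insn)
{
    struct xform_fields f = decode_xform(insn);

    return f.primary == 31 && f.xo == 532;
}
```

Calling is_ldbrx(0x7d760428u) on the word from the oops returns 1, while the nop word 0x60000000 from comment #1 does not match.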
http://bugzilla.novell.com/show_bug.cgi?id=599045#c6
--- Comment #6 from Marcus Meissner meissner@novell.com 2010-04-23 19:05:57 UTC ---
filed http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43871
http://bugzilla.novell.com/show_bug.cgi?id=599045#c7
--- Comment #7 from Marcus Meissner meissner@novell.com 2010-04-23 19:13:46 UTC ---
If we can disable CONFIG_TUNE_CELL=y in the config/ppc/ppc64 file, that should probably be sufficient.
http://bugzilla.novell.com/show_bug.cgi?id=599045#c8
Marcus Meissner meissner@novell.com changed:
What      |Removed            |Added
----------------------------------------------------------------------------
AssignedTo|jkosina@novell.com |jeffm@novell.com
--- Comment #8 from Marcus Meissner meissner@novell.com 2010-04-26 08:45:50 UTC ---
Jeff,
I changed the setting to CONFIG_TUNE_CELL=n in config/ppc/ppc64, then built and booted the kernel, and it is working now.
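In kernel .config style files, a disabled boolean option is normally recorded as a comment line rather than as "=n". Assuming config/ppc/ppc64 follows that convention, the change would look roughly like this (a sketch of the fragment, not a copy of the actual commit):

```
# before
CONFIG_TUNE_CELL=y

# after
# CONFIG_TUNE_CELL is not set
```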
http://bugzilla.novell.com/show_bug.cgi?id=599045#c9
Jeff Mahoney jeffm@novell.com changed:
What      |Removed |Added
----------------------------------------------------------------------------
Status    |NEW     |ASSIGNED
--- Comment #9 from Jeff Mahoney jeffm@novell.com 2010-04-26 13:37:38 UTC ---
With gcc 4.5?
That's strange. The help for that entry says:
Cause the compiler to optimize for the PPE of the Cell Broadband Engine. This will make the code run considerably faster on Cell but somewhat slower on other machines. This option only changes the scheduling of instructions, not the selection of instructions itself, so the resulting kernel will keep running on all other machines. When building a kernel that is supposed to run only on Cell, you should also select the POWER4_ONLY option.
I'll disable it anyway.
http://bugzilla.novell.com/show_bug.cgi?id=599045#c10
--- Comment #10 from Marcus Meissner meissner@novell.com 2010-04-26 13:42:28 UTC ---
gcc is buggy here, as -mcpu=power4 -mtune=cell currently emits non-power4 instructions (in this case specifically a 64-bit endian-converting load instruction).
The gcc bug is open and acknowledged, but not yet fixed. It's a 4.5 issue, yes.
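To make the "endian-converting load" concrete: on big-endian ppc64, get_unaligned_le64() has to return the eight bytes at the pointer interpreted in little-endian order. The following is a portable sketch of that semantics, not the kernel's implementation; load_le64 is a hypothetical name. A compiler may turn such a byte-reversing load into a single ldbrx where the target CPU supports it, and that is the instruction selection that goes wrong here when -mtune=cell is combined with -mcpu=power4.

```c
#include <stdint.h>

/* Read eight bytes starting at p as a little-endian 64-bit value,
 * independent of host byte order and alignment (illustrative helper). */
static uint64_t load_le64(const unsigned char *p)
{
    uint64_t v = 0;
    int i;

    /* Start with the most significant byte (p[7]) and shift it up
     * as the lower bytes are merged in. */
    for (i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}
```

Because it only uses byte loads, shifts and ORs, this form does not depend on any byte-reverse instruction being available on the CPU.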
http://bugzilla.novell.com/show_bug.cgi?id=599045#c11
Jeff Mahoney jeffm@novell.com changed:
What      |Removed          |Added
----------------------------------------------------------------------------
AssignedTo|jeffm@novell.com |rguenther@novell.com
--- Comment #11 from Jeff Mahoney jeffm@novell.com 2010-04-26 13:46:09 UTC ---
Ok, I'll bounce this to Richard then. We're working around it in the kernel config.
http://bugzilla.novell.com/show_bug.cgi?id=599045#c12
Richard Guenther rguenther@novell.com changed:
What      |Removed  |Added
----------------------------------------------------------------------------
Status    |ASSIGNED |RESOLVED
Resolution|         |UPSTREAM
--- Comment #12 from Richard Guenther rguenther@novell.com 2010-05-19 11:34:58 UTC ---
Upstream tracks this.
http://bugzilla.novell.com/show_bug.cgi?id=599045#c13
--- Comment #13 from Bernhard Wiedemann bwiedemann@suse.com ---
This is an autogenerated message for OBS integration:
This bug (599045) was mentioned in
https://build.opensuse.org/request/show/39554 Factory / kernel-source