Hello everyone,

(I'm posting this here in addition to suse-sles-e@suse.com, as the sles-e mailing list seems a bit deserted.)

I have experienced a few kernel oopses and I'm wondering where to start looking for a solution. Maybe someone can give me some pointers:

I have three identical (HW/SW) servers running SLES9-x86_64 SP2 and Oracle RAC. On two of these I have had kernel oopses recently. The machines are HP ProLiant DL585 G1 with 12 GB RAM. For more info see below.

Oopses on dezulcamrc01:

The first oops occurred (non-fatal) at 16:00:06; the second oops occurred at 18:00:04 and probably hung the machine, so I only have the final console output (nothing in the syslog after that moment):

Final console output:
----------------------------------------------------------------------
dezulcamrc01:~ #
Message from syslogd@dezulcamrc01 at Mon Dec 25 16:00:08 2006 ...
dezulcamrc01 kernel: Oops: 0000 [1] SMP
Message from syslogd@dezulcamrc01 at Mon Dec 25 16:00:08 2006 ...
dezulcamrc01 kernel: CR2: 0000003f80496680
Message from syslogd@dezulcamrc01 at Mon Dec 25 18:00:04 2006 ...
dezulcamrc01 kernel: Oops: 0000 [2] SMP
----------------------------------------------------------------------

The first oops:
----------------------------------------------------------------------
Dec 25 16:00:06 dezulcamrc01 kernel: Badness in kobject_get at lib/kobject.c:457
Dec 25 16:00:06 dezulcamrc01 kernel:
Dec 25 16:00:06 dezulcamrc01 kernel: Call Trace:<ffffffff8022fab6>{kobject_get+54} <ffffffff80195ae5>{do_open+581}
Dec 25 16:00:06 dezulcamrc01 kernel: <ffffffff80195d3f>{blkdev_open+47} <ffffffff80189bb6>{dentry_open_it+262}
Dec 25 16:00:06 dezulcamrc01 kernel: <ffffffff80189d91>{filp_open+113} <ffffffff80189e3f>{sys_open+159}
Dec 25 16:00:06 dezulcamrc01 kernel: <ffffffff801107d4>{system_call+124}
Dec 25 16:00:08 dezulcamrc01 kernel: Badness in kobject_get at lib/kobject.c:457
Dec 25 16:00:08 dezulcamrc01 kernel:
Dec 25 16:00:08 dezulcamrc01 kernel: Call Trace:<ffffffff8022fab6>{kobject_get+54} <ffffffff80195ae5>{do_open+581}
Dec 25 16:00:08 dezulcamrc01 kernel: <ffffffff80195d3f>{blkdev_open+47} <ffffffff80189bb6>{dentry_open_it+262}
Dec 25 16:00:08 dezulcamrc01 kernel: <ffffffff80189d91>{filp_open+113} <ffffffff80189e3f>{sys_open+159}
Dec 25 16:00:08 dezulcamrc01 kernel: <ffffffff801107d4>{system_call+124}
Dec 25 16:00:08 dezulcamrc01 kernel: Unable to handle kernel paging request at 0000003f80496680 RIP:
Dec 25 16:00:08 dezulcamrc01 kernel: <ffffffff8016d67d>{kfree+77}
Dec 25 16:00:08 dezulcamrc01 kernel: PML4 26ffae067 PGD 0
Dec 25 16:00:08 dezulcamrc01 kernel: Oops: 0000 [1] SMP
Dec 25 16:00:08 dezulcamrc01 kernel: CPU 1
Dec 25 16:00:08 dezulcamrc01 kernel: Pid: 13041, comm: pvdisplay Tainted: P U (2.6.5-7.201-smp SLES9_SP2_BRANCH-200508250620450000)
Dec 25 16:00:08 dezulcamrc01 kernel: RIP: 0010:[<ffffffff8016d67d>] <ffffffff8016d67d>{kfree+77}
Dec 25 16:00:08 dezulcamrc01 kernel: RSP: 0018:00000102309a5e68 EFLAGS: 00010016
Dec 25 16:00:08 dezulcamrc01 kernel: RAX: 0000003fffffc000 RBX: 00000102fc4e9710 RCX: 000000000000001a
Dec 25 16:00:08 dezulcamrc01 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000001000
Dec 25 16:00:08 dezulcamrc01 kernel: RBP: 0000000000001000 R08: 0000000000000000 R09: 0000000000000000
Dec 25 16:00:08 dezulcamrc01 kernel: R10: 0000000000000006 R11: 0000000000000000 R12: 00000102fc4e9740
Dec 25 16:00:08 dezulcamrc01 kernel: R13: 00000102fc4e9740 R14: 0000000000000000 R15: 0000000000000000
Dec 25 16:00:08 dezulcamrc01 kernel: FS: 0000002a95a994c0(0000) GS:ffffffff80562f00(0000) knlGS:00000000556c2800
Dec 25 16:00:08 dezulcamrc01 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 25 16:00:08 dezulcamrc01 kernel: CR2: 0000003f80496680 CR3: 0000000006449000 CR4: 00000000000006e0
Dec 25 16:00:08 dezulcamrc01 kernel: Process pvdisplay (pid: 13041, threadinfo 00000102309a4000, task 0000010103ae2b00)
Dec 25 16:00:08 dezulcamrc01 kernel: Stack: 0000000000000206 00000102fc4e9710 00000102fc4e9740 ffffffff8022f9ea
Dec 25 16:00:08 dezulcamrc01 kernel: 000001004adf4138 000001004adf4080 ffffffffa0029480 000001004adf4098
Dec 25 16:00:08 dezulcamrc01 kernel: 00000102ffcdd600 ffffffff801954bb
Dec 25 16:00:08 dezulcamrc01 kernel: Call Trace:<ffffffff8022f9ea>{kobject_cleanup+74} <ffffffff801954bb>{blkdev_put+299}
Dec 25 16:00:08 dezulcamrc01 kernel: <ffffffff8018d9e2>{__fput+98} <ffffffff8018970e>{filp_close+126}
Dec 25 16:00:08 dezulcamrc01 kernel: <ffffffff80189815>{sys_close+229} <ffffffff801107d4>{system_call+124}
Dec 25 16:00:08 dezulcamrc01 kernel:
Dec 25 16:00:08 dezulcamrc01 kernel:
Dec 25 16:00:08 dezulcamrc01 kernel: Code: 48 0f b6 80 80 a6 49 80 48 8b 0c c5 80 a7 49 80 48 b8 ff ff
Dec 25 16:00:08 dezulcamrc01 kernel: RIP <ffffffff8016d67d>{kfree+77} RSP <00000102309a5e68>
Dec 25 16:00:08 dezulcamrc01 kernel: CR2: 0000003f80496680
----------------------------------------------------------------------

Some more details:

# uname -a
Linux dezulcamrc01 2.6.5-7.201-smp #1 SMP Thu Aug 25 06:20:45 UTC 2005 x86_64 x86_64 x86_64 GNU/Linux

# more /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 37
model name : AMD Opteron (tm) Processor 852
stepping : 1
cpu MHz : 2399.968
cache size : 1024 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 4718.59
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor : 1
vendor_id : AuthenticAMD
cpu family : 15
model : 37
model name : AMD Opteron (tm) Processor 852
stepping : 1
cpu MHz : 2399.968
cache size : 1024 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow pni
bogomips : 3538.94
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

Oopses on dezulcamrc02:

I had some more oopses (52 altogether) on another one of those 3 machines two weeks ago (not necessarily related) which looked like this:

Dec 11 18:00:07 dezulcamrc02 kernel: Unable to handle kernel paging request at 00000000000152c0 RIP:
Dec 11 18:00:07 dezulcamrc02 kernel: <ffffffff80174264>{blk_queue_bounce+20}
Dec 11 18:00:07 dezulcamrc02 kernel: PML4 dd28b067 PGD 16558067 PMD 0
Dec 11 18:00:07 dezulcamrc02 kernel: Oops: 0000 [1] SMP
Dec 11 18:00:07 dezulcamrc02 kernel: CPU 0
Dec 11 18:00:07 dezulcamrc02 kernel: Pid: 21177, comm: oracle Tainted: P U (2.6.5-7.201-smp SLES9_SP2_BRANCH-200508250620450000)
Dec 11 18:00:07 dezulcamrc02 kernel: RIP: 0010:[<ffffffff80174264>] <ffffffff80174264>{blk_queue_bounce+20}
Dec 11 18:00:07 dezulcamrc02 kernel: RSP: 0018:000001000cc819d8 EFLAGS: 00010216
Dec 11 18:00:07 dezulcamrc02 kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Dec 11 18:00:07 dezulcamrc02 kernel: RDX: 00000102fbbf0780 RSI: 000001000cc81a38 RDI: 0000000000015000
Dec 11 18:00:07 dezulcamrc02 kernel: RBP: 0000000000015000 R08: 000001017d4d1070 R09: 000001017d4d12c0
Dec 11 18:00:07 dezulcamrc02 kernel: R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
Dec 11 18:00:07 dezulcamrc02 kernel: R13: 0000000000015000 R14: 0000000000000008 R15: 000001000cc81a38
Dec 11 18:00:07 dezulcamrc02 kernel: FS: 0000002a977ef020(0000) GS:ffffffff80562e80(0000) knlGS:00000000576d7bb0
Dec 11 18:00:07 dezulcamrc02 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 11 18:00:07 dezulcamrc02 kernel: CR2: 00000000000152c0 CR3: 0000000000101000 CR4: 00000000000006e0
Dec 11 18:00:07 dezulcamrc02 kernel: Process oracle (pid: 21177, threadinfo 000001000cc80000, task 000001017d2214d0)
Dec 11 18:00:07 dezulcamrc02 kernel: Stack: 000001017d4d1008 ffffff00000c8008 0000000000000000 0000000000000000
Dec 11 18:00:07 dezulcamrc02 kernel: 0000000000015000 0000000000000000 0000010027516540 0000000000000008
Dec 11 18:00:07 dezulcamrc02 kernel: 0000000000000400 ffffffff8028603a
Dec 11 18:00:07 dezulcamrc02 kernel: Call Trace:<ffffffff8028603a>{__make_request+74} <ffffffffa01db714>{:emcp:PowerPlatformBottomDispatch+180}
Dec 11 18:00:07 dezulcamrc02 kernel: <ffffffff8013ceb0>{autoremove_wake_function+0} <ffffffffa01dd754>{:emcp:PowerTopDispatch+612}
Dec 11 18:00:07 dezulcamrc02 kernel: <ffffffffa01dd92f>{:emcp:emcp_pseudo_mrf+79} <ffffffff80284aba>{generic_make_request+394}
Dec 11 18:00:07 dezulcamrc02 kernel: <ffffffff80192cad>{__bio_add_page+157} <ffffffff80284be0>{submit_bio+272}
Dec 11 18:00:07 dezulcamrc02 kernel: <ffffffff801b2ebf>{dio_bio_add_page+31} <ffffffff801b322b>{dio_bio_submit+107}
Dec 11 18:00:07 dezulcamrc02 kernel: <ffffffff801b43e0>{__blockdev_direct_IO+2736} <ffffffff80195045>{blkdev_direct_IO+69}
Dec 11 18:00:07 dezulcamrc02 kernel: <ffffffff80194c20>{blkdev_get_blocks+0} <ffffffff80165d6a>{generic_file_direct_IO+154}
Dec 11 18:00:07 dezulcamrc02 kernel: <ffffffff80165fc4>{__generic_file_aio_read+228} <ffffffff8016624b>{generic_file_read+187}
Dec 11 18:00:07 dezulcamrc02 kernel: <ffffffffa04c7151>{:raw:raw_open+209} <ffffffff801969b2>{chrdev_open+418}
Dec 11 18:00:07 dezulcamrc02 kernel: <ffffffff8016cb90>{file_ra_state_init+32} <ffffffff8013ceb0>{autoremove_wake_function+0}
Dec 11 18:00:07 dezulcamrc02 kernel: <ffffffff8018d234>{vfs_read+244} <ffffffff8018d38c>{sys_pread64+236}
Dec 11 18:00:07 dezulcamrc02 kernel: <ffffffff801107d4>{system_call+124}
Dec 11 18:00:07 dezulcamrc02 kernel:
Dec 11 18:00:07 dezulcamrc02 kernel: Code: f6 87 c0 02 00 00 01 75 23 48 8b 05 dc 19 3c 00 48 39 87 b8
Dec 11 18:00:07 dezulcamrc02 kernel: RIP <ffffffff80174264>{blk_queue_bounce+20} RSP <000001000cc819d8>
Dec 11 18:00:07 dezulcamrc02 kernel: CR2: 00000000000152c0
Dec 11 18:00:07 dezulcamrc02 kernel: <1>Unable to handle kernel NULL pointer dereference at 0000000000000474 RIP:
Dec 11 18:00:07 dezulcamrc02 kernel: <ffffffff80287f90>{show_partition+112}
Dec 11 18:00:07 dezulcamrc02 kernel: PML4 2f23e8067 PGD 2f2fa2067 PMD 0
Dec 11 18:00:07 dezulcamrc02 kernel: Oops: 0000 [2] SMP
Dec 11 18:00:07 dezulcamrc02 kernel: CPU 1
Dec 11 18:00:07 dezulcamrc02 kernel: Pid: 20276, comm: mlragent Tainted: P U (2.6.5-7.201-smp SLES9_SP2_BRANCH-200508250620450000)
Dec 11 18:00:07 dezulcamrc02 kernel: RIP: 0010:[<ffffffff80287f90>] <ffffffff80287f90>{show_partition+112}
Dec 11 18:00:07 dezulcamrc02 kernel: RSP: 0018:000001025878de28 EFLAGS: 00010287
Dec 11 18:00:07 dezulcamrc02 kernel: RAX: 00000000000004ec RBX: 00000100dd5cf900 RCX: 00000000000004ec
Dec 11 18:00:07 dezulcamrc02 kernel: RDX: 0000000000000424 RSI: 0000000000000424 RDI: 00000100dd5cf900
Dec 11 18:00:07 dezulcamrc02 kernel: RBP: 0000000000000424 R08: 00000000ffffffff R09: 0000000000000006
Dec 11 18:00:07 dezulcamrc02 kernel: R10: 00000000ffffffff R11: 0000000000000000 R12: 0000000000000000
Dec 11 18:00:07 dezulcamrc02 kernel: R13: 00000100dd5cf900 R14: 00000000000003fc R15: 00000100dd5cf928
Dec 11 18:00:07 dezulcamrc02 kernel: FS: 0000002a977ef020(0000) GS:ffffffff80562f00(005b) knlGS:000000005bfa8bb0
Dec 11 18:00:07 dezulcamrc02 kernel: CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
Dec 11 18:00:07 dezulcamrc02 kernel: CR2: 0000000000000474 CR3: 0000000006449000 CR4: 00000000000006e0
Dec 11 18:00:07 dezulcamrc02 kernel: Process mlragent (pid: 20276, threadinfo 000001025878c000, task 000001001f995640)
Dec 11 18:00:07 dezulcamrc02 kernel: Stack: 00000000327a6473 0000000000000212 0000000000000212 000001001f995640
Dec 11 18:00:07 dezulcamrc02 kernel: 00000100dd5cf900 0000000000000424 0000000000000000 000000000000016e
Dec 11 18:00:07 dezulcamrc02 kernel: 00000000000003fc ffffffff801aea83
Dec 11 18:00:07 dezulcamrc02 kernel: Call Trace:<ffffffff801aea83>{seq_read+451} <ffffffff8018d234>{vfs_read+244}
Dec 11 18:00:07 dezulcamrc02 kernel: <ffffffff8018d48d>{sys_read+157} <ffffffff80124fe1>{cstar_do_call+27}
Dec 11 18:00:07 dezulcamrc02 kernel:
Dec 11 18:00:07 dezulcamrc02 kernel:
Dec 11 18:00:07 dezulcamrc02 kernel: Code: 48 8b 55 50 48 85 d2 0f 84 c3 00 00 00 83 7d 08 01 75 0d 8b
Dec 11 18:00:07 dezulcamrc02 kernel: RIP <ffffffff80287f90>{show_partition+112} RSP <000001025878de28>
Dec 11 18:00:07 dezulcamrc02 kernel: CR2: 0000000000000474
...
... 49 more oopses removed
...
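With this many oopses it may help to tally them per faulting function before reading each one in full. Below is a minimal Python sketch, assuming the syslog format shown above, in which each oops ends with a `RIP <address>{symbol+offset} RSP <address>` line. The function name `tally_oopses` and the regular expression are my own illustration, not part of any existing tool, and the three sample lines are copied from the logs in this post.

```python
import re
from collections import Counter

# Matches the closing "RIP <addr>{symbol+offset} RSP <addr>" line of an oops.
# The earlier "RIP: 0010:[<addr>] ..." register line is deliberately not
# matched, so every oops is counted exactly once.
RIP_LINE = re.compile(r"kernel: RIP <[0-9a-f]+>\{([^}+]+)\+\d+\} RSP")


def tally_oopses(lines):
    """Count oopses per faulting function, keyed by the symbol in the RIP line."""
    counts = Counter()
    for line in lines:
        match = RIP_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three closing RIP lines taken from the dezulcamrc02 log in this post.
sample = [
    "Dec 11 18:00:07 dezulcamrc02 kernel: RIP <ffffffff80174264>{blk_queue_bounce+20} RSP <000001000cc819d8>",
    "Dec 11 18:00:07 dezulcamrc02 kernel: RIP <ffffffff80287f90>{show_partition+112} RSP <000001025878de28>",
    "Dec 11 18:40:05 dezulcamrc02 kernel: RIP <ffffffff80287f90>{show_partition+112} RSP <00000100e1dc7e28>",
]

for symbol, count in tally_oopses(sample).most_common():
    print(f"{count:4d}  {symbol}")
```

On a real machine the function could be fed an open log file instead of the sample list, for example `tally_oopses(open("/var/log/messages"))`, assuming that is where syslog writes kernel messages.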
Dec 11 18:40:05 dezulcamrc02 kernel: <1>Unable to handle kernel NULL pointer dereference at 0000000000000474 RIP:
Dec 11 18:40:05 dezulcamrc02 kernel: <ffffffff80287f90>{show_partition+112}
Dec 11 18:40:05 dezulcamrc02 kernel: PML4 3e51a067 PGD 518dd067 PMD 0
Dec 11 18:40:05 dezulcamrc02 kernel: Oops: 0000 [52] SMP
Dec 11 18:40:05 dezulcamrc02 kernel: CPU 0
Dec 11 18:40:05 dezulcamrc02 kernel: Pid: 32002, comm: grep Tainted: P U (2.6.5-7.201-smp SLES9_SP2_BRANCH-200508250620450000)
Dec 11 18:40:05 dezulcamrc02 kernel: RIP: 0010:[<ffffffff80287f90>] <ffffffff80287f90>{show_partition+112}
Dec 11 18:40:05 dezulcamrc02 kernel: RSP: 0018:00000100e1dc7e28 EFLAGS: 00010287
Dec 11 18:40:05 dezulcamrc02 kernel: RAX: 00000000000004ec RBX: 00000101a1f30280 RCX: 00000000000004ec
Dec 11 18:40:05 dezulcamrc02 kernel: RDX: 0000000000000424 RSI: 0000000000000424 RDI: 00000101a1f30280
Dec 11 18:40:05 dezulcamrc02 kernel: RBP: 0000000000000424 R08: 00000000ffffffff R09: 0000000000000006
Dec 11 18:40:05 dezulcamrc02 kernel: R10: 00000000ffffffff R11: 0000000000000000 R12: 0000000000000000
Dec 11 18:40:05 dezulcamrc02 kernel: R13: 00000101a1f30280 R14: 0000000000008000 R15: 00000101a1f302a8
Dec 11 18:40:05 dezulcamrc02 kernel: FS: 0000002a9588e700(0000) GS:ffffffff80562e80(0000) knlGS:0000000055ea1bb0
Dec 11 18:40:05 dezulcamrc02 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 11 18:40:05 dezulcamrc02 kernel: CR2: 0000000000000474 CR3: 0000000000101000 CR4: 00000000000006e0
Dec 11 18:40:05 dezulcamrc02 kernel: Process grep (pid: 32002, threadinfo 00000100e1dc6000, task 00000102f38253e0)
Dec 11 18:40:05 dezulcamrc02 kernel: Stack: 30630000327a6473 ffffff0031703164 0000000000000206 ffffffff801971da
Dec 11 18:40:05 dezulcamrc02 kernel: 00000101a1f30280 0000000000000424 0000000000000000 0000000000000572
Dec 11 18:40:05 dezulcamrc02 kernel: 0000000000008000 ffffffff801aea83
Dec 11 18:40:05 dezulcamrc02 kernel: Call Trace:<ffffffff801971da>{cp_new_stat+234} <ffffffff801aea83>{seq_read+451}
Dec 11 18:40:05 dezulcamrc02 kernel: <ffffffff8018d234>{vfs_read+244} <ffffffff8018d48d>{sys_read+157}
Dec 11 18:40:05 dezulcamrc02 kernel: <ffffffff801107d4>{system_call+124}
Dec 11 18:40:05 dezulcamrc02 kernel:
Dec 11 18:40:05 dezulcamrc02 kernel: Code: 48 8b 55 50 48 85 d2 0f 84 c3 00 00 00 83 7d 08 01 75 0d 8b
Dec 11 18:40:05 dezulcamrc02 kernel: RIP <ffffffff80287f90>{show_partition+112} RSP <00000100e1dc7e28>
Dec 11 18:40:05 dezulcamrc02 kernel: CR2: 0000000000000474

Afterwards I ran memtest86+ on this server (dezulcamrc02) for 5 days, but it reported no errors.

Any ideas?

Thanks,
Kai

--
Kai Groshert
Technischer Consultant / Technical Consultant
ITH2 Competence Center Unix
PIKS Porsche-Information-Kommunikation-Services GmbH