[Bug 573244] New: start of smartd causing kernel crash - clean installation not possible on fsc rx200
http://bugzilla.novell.com/show_bug.cgi?id=573244
http://bugzilla.novell.com/show_bug.cgi?id=573244#c0

Summary: start of smartd causing kernel crash - clean installation not possible on fsc rx200
Classification: openSUSE
Product: openSUSE 11.2
Version: Final
Platform: x86
OS/Version: openSUSE 11.2
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: devzero@web.de
QAContact: qa@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)

If I install openSUSE 11.2 on an older FSC RX200 "pizza box", the box freezes at a late installation stage with the attached kernel oops.

I tracked it down to smartd. If I force exclusion of the smartmontools package during installation, all is fine and the box runs well. If I add the smartmontools package later and run "/etc/init.d/smartd start", the box crashes immediately.

I will provide lspci output and other information later. Please request whatever is needed for further analysis. If the screenshot is too poor, I have a series of 4 detailed screenshots with better resolution, but I need to stitch them into a single image first.

Reproducible: Always

Steps to Reproduce:
1. Install openSUSE 11.2

--
Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
http://bugzilla.novell.com/show_bug.cgi?id=573244#c1
--- Comment #1 from roland kletzing <devzero@web.de> 2010-01-22 22:51:16 UTC ---
Created an attachment (id=338427) --> (http://bugzilla.novell.com/attachment.cgi?id=338427)
dmesg mobilephone screenshot
http://bugzilla.novell.com/show_bug.cgi?id=573244#c2

roland kletzing <devzero@web.de> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC | |devzero@web.de

--- Comment #2 from roland kletzing <devzero@web.de> 2010-01-23 00:37:10 UTC ---
After a reset I could get to the grub boot stage and choose normal or failsafe boot, but even in failsafe mode the box won't come up when smartd is installed (which it is by default).

I would recommend that failsafe mode at least turn off smartd, because smartd is known to be able to cause issues like these. It also should not be started during installation but at the first boot instead, so that you have a chance to repair the installation easily by booting into failsafe mode. I did not have that chance, which created hassle.

See https://bugzilla.novell.com/show_bug.cgi?id=201715
http://bugzilla.novell.com/show_bug.cgi?id=573244#c3
--- Comment #3 from roland kletzing <devzero@web.de> 2010-01-25 10:28:26 UTC ---
Hardware:
FSC RX200
Intel Xeon 3.06Ghz
1GB RAM
2x36GB SCSI Disks HW Raid1 + 1 Disk HotSpare

# lspci
00:00.0 Host bridge: Intel Corporation E7501 Memory Controller Hub (rev 01)
00:00.1 Class ff00: Intel Corporation E7500/E7501 Host RASUM Controller (rev 01)
00:04.0 PCI bridge: Intel Corporation E7500/E7501 Hub Interface D PCI-to-PCI Bridge (rev 01)
00:04.1 Class ff00: Intel Corporation E7500/E7501 Hub Interface D RASUM Controller (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801CA/CAM USB Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801CA/CAM USB Controller #2 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 42)
00:1f.0 ISA bridge: Intel Corporation 82801CA LPC Interface Controller (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801CA Ultra ATA Storage Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801CA/CAM SMBus Controller (rev 02)
01:09.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
02:1c.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04)
02:1d.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04)
02:1e.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04)
02:1f.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04)
03:07.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID (rev 01)
04:02.0 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
04:02.1 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)

# lspci -vvv
00:00.0 Host bridge: Intel Corporation E7501 Memory Controller Hub (rev 01)
    Subsystem: Giga-byte Technology Device 5000
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Capabilities: [40] Vendor Specific Information <?>
    Kernel driver in use: e7xxx_edac

00:00.1 Class ff00: Intel Corporation E7500/E7501 Host RASUM Controller (rev 01)
    Subsystem: Giga-byte Technology Device 5000
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

00:04.0 PCI bridge: Intel Corporation E7500/E7501 Hub Interface D PCI-to-PCI Bridge (rev 01) (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 64
    Bus: primary=00, secondary=02, subordinate=04, sec-latency=0
    I/O behind bridge: 0000b000-0000bfff
    Memory behind bridge: fe000000-feafffff
    Prefetchable memory behind bridge: fb300000-fbbfffff
    Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- <SERR- <PERR-
    BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-

00:04.1 Class ff00: Intel Corporation E7500/E7501 Hub Interface D RASUM Controller (rev 01)
    Subsystem: Giga-byte Technology Device 5000
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

00:1d.0 USB Controller: Intel Corporation 82801CA/CAM USB Controller #1 (rev 02) (prog-if 00 [UHCI])
    Subsystem: Giga-byte Technology Device 5001
    Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 16
    Region 4: I/O ports at dc00 [size=32]
    Kernel driver in use: uhci_hcd

00:1d.1 USB Controller: Intel Corporation 82801CA/CAM USB Controller #2 (rev 02) (prog-if 00 [UHCI])
    Subsystem: Giga-byte Technology Device 5001
    Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin B routed to IRQ 19
    Region 4: I/O ports at d800 [size=32]
    Kernel driver in use: uhci_hcd

00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 42) (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
    Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
    I/O behind bridge: 0000a000-0000afff
    Memory behind bridge: fbd00000-fdffffff
    Prefetchable memory behind bridge: fb200000-fb2fffff
    Secondary status: 66MHz- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- <SERR- <PERR-
    BridgeCtl: Parity+ SERR+ NoISA- VGA+ MAbort- >Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-

00:1f.0 ISA bridge: Intel Corporation 82801CA LPC Interface Controller (rev 02)
    Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0

00:1f.1 IDE interface: Intel Corporation 82801CA Ultra ATA Storage Controller (rev 02) (prog-if 8a [Master SecP PriP])
    Subsystem: Giga-byte Technology Device 5001
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 18
    Region 0: I/O ports at 01f0 [size=8]
    Region 1: I/O ports at 03f4 [size=1]
    Region 2: I/O ports at 0170 [size=8]
    Region 3: I/O ports at 0374 [size=1]
    Region 4: I/O ports at ff00 [size=16]
    Region 5: Memory at 40000000 (32-bit, non-prefetchable) [size=1K]
    Kernel driver in use: ata_piix

00:1f.3 SMBus: Intel Corporation 82801CA/CAM SMBus Controller (rev 02)
    Subsystem: Giga-byte Technology Device 5001
    Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Interrupt: pin B routed to IRQ 17
    Region 4: I/O ports at 0540 [size=32]
    Kernel driver in use: i801_smbus

01:09.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27) (prog-if 00 [VGA controller])
    Subsystem: Giga-byte Technology Device 5002
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 64 (2000ns min), Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 9
    Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
    Region 1: I/O ports at a800 [size=256]
    Region 2: Memory at fcf00000 (32-bit, non-prefetchable) [size=4K]
    Expansion ROM at fb200000 [disabled] [size=128K]
    Capabilities: [5c] Power Management version 2
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-

02:1c.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04) (prog-if 20 [IO(X)-APIC])
    Subsystem: Giga-byte Technology Device 5000
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Region 0: Memory at fea00000 (32-bit, non-prefetchable) [size=4K]
    Capabilities: [50] PCI-X non-bridge device
        Command: DPERE- ERO- RBC=512 OST=1
        Status: Dev=02:1c.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=512 DMOST=1 DMCRS=8 RSCEM- 266MHz- 533MHz-

02:1d.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04) (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 64, Cache Line Size: 64 bytes
    Bus: primary=02, secondary=04, subordinate=04, sec-latency=64
    I/O behind bridge: 0000b000-0000bfff
    Memory behind bridge: fe100000-fe5fffff
    Prefetchable memory behind bridge: 00000000fb600000-00000000fbafffff
    Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- <SERR- <PERR-
    BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort+ >Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    Capabilities: [50] PCI-X bridge device
        Secondary Status: 64bit+ 133MHz+ SCD- USC- SCO- SRD- Freq=100MHz
        Status: Dev=02:1d.0 64bit+ 133MHz+ SCD- USC- SCO- SRD-
        Upstream: Capacity=65535 CommitmentLimit=65535
        Downstream: Capacity=65535 CommitmentLimit=65535

02:1e.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04) (prog-if 20 [IO(X)-APIC])
    Subsystem: Giga-byte Technology Device 5000
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Region 0: Memory at fe900000 (32-bit, non-prefetchable) [size=4K]
    Capabilities: [50] PCI-X non-bridge device
        Command: DPERE- ERO- RBC=512 OST=1
        Status: Dev=02:1e.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=512 DMOST=1 DMCRS=8 RSCEM- 266MHz- 533MHz-

02:1f.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04) (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 64, Cache Line Size: 64 bytes
    Bus: primary=02, secondary=03, subordinate=03, sec-latency=48
    I/O behind bridge: 0000f000-00000fff
    Memory behind bridge: fe000000-fe0fffff
    Prefetchable memory behind bridge: 00000000fb300000-00000000fb5fffff
    Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
    BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort+ >Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    Capabilities: [50] PCI-X bridge device
        Secondary Status: 64bit+ 133MHz+ SCD- USC- SCO- SRD- Freq=conv
        Status: Dev=02:1f.0 64bit+ 133MHz+ SCD- USC- SCO- SRD-
        Upstream: Capacity=65535 CommitmentLimit=65535
        Downstream: Capacity=65535 CommitmentLimit=65535

03:07.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID (rev 01)
    Subsystem: LSI Logic / Symbios Logic MegaRAID 520 SCSI 320-1 Controller
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=slow >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 64, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 24
    Region 0: Memory at fb500000 (32-bit, prefetchable) [size=64K]
    Expansion ROM at fe0f0000 [disabled] [size=64K]
    Capabilities: [80] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Kernel driver in use: megaraid

04:02.0 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
    Subsystem: Giga-byte Technology Device 3000
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 64 (63750ns min), Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 48
    Region 0: Memory at fe500000 (64-bit, non-prefetchable) [size=128K]
    Region 4: I/O ports at bc00 [size=64]
    Capabilities: [dc] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [e4] PCI-X non-bridge device
        Command: DPERE- ERO+ RBC=512 OST=1
        Status: Dev=04:02.0 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=2048 DMOST=1 DMCRS=16 RSCEM- 266MHz- 533MHz-
    Capabilities: [f0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000 Data: 0000
    Kernel driver in use: e1000

04:02.1 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
    Subsystem: Giga-byte Technology Device 3000
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 64 (63750ns min), Cache Line Size: 64 bytes
    Interrupt: pin B routed to IRQ 49
    Region 0: Memory at fe400000 (64-bit, non-prefetchable) [size=128K]
    Region 4: I/O ports at b800 [size=64]
    Capabilities: [dc] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [e4] PCI-X non-bridge device
        Command: DPERE- ERO+ RBC=512 OST=1
        Status: Dev=04:02.1 64bit+ 133MHz+ SCD- USC- DC=simple DMMRBC=2048 DMOST=1 DMCRS=16 RSCEM- 266MHz- 533MHz-
    Capabilities: [f0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000 Data: 0000
    Kernel driver in use: e1000
http://bugzilla.novell.com/show_bug.cgi?id=573244#c4
--- Comment #4 from roland kletzing <devzero@web.de> 2010-01-25 11:05:52 UTC ---
I could also reproduce the problem with:

kernel-vanilla-2.6.31.12-0.0.0.9.ebebaea.i586.rpm (kotd for 11.2)
kernel-vanilla-2.6.32.5-0.0.3.f89b2ba.i586.rpm (kotd for HEAD)

So it appears to be an upstream issue and not Novell/SUSE specific!?
http://bugzilla.novell.com/show_bug.cgi?id=573244#c5
--- Comment #5 from roland kletzing <devzero@web.de> 2010-01-27 07:59:35 UTC ---
I have already contacted upstream myself.
http://bugzilla.novell.com/show_bug.cgi?id=573244#c6
--- Comment #6 from roland kletzing <devzero@web.de> 2010-01-27 14:10:26 UTC ---
Upstream discussion on the linux-scsi mailing list:
Running 'smartd -r ioctl,1' from the command line might help us find which command it didn't like. That will send information to the log. If the crash truncates the log then add '-d' to the above invocation to dump debug information to the console.
Thanks! Here we go:

smartd -r ioctl,1 -d
smartd 5.39 2009-08-08 r2872~ [i686-pc-linux-gnu] (openSUSE RPM)
Copyright (C) 2002-9 by Bruce Allen, http://smartmontools.sourceforge.net
Opened configuration file /etc/smartd.conf
Drive: DEVICESCAN, implied '-a' Directive on line 26 of file /etc/smartd.conf
Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
glob(3) found no matches for pattern /dev/hd[a-t]
 [inquiry: 12 00 00 00 24 00 ]
  scsi_status=0x0, host_status=0x0, driver_status=0x0
  info=0x0 duration=0 milliseconds resid=0
  status=0x0
Device: /dev/sda, opened
 [test unit ready: 00 00 00 00 00 00 ]
  scsi_status=0x0, host_status=0x0, driver_status=0x0
  info=0x0 duration=0 milliseconds resid=0
  status=0x0
 [mode sense(6): 1a 00 1c 00 40 00 ] Killed
[box froze here]
Information about the disks would be useful as well.
3x MAP 3367NC-V4 , 36GB - 2 Disks Raid1, 1 Disk Hotspare
Does the megaraid controller have the latest firmware from LSI?
It didn't (it had 1F30), but it has 1F56 now. (I updated to the latest available from \ http://support.ts.fujitsu.com/Download/ShowDescription.asp?SoftwareGUID=873E... \ 4EB1-B48E-D8E9C9929EA4&OSID=BD62ED52-DB04-417E-8A72-A3BA96295B00&Status=True&Component \ =MegaRAID%20SCSI%20U320-1 )

The problem still exists with the latest firmware.
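As a reading aid for the trace above: the last command logged before the freeze is a MODE SENSE(6). The sketch below decodes its six CDB bytes in POSIX shell. The helper name decode_mode_sense6 is made up for this note and is not part of smartmontools; the field layout is the standard SCSI MODE SENSE(6) command format, where page code 0x1c is the Informational Exceptions Control (IEC) mode page.

```shell
#!/bin/sh
# Decode a 6-byte MODE SENSE(6) CDB given as space-separated hex bytes.
# decode_mode_sense6 is a hypothetical helper name, not a smartmontools tool.
decode_mode_sense6() {
    # Split the single argument so that $1..$6 are the CDB bytes.
    set -- $1
    [ "$#" -eq 6 ] || { echo "expected 6 bytes" >&2; return 1; }
    opcode=$((0x$1))
    # 0x1a (decimal 26) is the MODE SENSE(6) operation code.
    [ "$opcode" -eq 26 ] || { echo "not a MODE SENSE(6) CDB" >&2; return 1; }
    dbd=$(( (0x$2 >> 3) & 1 ))   # byte 1, bit 3: disable block descriptors
    pc=$(( (0x$3 >> 6) & 3 ))    # byte 2, bits 7-6: page control
    page=$(( 0x$3 & 0x3f ))      # byte 2, bits 5-0: page code
    subpage=$((0x$4))            # byte 3: subpage code
    alloc=$((0x$5))              # byte 4: allocation length in bytes
    printf 'opcode=0x%02x dbd=%d pc=%d page=0x%02x subpage=0x%02x alloc_len=%d\n' \
        "$opcode" "$dbd" "$pc" "$page" "$subpage" "$alloc"
}
```

Called as decode_mode_sense6 "1a 00 1c 00 40 00", it reports page 0x1c with an allocation length of 64 bytes, i.e. smartd was asking for the IEC mode page when the box froze.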
http://bugzilla.novell.com/show_bug.cgi?id=573244#c7

Hannes Reinecke <hare@novell.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC | |hare@novell.com
AssignedTo |kernel-maintainers@forge.provo.novell.com |hare@novell.com

--- Comment #7 from Hannes Reinecke <hare@novell.com> 2010-02-01 09:10:55 UTC ---
I'll check whether we can reproduce it here.
http://bugzilla.novell.com/show_bug.cgi?id=573244#c10

Hannes Reinecke <hare@novell.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status |NEW |NEEDINFO
Info Provider | |devzero@web.de

--- Comment #10 from Hannes Reinecke <hare@novell.com> 2010-02-03 08:55:23 UTC ---
Sorry, but the RX200 we have here is not using the Megaraid controller.

Would it be possible to capture the entire dump? Seeing that it's a FSC machine I bet IPMI is working, so it should be possible to capture the kernel messages via IPMI Serial-over-LAN.
http://bugzilla.novell.com/show_bug.cgi?id=573244#c11

roland kletzing <devzero@web.de> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status |NEEDINFO |NEW
Info Provider |devzero@web.de |

--- Comment #11 from roland kletzing <devzero@web.de> 2010-02-03 11:31:03 UTC ---
I found some SOL and BMC configuration options in the BIOS and set an IP address for the BMC. I have never used that before; I can neither ping the BMC nor do I really know how to proceed. I have spent half an hour on this now and could use a helping hand.

What do I need to do (or check) so that I can capture the boot messages? I imagine it goes like this:

1. Configure the BMC in the BIOS to have a dedicated IP address (-> done)
2. Configure SOL in the BIOS so that serial port1 or port2 maps to a network port (-> done)
3. Plug a network cable into onboard nic1/nic2 (tried both). Or would I need a special network port/BMC add-on card to be plugged in? There are just the 2 onboard NICs.
4. Check connectivity to the BMC (-> how? ping to the BMC IP does NOT work)
5. Connect to the BMC/SOL (-> how?)
http://bugzilla.novell.com/show_bug.cgi?id=573244#c12

Daniel Rahn <daniel.rahn@novell.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status |NEW |NEEDINFO
Info Provider | |devzero@web.de

--- Comment #12 from Daniel Rahn <daniel.rahn@novell.com> 2010-02-03 11:37:34 UTC ---
The problem with giving you setup instructions for your machine right now is that I am not 100 percent sure which model you are really using. Can you boot an older openSUSE release on the system, run dmidecode and provide that output?

Thanks.
http://bugzilla.novell.com/show_bug.cgi?id=573244#c13
--- Comment #13 from roland kletzing <devzero@web.de> 2010-02-03 12:07:31 UTC ---
I actually CAN boot 11.2, as I disabled the smartmontools package at installation time, so I can send dmidecode output from this 11.2 system. OK?

I will try to get a null-modem cable for capturing the kernel output.
http://bugzilla.novell.com/show_bug.cgi?id=573244#c14
--- Comment #14 from Daniel Rahn <daniel.rahn@novell.com> 2010-02-03 12:09:24 UTC ---
Yes, dmidecode output from 11.2 is fine.
http://bugzilla.novell.com/show_bug.cgi?id=573244#c15
--- Comment #15 from Hannes Reinecke <hare@novell.com> 2010-02-03 13:39:30 UTC ---
For the serial-over-LAN stuff:
From a running system you should check if IPMI is configured correctly:
    /etc/init.d/ipmi start
    ipmitool user list

Pick one user by ID and set the password for it:

    ipmitool user set password <ID> <password>

Check which channel is the network channel:

    ipmitool channel info <n>

where <n> is the channel number. I haven't found any way to list all possible channels, so you'll have to step through the channels by hand. One channel should give you an output with Medium type '802.3 LAN'; that's the one you're looking for.

Then you can check the LAN settings with

    ipmitool lan print <n>

which prints the current network configuration settings, most notably the IP address.

Now you can use another machine and contact the IPMI from there:

    ipmitool -H <ipaddress> -U <username> -P <password> power status

If you get a message like

    Chassis Power is on

then IPMI is configured properly. Then you should be able to do something like

    ipmitool -H <ipaddress> -U <username> -P <password> -I lanplus sol activate

and you should be seeing the output of the serial-over-LAN connection.

Famous last words.
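Stepping through the channels by hand can be scripted. The following is only a sketch: find_lan_channel is a made-up helper name, the range 1 to 11 is an assumption about which channel numbers are worth probing, and the match relies on the '802.3 LAN' medium type string mentioned above.

```shell
#!/bin/sh
# Probe IPMI channels one by one and print the first whose medium type is
# "802.3 LAN". find_lan_channel is a hypothetical helper name; the 1..11
# range is an assumption and may need adjusting for a particular BMC.
find_lan_channel() {
    for n in 1 2 3 4 5 6 7 8 9 10 11; do
        # Channels that do not exist make ipmitool fail; skip them quietly.
        if ipmitool channel info "$n" 2>/dev/null | grep -q '802\.3 LAN'; then
            echo "$n"
            return 0
        fi
    done
    echo "no LAN channel found" >&2
    return 1
}
```

With the function defined, something like ch=$(find_lan_channel) && ipmitool lan print "$ch" would show the LAN settings of the channel it found.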
http://bugzilla.novell.com/show_bug.cgi?id=573244#c16
--- Comment #16 from roland kletzing <devzero@web.de> 2010-02-03 17:35:41 UTC ---
Created an attachment (id=340523) --> (http://bugzilla.novell.com/attachment.cgi?id=340523)
dmidecode output - ascii & binary
http://bugzilla.novell.com/show_bug.cgi?id=573244#c17

roland kletzing <devzero@web.de> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status |NEEDINFO |NEW
Info Provider |devzero@web.de |

--- Comment #17 from roland kletzing <devzero@web.de> 2010-02-03 17:36:41 UTC ---
Thanks for the recipe. Unfortunately I cannot try it before Monday, when I'm back here.

I added the requested dmidecode output.
http://bugzilla.novell.com/show_bug.cgi?id=573244#c18
--- Comment #18 from Daniel Rahn <daniel.rahn@novell.com> 2010-02-03 17:42:45 UTC ---
Thanks for the information. So it's a real first-generation RX200 with the original D1570 board. I will check whether I can find some information regarding the BMC. By the way, it should respond to ping if set up correctly.
http://bugzilla.novell.com/show_bug.cgi?id=573244#c19
--- Comment #19 from roland kletzing <devzero@web.de> 2010-02-03 19:07:14 UTC ---
One more piece of information: I tried booting the grml live CD today, which has kernel 2.6.31 and smartmontools 5.39, and that did NOT crash. Starting smartd printed something like "please add megaraid option as this is a dell/megaraid box".

To my surprise, I see that in my post smartd already reports version 5.39, but I could bet I have the 5.38 package (which came with 11.2), and Factory also has 5.38. I will double-check on Monday.
http://bugzilla.novell.com/show_bug.cgi?id=573244#c20
--- Comment #20 from roland kletzing <devzero@web.de> 2010-02-09 11:28:09 UTC ---
I can't get IPMI working (no ping), and I have no more time to fiddle around with it. Your description is good and I think I understand the concept and the handling, but something seems to go wrong here.... :(

Anyway, regarding the version numbers: apparently the smartmontools 5.38 package contains smartd/smartctl at version 5.39.

rpm -q -a |grep smart
smartmontools-5.38.0.20090808-3.20.i586

ftp:~ # smartd -V
smartd 5.39 2009-08-08 r2872~ [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-9 by Bruce Allen, http://smartmontools.sourceforge.net

smartd comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under the terms of the GNU General Public License Version 2. See http://www.gnu.org for further details.

smartmontools release 5.39 dated 2008/03/10 at 10:44:07 GMT
smartmontools SVN rev 2872~ dated 2009-08-08 at 19:54:28
smartmontools build host: i686-pc-linux-gnu
smartmontools build configured: 2010/01/16 09:43:15 UTC
smartd compile dated Jan 16 2010 at 09:44:00
smartmontools configure arguments: '--host=i686-pc-linux-gnu' '--build=i686-pc-linux-gnu' '--target=i586-suse-linux' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib' '--libexecdir=/usr/lib' '--localstatedir=/var' '--sharedstatedir=/usr/com' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--with-docdir=/usr/share/doc/packages/smartmontools' '--with-selinux' '--

ftp:~ # smartctl -V
smartctl 5.39 2009-08-08 r2872~ [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-9 by Bruce Allen, http://smartmontools.sourceforge.net

smartctl comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under the terms of the GNU General Public License Version 2. See http://www.gnu.org for further details.

smartmontools release 5.39 dated 2008/03/10 at 10:44:07 GMT
smartmontools SVN rev 2872~ dated 2009-08-08 at 19:54:28
smartmontools build host: i686-pc-linux-gnu
smartmontools build configured: 2010/01/16 09:43:15 UTC
smartctl compile dated Jan 16 2010 at 09:44:00
smartmontools configure arguments: '--host=i686-pc-linux-gnu' '--build=i686-pc-linux-gnu' '--target=i586-suse-linux' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib' '--libexecdir=/usr/lib' '--localstatedir=/var' '--sharedstatedir=/usr/com' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--with-docdir=/usr/share/doc/packages/smartmontools' '--with-selinux' '--

I cannot keep this box out of production for much longer......
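The package-versus-binary comparison above can be repeated in one step. This is a rough sketch: smartd_version and report_smart_versions are made-up helper names, and the parsing assumes the first line of smartd -V has the "smartd <version> <date> ..." shape shown in this comment.

```shell
#!/bin/sh
# Print the version that the smartd binary reports about itself.
# Assumes the first line of `smartd -V` looks like
# "smartd 5.39 2009-08-08 r2872~ [...]", as in the output quoted above.
smartd_version() {
    smartd -V | sed -n '1s/^smartd \([0-9][0-9.]*\) .*/\1/p'
}

# Show the RPM package version next to the binary's own version.
# report_smart_versions is a hypothetical helper name.
report_smart_versions() {
    pkg=$(rpm -q --qf '%{VERSION}' smartmontools)
    bin=$(smartd_version)
    echo "package=$pkg binary=$bin"
}
```

On the box described here this would be expected to show the 5.38.0.20090808 package next to a 5.39 binary. The 20090808 suffix matches the SVN date that smartd prints, which suggests (but does not prove) that the package is a snapshot taken on the way to 5.39.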
http://bugzilla.novell.com/show_bug.cgi?id=573244#c21
--- Comment #21 from roland kletzing <devzero@web.de> 2010-02-23 18:49:36 UTC ---
For reference: http://marc.info/?l=linux-kernel&m=126686985804305&w=2
http://bugzilla.novell.com/show_bug.cgi?id=573244#c22

Hannes Reinecke <hare@novell.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status |NEW |ASSIGNED

--- Comment #22 from Hannes Reinecke <hare@novell.com> 2010-03-30 13:44:28 UTC ---
There actually is a fix for megaraid_mbox wrt MODE SENSE handling; cf bug#475619. And it's not included in openSUSE 11.2 :-(
http://bugzilla.novell.com/show_bug.cgi?id=573244 http://bugzilla.novell.com/show_bug.cgi?id=573244#c23 --- Comment #23 from Hannes Reinecke <hare@novell.com> 2010-03-30 13:48:05 UTC --- Created an attachment (id=351428) --> (http://bugzilla.novell.com/attachment.cgi?id=351428) megaraid-mbox-fix-SG_IO Fix kernel oops on SG_IO with megaraid_mbox driver.
http://bugzilla.novell.com/show_bug.cgi?id=573244 http://bugzilla.novell.com/show_bug.cgi?id=573244#c24 Hannes Reinecke <hare@novell.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |NEEDINFO InfoProvider| |devzero@web.de --- Comment #24 from Hannes Reinecke <hare@novell.com> 2010-03-30 13:49:13 UTC --- Please test with the above patch. I have already committed it to our kernel repository, so any of the new KOTD builds at ftp://ftp.suse.com/pub/projects/kernel/kotd will have it.
http://bugzilla.novell.com/show_bug.cgi?id=573244 http://bugzilla.novell.com/show_bug.cgi?id=573244#c25 --- Comment #25 from roland kletzing <devzero@web.de> 2010-04-07 14:53:15 UTC --- thanks. you're lucky that i could test it, as this server is already in production, but i had a timeslot for maintenance today and dared to install a second kernel, risking trashing the system. don't tell my chief, shhhh ;) anyway, thanks for the patch. it seems to fix the issue to some degree, but it's not the perfect solution, as the behaviour looks racy now, i.e. i could crash the system reliably before - now it only crashes sporadically on smartd start. so the fix seems to point in the right direction. this is what smartd -d MAY respond with the patch applied:
smartd 5.39 2009-08-08 r2872~ [i686-pc-linux-gnu] (openSUSE RPM)
Copyright (C) 2002-9 by Bruce Allen, http://smartmontools.sourceforge.net
Opened configuration file /etc/smartd.conf
Drive: DEVICESCAN, implied '-a' Directive on line 26 of file /etc/smartd.conf
Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
glob(3) found no matches for pattern /dev/hd[a-t]
Device: /dev/sda, opened
Device: /dev/sda, Bad IEC (SMART) mode page, err=5, skip device
Unable to register SCSI device /dev/sda at line 26 of file /etc/smartd.conf
Unable to monitor any SMART enabled devices. Try debug (-d) option. Exiting...
running several "smartd -d" one after the other still crashes the system. running smartd -d in a loop reliably crashes the system, mostly within 10 or 20 execs of smartd. regards Roland
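For anyone wanting to repeat the "smartd -d in a loop" test from comment #25, here is a minimal sketch of such a loop. It is not the script the reporter used; the helper name run_repeatedly and the 60-second timeout are assumptions made for illustration. The progress line is written before each run so that, if the box hangs, the last line on the console shows which iteration triggered it.

```python
import subprocess
import sys


def run_repeatedly(cmd, attempts, timeout=60):
    """Run cmd `attempts` times and return the list of exit codes.

    A run that exceeds `timeout` seconds is killed and recorded as None.
    Hypothetical helper for illustration; not part of smartmontools.
    """
    results = []
    for i in range(1, attempts + 1):
        # Announce the run first: on a crash this is the last line seen.
        print(f"run {i}/{attempts}: {' '.join(cmd)}", file=sys.stderr, flush=True)
        try:
            completed = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout
            )
            results.append(completed.returncode)
        except subprocess.TimeoutExpired:
            results.append(None)
    return results
```

On a test machine only, and as root, this would be called as run_repeatedly(["smartd", "-d"], 20). On an affected kernel that call is expected to crash the system, so it should never be run on a production box.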
http://bugzilla.novell.com/show_bug.cgi?id=573244 http://bugzilla.novell.com/show_bug.cgi?id=573244#c26 roland kletzing <devzero@web.de> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEEDINFO |ASSIGNED InfoProvider|devzero@web.de | --- Comment #26 from roland kletzing <devzero@web.de> 2010-04-07 14:54:22 UTC --- forgot to remove needinfo.
http://bugzilla.novell.com/show_bug.cgi?id=573244 http://bugzilla.novell.com/show_bug.cgi?id=573244#c27 --- Comment #27 from Hannes Reinecke <hare@novell.com> 2010-05-25 13:35:01 UTC --- Finally found the most likely reason here; smartd is sending out invalid commands (cf bug#606693). Sadly my C++ knowledge isn't deep enough to figure out what's going wrong there. You're welcome to try, though.
http://bugzilla.novell.com/show_bug.cgi?id=573244 http://bugzilla.novell.com/show_bug.cgi?id=573244#c28 --- Comment #28 from roland kletzing <devzero@web.de> 2010-05-25 13:56:51 UTC --- thanks - but 2 problems: 1st: the system is out of my hands now. maybe it will return in some months and i can have a look.... 2nd: i'm getting "You are not authorized to access bug #606693". do you have more details or can you post the related information here?
http://bugzilla.novell.com/show_bug.cgi?id=573244 http://bugzilla.novell.com/show_bug.cgi?id=573244#c29 --- Comment #29 from Hannes Reinecke <hare@novell.com> 2010-05-26 06:50:09 UTC --- You should be authorized now; I've found you on the CC list there. But basically the problem is that smartd is sending down ATA-passthrough commands if it detects an ATA drive. Which isn't a problem per se as (surprise, surprise) all ATA drives understand ATA commands. However, smartd fails to encapsulate the ATA command in a proper SCSI passthrough command, sending down raw ATA commands. Which will be interpreted as SCSI commands causing random garbage.
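To illustrate what "encapsulating the ATA command in a proper SCSI passthrough command" from comment #29 means, here is a minimal sketch that builds a 16-byte ATA PASS-THROUGH (16) CDB around an ATA SMART READ DATA command. The byte layout follows my reading of the SCSI/ATA Translation (SAT) standard; the function name is hypothetical and this is not smartd's actual code. The sketch only builds the bytes and sends nothing to a device.

```python
# Illustrative sketch only; not taken from smartmontools.
ATA_PASS_THROUGH_16 = 0x85    # SAT opcode: ATA PASS-THROUGH (16)
PROTOCOL_PIO_DATA_IN = 4      # SAT protocol field value for PIO data-in

ATA_CMD_SMART = 0xB0          # ATA SMART command
SMART_READ_DATA = 0xD0        # feature register: SMART READ DATA
SMART_LBA_MID = 0x4F          # signature values required by SMART commands
SMART_LBA_HIGH = 0xC2


def wrap_smart_read_data() -> bytes:
    """Return a SCSI CDB that carries ATA SMART READ DATA (hypothetical helper)."""
    cdb = bytearray(16)
    cdb[0] = ATA_PASS_THROUGH_16
    cdb[1] = PROTOCOL_PIO_DATA_IN << 1      # protocol in bits 4:1, EXTEND bit = 0
    # T_DIR=1 (device to host), BYT_BLOK=1 (count in blocks),
    # T_LENGTH=2 (transfer length is in the sector count field)
    cdb[2] = (1 << 3) | (1 << 2) | 0x2
    cdb[4] = SMART_READ_DATA                # features (7:0)
    cdb[6] = 1                              # sector count (7:0): one 512-byte block
    cdb[10] = SMART_LBA_MID                 # LBA mid (7:0)
    cdb[12] = SMART_LBA_HIGH                # LBA high (7:0)
    cdb[14] = ATA_CMD_SMART                 # ATA command register
    return bytes(cdb)
```

The point of the wrapper is byte 0: the SCSI layer and the HBA driver see opcode 0x85 and know the rest is an ATA register set. Without that wrapper, the first byte of the raw ATA data would itself be read as a SCSI opcode, which matches the "random garbage" described above.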
https://bugzilla.novell.com/show_bug.cgi?id=573244 https://bugzilla.novell.com/show_bug.cgi?id=573244#c30 Swamp Workflow Management <swamp@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status Whiteboard| |maint:released:11.2:35929 --- Comment #30 from Swamp Workflow Management <swamp@suse.com> 2010-09-23 13:11:09 UTC --- Update released for: kernel-debug, kernel-debug-base, kernel-debug-base-debuginfo, kernel-debug-debuginfo, kernel-debug-debugsource, kernel-debug-devel, kernel-debug-devel-debuginfo, kernel-default, kernel-default-base, kernel-default-base-debuginfo, kernel-default-debuginfo, kernel-default-debugsource, kernel-default-devel, kernel-default-devel-debuginfo, kernel-desktop, kernel-desktop-base, kernel-desktop-base-debuginfo, kernel-desktop-debuginfo, kernel-desktop-debugsource, kernel-desktop-devel, kernel-desktop-devel-debuginfo, kernel-pae, kernel-pae-base, kernel-pae-base-debuginfo, kernel-pae-debuginfo, kernel-pae-debugsource, kernel-pae-devel, kernel-pae-devel-debuginfo, kernel-source, kernel-source-vanilla, kernel-syms, kernel-trace, kernel-trace-base, kernel-trace-base-debuginfo, kernel-trace-debuginfo, kernel-trace-debugsource, kernel-trace-devel, kernel-trace-devel-debuginfo, kernel-vanilla, kernel-vanilla-base, kernel-vanilla-base-debuginfo, kernel-vanilla-debuginfo, kernel-vanilla-debugsource, kernel-vanilla-devel, kernel-vanilla-devel-debuginfo, kernel-xen, kernel-xen-base, kernel-xen-base-debuginfo, kernel-xen-debuginfo, kernel-xen-debugsource, kernel-xen-devel, kernel-xen-devel-debuginfo, preload-kmp-default, preload-kmp-desktop Products: openSUSE 11.2 (debug, i586, x86_64)
https://bugzilla.novell.com/show_bug.cgi?id=573244 https://bugzilla.novell.com/show_bug.cgi?id=573244#c31 Hannes Reinecke <hare@novell.com> changed: What |Removed |Added ---------------------------------------------------------------------------- AssignedTo|hare@novell.com |sbrabec@novell.com --- Comment #31 from Hannes Reinecke <hare@novell.com> 2010-09-30 11:25:30 UTC --- Reassigning to maintainer of smartd.
https://bugzilla.novell.com/show_bug.cgi?id=573244 https://bugzilla.novell.com/show_bug.cgi?id=573244#c32 Stanislav Brabec <sbrabec@novell.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |NEEDINFO InfoProvider| |devzero@web.de --- Comment #32 from Stanislav Brabec <sbrabec@novell.com> 2010-10-05 14:36:55 CEST --- I am trying to identify whether this problem is already fixed upstream. Could you test the latest smartmontools package from http://ftp.suse.com/pub/people/sbrabec/smartmontools (the packages are built for SLES11, but that should be no problem)? Does it fix these problems (no crash with a kernel that is sensitive to this bug, or no more error messages on fixed kernels)?
https://bugzilla.novell.com/show_bug.cgi?id=573244 https://bugzilla.novell.com/show_bug.cgi?id=573244#c33 Stanislav Brabec <sbrabec@novell.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEEDINFO |RESOLVED InfoProvider|devzero@web.de | Resolution| |NORESPONSE --- Comment #33 from Stanislav Brabec <sbrabec@novell.com> 2011-07-11 16:34:15 CEST --- No response in 8 months. I cannot do anything with this bug without the help of the reporter. If you could provide tests on openSUSE 11.4 with the latest smartmontools, please reopen and provide test results.