[Bug 640368] New: nagios check_ide_smart does not work
https://bugzilla.novell.com/show_bug.cgi?id=640368
https://bugzilla.novell.com/show_bug.cgi?id=640368#c0

           Summary: nagios check_ide_smart does not work
    Classification: openSUSE
           Product: openSUSE 11.3
           Version: Final
          Platform: Other
        OS/Version: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Network
        AssignedTo: muhamed.memovic@novell.com
        ReportedBy: msvec@novell.com
         QAContact: qa@suse.de
                CC: lrupp@novell.com, schneemann@b1-systems.de
          Found By: ---
           Blocker: ---

On some disks the check_ide_smart plugin does not work:

bash# /usr/lib/nagios/plugins/check_ide_smart /dev/sda
CRITICAL - SMART_ENABLE: Invalid argument
CRITICAL - SMART_CMD_ENABLE

even though the disks support SMART and can report their status:

bash# smartctl --all /dev/sda
smartctl 5.39.1 2010-01-28 r3054 [i686-pc-linux-gnu] (openSUSE RPM)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

Device: IBM-PSG ST336704LW !# Version: B232
Serial number: 3CD1J523000071291FAP
Device type: disk
Local Time is: Sun Sep 19 01:02:57 2010 CEST
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK

Current Drive Temperature:     62 C
Drive Trip Temperature:        65 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
  Blocks sent to initiator = 4140202538
  Blocks received from initiator = 3532815560
  Blocks read from cache and sent to initiator = 644869656
  Number of read and write commands whose size <= segment size = 668203733
  Number of read and write commands whose size > segment size = 1720511
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 59360.15
  number of minutes until next internal SMART test = 52

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:      84329        0         0     84329      84329       6377.347           0
write:         0        0         0         0          0       6409.092           0
verify:      476        0         0       476        476        110.624           0

Non-medium error count:       50

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -       4                 - [-   -    -]
# 2  Background short  Completed                   -       4                 - [-   -    -]
# 3  Background short  Completed                   -       4                 - [-   -    -]
# 4  Background short  Completed                   -       4                 - [-   -    -]
# 5  Background short  Completed                   -       4                 - [-   -    -]
# 6  Background short  Completed                   -       4                 - [-   -    -]

Long (extended) Self Test duration: 1350 seconds [22.5 minutes]

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

https://bugzilla.novell.com/show_bug.cgi?id=640368
https://bugzilla.novell.com/show_bug.cgi?id=640368#c

Michal Svec <msvec@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |msvec@novell.com
         AssignedTo|mem@novell.com              |lrupp@suse.com

https://bugzilla.novell.com/show_bug.cgi?id=640368
https://bugzilla.novell.com/show_bug.cgi?id=640368#c1

Lars Vogdt <lrupp@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
       InfoProvider|                            |msvec@novell.com

--- Comment #1 from Lars Vogdt <lrupp@suse.com> 2011-07-22 12:47:38 CEST ---
Can you provide some hosts for testing?

https://bugzilla.novell.com/show_bug.cgi?id=640368
https://bugzilla.novell.com/show_bug.cgi?id=640368#c3

--- Comment #3 from Michal Svec <msvec@novell.com> 2011-07-22 13:19:23 CEST ---
BTW this is what strace says (if it's related):

open("/dev/sda", O_RDONLY|O_LARGEFILE)  = 3
ioctl(3, 0x31f, 0xbfbf902c)             = -1 EINVAL (Invalid argument)
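
The raw request number in the strace output can be matched against the HDIO_* requests. The following is a minimal sketch, assuming the values from <linux/hdreg.h> (copied into the block so it builds without kernel headers); the helper name hdio_request_name and the SKETCH_ prefixes are hypothetical and not part of any plugin or kernel API.

```c
#include <stddef.h>

/* Request numbers as I understand them from <linux/hdreg.h>;
 * duplicated here (assumption) so the sketch needs no kernel headers. */
#define SKETCH_HDIO_DRIVE_TASKFILE 0x031d
#define SKETCH_HDIO_DRIVE_TASK     0x031e
#define SKETCH_HDIO_DRIVE_CMD      0x031f

/* Hypothetical helper: map a raw ioctl request number, as printed by
 * strace, to the name of the matching HDIO_* request.
 * Returns NULL for numbers this sketch does not know. */
const char *hdio_request_name(unsigned long request)
{
    switch (request) {
    case SKETCH_HDIO_DRIVE_TASKFILE:
        return "HDIO_DRIVE_TASKFILE";
    case SKETCH_HDIO_DRIVE_TASK:
        return "HDIO_DRIVE_TASK";
    case SKETCH_HDIO_DRIVE_CMD:
        return "HDIO_DRIVE_CMD";
    default:
        return NULL;
    }
}
```

Under these assumptions, hdio_request_name(0x31f) yields "HDIO_DRIVE_CMD", so the failing call in the trace would be the HDIO_DRIVE_CMD ioctl.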

https://bugzilla.novell.com/show_bug.cgi?id=640368
https://bugzilla.novell.com/show_bug.cgi?id=640368#c4

Lars Vogdt <lrupp@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P5 - None                   |P3 - Medium
             Status|NEW                         |ASSIGNED
                URL|                            |http://sourceforge.net/trac
                   |                            |ker/?func=detail&atid=39759
                   |                            |7&aid=3343431&group_id=2988
                   |                            |0
           Found By|---                         |Development

--- Comment #4 from Lars Vogdt <lrupp@suse.com> 2011-08-01 22:05:47 CEST ---
Just for reference: upstream URL:
http://sourceforge.net/tracker/?func=detail&atid=397597&aid=3343431&group_id=29880

https://bugzilla.novell.com/show_bug.cgi?id=640368
https://bugzilla.novell.com/show_bug.cgi?id=640368#c5

--- Comment #5 from Michal Svec <msvec@suse.com> 2012-01-26 13:42:56 CET ---
It's still the same in openSUSE 12.1 and nagios-plugins-1.4.15-8.1.2.i586

https://bugzilla.novell.com/show_bug.cgi?id=640368
https://bugzilla.novell.com/show_bug.cgi?id=640368#c6

--- Comment #6 from Lars Vogdt <lrupp@suse.com> 2012-02-24 16:34:46 CET ---
From check_ide_smart.c (nagios-plugins-1.4.15):

int
smart_cmd_simple (int fd, enum SmartCommand command, __u8 val0, char show_error)
{
    int e = 0;
    __u8 args[4];
    args[0] = WIN_SMART;
    args[1] = val0;
    args[2] = smart_command[command].value;
    args[3] = 0;
    if (ioctl (fd, HDIO_DRIVE_CMD, &args)) {
        e = errno;
        if (show_error) {
            printf (_("CRITICAL - %s: %s\n"), smart_command[command].text, strerror (errno));
        }
    }
    return e;
}
[...]
    if (smart_cmd_simple (fd, SMART_CMD_ENABLE, 0, TRUE)) {
        printf (_("CRITICAL - SMART_CMD_ENABLE\n"));
        return STATE_CRITICAL;
    }
[...]

---------------------------------
from os_linux.cpp (smartmontools-5.42):

[...]
    buff[2]=ATA_SMART_ENABLE;
    buff[1]=1;
    break;
[...]
    if ((retval=ioctl(get_fd(), HDIO_DRIVE_TASK, buff))) {
        if (retval==-EINVAL) {
            pout("Error SMART Status command via HDIO_DRIVE_TASK failed");
            pout("Rebuild older linux 2.2 kernels with HDIO_DRIVE_TASK support added\n");
        }
        else
            syserror("Error SMART Status command failed");
        return -1;
    }
---------------------------------

Looks like the problem is the '0' (check_ide_smart) vs. '1' (os_linux.cpp) for
the second value of the drive command.

I'm preparing a test package...
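
To make the suspected difference concrete, here is a minimal sketch that fills the four-byte argument block the way smart_cmd_simple() does for SMART_CMD_ENABLE and compares the two variants. The opcode values 0xB0 (WIN_SMART) and 0xD8 (SMART_ENABLE) are assumed from <linux/hdreg.h>, and it is likewise an assumption that smart_command[SMART_CMD_ENABLE].value equals SMART_ENABLE; the helper names are hypothetical. No ioctl is issued, so the block is safe to run anywhere.

```c
#include <stdint.h>

/* Values assumed from <linux/hdreg.h>, copied so the sketch builds
 * without kernel headers. */
#define SKETCH_WIN_SMART    0xB0  /* ATA SMART command opcode */
#define SKETCH_SMART_ENABLE 0xD8  /* SMART "enable operations" feature */

/* Hypothetical helper mirroring how smart_cmd_simple() fills args[]
 * for SMART_CMD_ENABLE; val0 is the byte the two code bases set
 * differently (0 in check_ide_smart, 1 in os_linux.cpp). */
void fill_smart_enable_args(uint8_t args[4], uint8_t val0)
{
    args[0] = SKETCH_WIN_SMART;
    args[1] = val0;
    args[2] = SKETCH_SMART_ENABLE;
    args[3] = 0;
}

/* Returns the index of the first byte that differs between the two
 * argument blocks, or -1 if they are identical. */
int first_difference(const uint8_t a[4], const uint8_t b[4])
{
    for (int i = 0; i < 4; i++) {
        if (a[i] != b[i])
            return i;
    }
    return -1;
}
```

Filling one block with val0 = 0 and another with val0 = 1 and passing both to first_difference() returns index 1, which is the only byte the proposed change would touch.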

https://bugzilla.novell.com/show_bug.cgi?id=640368
https://bugzilla.novell.com/show_bug.cgi?id=640368#c7

Lars Vogdt <lrupp@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
       InfoProvider|                            |msvec@suse.com

--- Comment #7 from Lars Vogdt <lrupp@suse.com> 2012-02-27 15:49:07 CET ---
I have now added the following patch:

Index: plugins/check_ide_smart.c
===================================================================
--- plugins/check_ide_smart.c.orig
+++ plugins/check_ide_smart.c
@@ -230,7 +230,7 @@ main (int argc, char *argv[])
         return STATE_CRITICAL;
     }
 
-    if (smart_cmd_simple (fd, SMART_CMD_ENABLE, 0, TRUE)) {
+    if (smart_cmd_simple (fd, SMART_CMD_ENABLE, 1, TRUE)) {
         printf (_("CRITICAL - SMART_CMD_ENABLE\n"));
         return STATE_CRITICAL;
     }

and tested it on my own machine with no difference: both the old and the new
check_ide_smart plugin return correct values for my hard disk.

Can you please test it on your machine?
http://download.opensuse.org/repositories/home:/lrupp:/branches:/server:/mon...
contains the package with the fix.

Last changelog entry for nagios-plugins from that repo:
-------------------------------------------------------------------
Mon Feb 27 14:43:01 UTC 2012 - lars@linux-schulserver.de

- added nagios-plugins-1.4.15-check_ide_smart.patch :
  trying to fix bnc#640368
- run %%set_permissions for newer distributions
-------------------------------------------------------------------

https://bugzilla.novell.com/show_bug.cgi?id=640368
https://bugzilla.novell.com/show_bug.cgi?id=640368#c8

Michal Svec <msvec@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
       InfoProvider|msvec@suse.com              |

--- Comment #8 from Michal Svec <msvec@suse.com> 2012-02-27 19:58:32 CET ---
bash# /usr/lib/nagios/plugins/check_ide_smart -d /dev/sda
CRITICAL - SMART_ENABLE: Invalid argument
CRITICAL - SMART_CMD_ENABLE

bash# /tmp/check_ide_smart -d /dev/sda
CRITICAL - SMART_ENABLE: Invalid argument
CRITICAL - SMART_CMD_ENABLE

The latter one is from nagios-plugins-1.4.15-37.1.i586.rpm

https://bugzilla.novell.com/show_bug.cgi?id=640368
https://bugzilla.novell.com/show_bug.cgi?id=640368#c9

--- Comment #9 from Lars Vogdt <lrupp@suse.com> 2012-02-28 11:21:49 CET ---
Sad :-(

Instead of trying to fix the C code, I would like to rewrite the script and
use a simpler approach based on another, already available, similar check that
works. I need to adapt the command-line options for that check - I'll ping you
once this is done.

https://bugzilla.novell.com/show_bug.cgi?id=640368
https://bugzilla.novell.com/show_bug.cgi?id=640368#c10

--- Comment #10 from Michal Svec <msvec@suse.com> 2013-08-13 15:02:35 CEST ---
The server which exhibited this behavior died meanwhile so unless there's some
more push we could close this.

https://bugzilla.novell.com/show_bug.cgi?id=640368
https://bugzilla.novell.com/show_bug.cgi?id=640368#c11

Lars Vogdt <lrupp@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |CLOSED
         Resolution|                            |WONTFIX

--- Comment #11 from Lars Vogdt <lrupp@suse.com> 2013-08-30 15:48:56 CEST ---
(In reply to comment #10)
> The server which exhibited this behavior died meanwhile so unless there's some
> more push we could close this.

As I cannot find another machine for testing, I will close the bug for now.

I hope the "new" upstream maintainers find someone who would like to work on
the plugin - or decide to drop it and go for something like the perl script
that is already contained in our "nagios-plugins-smart" package.
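
For illustration only, a check built on smartctl output rather than on the HDIO ioctl could map the health line from the original report to a Nagios state. This is a minimal sketch, assuming the "SMART Health Status:" line format shown in the report and the usual Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); the function name is hypothetical, and it is not the logic of the perl script mentioned above.

```c
#include <string.h>

/* Conventional Nagios plugin return codes. */
enum nagios_state {
    STATE_OK = 0,
    STATE_WARNING = 1,
    STATE_CRITICAL = 2,
    STATE_UNKNOWN = 3
};

/* Hypothetical helper: derive a Nagios state from captured
 * `smartctl --all` text by looking at the "SMART Health Status:" line.
 * Missing line -> UNKNOWN, value starting with "OK" -> OK,
 * any other value -> CRITICAL. */
enum nagios_state state_from_smartctl(const char *output)
{
    static const char key[] = "SMART Health Status:";
    const char *p = strstr(output, key);

    if (p == NULL)
        return STATE_UNKNOWN;

    /* Skip the key and any blanks that follow it. */
    p += sizeof key - 1;
    while (*p == ' ' || *p == '\t')
        p++;

    if (strncmp(p, "OK", 2) == 0)
        return STATE_OK;
    return STATE_CRITICAL;
}
```

Fed the output from the original report, this sketch would return STATE_OK; output lacking the health line is reported as UNKNOWN rather than guessed at.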