On 10/21/18 1:44 PM, Carlos E. R. wrote:
Hi,
I have one disk that is giving me problems with the smartd daemon. I get this in the log:
<3.6> 2018-10-21T13:45:23.829155+02:00 Isengard smartd 1173 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], state written to /var/lib/smartmontools/smartd.WDC_WD80EZAZ_11TDBA0-2TKST2SD.ata.state
<3.6> 2018-10-21T13:45:24.483719+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], opened
<3.6> 2018-10-21T13:45:24.484570+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], WDC WD80EZAZ-11TDBA0, S/N:2TKST2SD, WWN:5-000cca-26af51579, FW:83.H0A83, 8.00 TB
<3.6> 2018-10-21T13:45:24.503334+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], not found in smartd database.
<3.6> 2018-10-21T13:45:24.525071+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], enabled SMART Attribute Autosave.
<3.6> 2018-10-21T13:45:24.530486+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], enabled SMART Automatic Offline Testing.
<3.6> 2018-10-21T13:45:24.535003+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], is SMART capable. Adding to "monitor" list.
<3.6> 2018-10-21T13:45:24.535627+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], state read from /var/lib/smartmontools/smartd.WDC_WD80EZAZ_11TDBA0-2TKST2SD.ata.state
<3.6> 2018-10-21T13:45:24.880219+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], state written to /var/lib/smartmontools/smartd.WDC_WD80EZAZ_11TDBA0-2TKST2SD.ata.state
<3.6> 2018-10-21T14:15:25.233525+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 147 to 144
<3.6> 2018-10-21T15:45:31.681938+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], not capable of SMART self-check
<3.2> 2018-10-21T15:45:33.632399+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], failed to read SMART Attribute Data
<3.6> 2018-10-21T16:15:24.678100+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], read SMART Attribute Data worked again, warning condition reset after 1 email
<3.6> 2018-10-21T18:15:31.767150+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], not capable of SMART self-check
<3.2> 2018-10-21T18:15:33.717688+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], failed to read SMART Attribute Data
<3.6> 2018-10-21T18:45:24.587304+02:00 Isengard smartd 11255 - - Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], read SMART Attribute Data worked again, warning condition reset after 1 email
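To see the failure pattern without the surrounding noise, the failure and recovery events can be pulled out of the log. This is only a minimal sketch: it assumes log lines in the format shown above arrive on stdin, and the function name smart_read_events is made up for illustration.

```shell
# Keep only the read-failure and recovery events from smartd log lines on
# stdin, printing the timestamp followed by the message text.
# Assumes the line layout shown in the log above (timestamp in field 2,
# message after the "[SAT], " marker).
smart_read_events() {
    awk '/failed to read SMART Attribute Data|read SMART Attribute Data worked again/ {
        ts = $2                       # second field is the ISO timestamp
        msg = $0
        sub(/^.*\[SAT\], /, "", msg)  # drop everything up to the device tag
        print ts, msg
    }'
}
```

Fed with the excerpt above, this leaves four lines: the failures at 15:45 and 18:15, each followed about half an hour later by a "worked again" entry at 16:15 and 18:45.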
It intermittently but periodically fails to read attributes, triggering hundreds of emails sent to me to warn of the problem:
+++------------
Subject: SMART error (FailedReadSmartData) detected on host: Isengard
This message was generated by the smartd daemon running on:
host name: Isengard
DNS domain: valinor
The following warning/error was logged by the smartd daemon:
Device: /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 [SAT], failed to read SMART Attribute Data
Device info: WDC WD80EZAZ-11TDBA0, S/N:2TKST2SD, WWN:5-000cca-26af51579, FW:83.H0A83, 8.00 TB
For details see host's SYSLOG.
You can also use the smartctl utility for further investigation.
Another message will be sent in 24 hours if the problem persists.
------------++-
The disk is indeed SMART-capable and it works fine, as long as I call smartctl with "-d sat,16", which I do:
Isengard:~ # smartctl --test=short -d sat,16 /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0\:0
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.4.155-68-default] (SUSE RPM)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Sun Oct 21 13:50:18 2018
Use smartctl -X to abort test.
Isengard:~ #
Isengard:~ # smartctl --health -d sat,16 /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0\:0
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.4.155-68-default] (SUSE RPM)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Isengard:~ #
It is crucial to use "-d sat,16" or it fails:
Isengard:~ # smartctl --health /dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0\:0
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.4.155-68-default] (SUSE RPM)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
/dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0: Unknown USB bridge [0x1058:0x25ee (0x4004)]
Please specify device type with the -d option.
Use smartctl -h to get a usage summary
Isengard:~ #
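The "Unknown USB bridge" message carries the bridge's USB vendor and product ID, which can be compared against what lsusb reports for the enclosure. A small sketch for extracting it, assuming the error text has the form shown above; the function name usb_bridge_id is hypothetical.

```shell
# Extract "vendor:product" from an "Unknown USB bridge [0xVVVV:0xPPPP ...]"
# line read on stdin. Prints nothing if no such line is present.
usb_bridge_id() {
    sed -n 's/.*Unknown USB bridge \[0x\([0-9a-fA-F]*\):0x\([0-9a-fA-F]*\).*/\1:\2/p'
}
```

It could be used by piping the smartctl output, including stderr (2>&1), into usb_bridge_id; for the message above it prints 1058:25ee.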
Of course I use that option in the config:
Isengard:~ # cat /etc/smartd.conf | egrep -v "^[[:space:]]*$|^#"
/dev/sda -a -o on -S on -s (S/../.././02|L/../../6/03) -m root@telcontar.valinor
/dev/disk/by-id/wwn-0x5000000000000001 -a -o on -S on -s (S/../.././02|L/../../6/03) -m root@telcontar.valinor
/dev/disk/by-id/wwn-0x5000c5009399305f -a -o on -S on -s (S/../.././02|L/../../6/03) -m root@telcontar.valinor
/dev/disk/by-id/usb-WD_My_Book_25EE_32544B5354325344-0:0 -d sat,16 -a -o on -S on -s (S/../.././02|L/../../6/03) -m root@telcontar.valinor
Isengard:~ #
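One quick sanity check on a config like this is to list any USB entries that lack an explicit "-d" device type. The following is a rough sketch under stated assumptions: it reads smartd.conf-style text on stdin, treats any by-id path containing "/usb-" as a USB device, and ignores backslash line continuations. The function name usb_entries_without_type is invented for the example.

```shell
# Print the device path of every entry that looks like a USB by-id device
# but has no "-d" directive among its options.
usb_entries_without_type() {
    awk '
        /^[ \t]*(#|$)/ { next }       # skip comment and blank lines
        $1 ~ /\/usb-/ {
            has_type = 0
            for (i = 2; i <= NF; i++)
                if ($i == "-d") has_type = 1
            if (!has_type) print $1
        }'
}
```

Run as usb_entries_without_type < /etc/smartd.conf, it prints nothing for the configuration above, since the My Book entry does carry "-d sat,16".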
What else am I missing? Is smartd not using "-d sat,16" somewhere else? Is it some other problem?
Isengard:~ # rpm -q smartmontools
smartmontools-6.6-135.1.x86_64
Isengard:~ #
--
Cheers
Carlos E. R. (from 42.3 x86_64 "Malachite" at Telcontar)
smartmontools is "sensitive" to some disks, not so much to others. Something is running it periodically in default mode, without the parameters you used to get it to read the disk properly. I'd say cron, but with systemd in the mix, who knows.
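If something other than smartd is calling smartctl, one way to look for it is to search the usual job locations for the command name. A minimal sketch: the function name find_smartctl_callers is hypothetical, and the directories to search are whatever the system actually uses for cron jobs and systemd units.

```shell
# Print the files under the given directories that mention smartctl.
# Unreadable paths are silently skipped; no output means no match.
find_smartctl_callers() {
    grep -rl -- 'smartctl' "$@" 2>/dev/null
}
```

For example, find_smartctl_callers /etc/cron.d /etc/cron.hourly /etc/cron.daily /etc/systemd/system would cover common locations (assumed paths, adjust as needed), and systemctl list-timers --all shows which systemd timers are scheduled.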