[Bug 816388] New: hdd goes to sleep too often - maybe damages disk
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c0

Summary: hdd goes to sleep too often - maybe damages disk
Classification: openSUSE
Product: openSUSE 12.3
Version: Final
Platform: x86-64
OS/Version: openSUSE 12.3
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Basesystem
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: colAflash@gmx.net
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:20.0) Gecko/20100101 Firefox/20.0

Hi,

First of all, I think this is a critical bug because it may damage the hard disk. I apologize for the trouble if I'm wrong about that.

My system:
OS: openSUSE 12.3 x86_64
Model: Thinkpad x220 (Type 4291-36G)
CPU: Intel i7-2620M
HardDisk: HITACHI HTS723232A7A364 (Size 320072933376 Byte ~ 298 GB)

When my notebook is idle I hear a strange clicking, about one click every 5-20 seconds. After a while I started investigating and found that with every click this value increases by 1:

sudo /usr/sbin/smartctl -A /dev/sda | grep Load_Cycle_Count

Right now it's at: 114825

It's the same click sound I hear when I put my notebook into s2ram or power it down. To me this looks like my hard disk is stopping very often to save power. As far as I know this is too often, because so many power cycles may damage the disk over time.

So I checked this value:
==================
# sudo /sbin/hdparm -B /dev/sda

/dev/sda:
 APM_level = 128
==================

After changing it with:

sudo /sbin/hdparm -B 192 /dev/sda

and putting that command into "/etc/init.d/boot.local" (to set it at boot) and "/etc/pm/sleep.d/99hdparm" (newly created, to set it after waking from standby), the clicking stopped, or maybe I just stopped hearing it because it no longer happened so often. The "Load_Cycle_Count" also grew much more slowly (maybe +10 a day, which is about the number of times I put my notebook into standby per day).

My full "/etc/pm/sleep.d/99hdparm" looks like this:
==================
#!/bin/bash
case "$1" in
    hibernate|suspend)
        # nothing
        ;;
    thaw|resume)
        /sbin/hdparm -B 192 /dev/sda
        ;;
    *)
        ;;
esac
exit 0
==================

I also saw this behavior on:
- Thinkpad x220t (Tablet)
- R-Series Thinkpad (about 8 years old)
- L-Series Thinkpad

Running the notebooks on AC or on battery doesn't change the behavior.

On my desktop PC I have sdb and sdc in a software mirror RAID configured by the motherboard. When calling "smartctl -A" on sdb or sdc I get "SMART Disabled". But after enabling SMART using

/usr/sbin/smartctl -s on /dev/sdX

I can see the "Load_Cycle_Count" reported by smartctl rising quite fast too (about +3 per minute). But I'm not hearing any clicking from the disks.

Mainboard: ASUS M4A785TD-V EVO

# dmraid -r
/dev/sdc: pdc, "pdc_bebjhhcbgg", mirror, ok, 3906249984 sectors, data@ 0
/dev/sdb: pdc, "pdc_bebjhhcbgg", mirror, ok, 3906249984 sectors, data@ 0

I can't set the APM_level:
==================
# /sbin/hdparm -B 192 /dev/sdc

/dev/sdc:
 setting Advanced Power Management level to 0xc0 (192)
 HDIO_DRIVE_CMD failed: Input/output error
 APM_level = not supported
==================

Are my thoughts correct? Is one hdd power cycle every 5 to 20 seconds too much? If yes, this should be fixed VERY SOON, because it damages people's hard disks every day.

Thanks
colAflash

Reproducible: Always

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
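To put a number on how fast the counter grows, the measurement described in the report can be scripted. This is only a rough sketch: it assumes the usual `smartctl -A` attribute table, where the raw value is the last column of the Load_Cycle_Count row, and the helper names `load_cycle_count` and `cycles_per_hour` are made up for this illustration.

```shell
#!/bin/sh
# Read `smartctl -A` output on stdin and print the raw Load_Cycle_Count.
# Assumption: the raw value is the last whitespace-separated column.
# Exits non-zero if the attribute is not present in the input.
load_cycle_count() {
    awk '$2 == "Load_Cycle_Count" { print $NF; found = 1 } END { exit !found }'
}

# Hypothetical helper: integer load cycles per hour, computed from two
# readings ($1 = earlier, $2 = later) taken $3 seconds apart.
cycles_per_hour() {
    echo $(( ($2 - $1) * 3600 / $3 ))
}

# Example use on real hardware (needs root, not executed here):
#   a=$(smartctl -A /dev/sda | load_cycle_count); sleep 600
#   b=$(smartctl -A /dev/sda | load_cycle_count)
#   cycles_per_hour "$a" "$b" 600
```

For comparison, one click every 5 to 20 seconds corresponds to 180 to 720 load cycles per hour.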
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c1
--- Comment #1 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c2
--- Comment #2 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c3
--- Comment #3 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c4
--- Comment #4 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c5
--- Comment #5 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c
Xiaolong Li
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c7
--- Comment #7 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c8
Norbert Jurkeit
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c10
--- Comment #10 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c
Wojtek Dziewięcki
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c12
--- Comment #12 from Wojtek Dziewięcki
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c13
--- Comment #13 from Wojtek Dziewięcki
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c14
--- Comment #14 from Wojtek Dziewięcki
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c15
--- Comment #15 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c16
--- Comment #16 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c17
Wojtek Dziewięcki
I don't have /usr/lib/pm-utils/power.d/harddrive on my notebook or on my PC. What does it do (or what did it do), and up to which version of openSUSE did it exist?
It was removed in 11.4, and it had been used for setting different hdparm parameters when on battery/AC - more dangerous than useful.
I don't really understand why the hdparm manpage says: ...is a very poor choice for use with Linux. Isn't it a poor choice for BSD or Windows too? Does Windows change this value? Sometimes I boot Windows 7 on that pc, but the value is still at "8" (got the value via "hdparm -J").
I think that Windows filesystems don't access the disk as often as Linux ones (for example, I guess the ext3/4 journal writes wake the disk every few seconds), so they don't wake it so soon after it has been put to sleep. I don't know what to think apart from that. I'll add Robert to CC, I heard he has some experience with this. Robert, what do you think we should do about this?
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c18
Robert Milasan
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c19
--- Comment #19 from Robert Milasan
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c20
Wojtek Dziewięcki
Who sets this value? Is this from pm-utils or what? Is this only openSUSE, or any distro out there? If not, how did the other distros fix this?
This has nothing to do with pm-utils. The manufacturer sets this value when making the drive.

I have found nothing about this being fixed in other distros apart from this Ubuntu bug:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/607560
It says that the journal had been written too often and that this has been fixed in kernel 2.6.something.

Guide to increasing the head parking timeout for WD drives:
http://www.storagereview.com/how_to_stop_excessive_load_cycles_on_the_wester...
(Requires a DOS tool provided by WD, wdidle. hdparm does the same, but its manpage says it's better to use the DOS tool.)

Btw maybe someone from SUSE could talk to WD and the hdparm upstream guy and arrange that WD helps with making this tweak possible in Linux too?

But this happened on some laptops too. Manufacturers tend to produce HDDs with aggressive head parking times, and the WD Caviar Green is not the only one.

Ideas for possible solutions:
1. Write an ugly hack script (probably a systemd unit?) that reads the HDD head parking time and sets it to something reasonable if it's too short. It has to be run during boot and probably on resume. Seems like something we should avoid.
2. Fix all the stuff that accesses the disk unnecessarily or too often, like the journal, syslog, etc. I don't know if this is possible at all.

Please assign it to someone else. All I do is maintain hdparm, and there is nothing more I can do apart from gathering this information and throwing in some ideas, thanks.
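Idea 1 could look roughly like the sketch below. Note that it checks the APM level (the `hdparm -B` value from the original report) rather than the WD idle3 timer, and that the function names and the minimum of 192 are assumptions chosen for illustration. The parsing relies on the `APM_level = ...` output format quoted earlier in this bug.

```shell
#!/bin/sh
# Read `hdparm -B /dev/sdX` output on stdin and print the numeric APM level.
# Prints nothing when the drive reports a non-numeric value such as
# "not supported".
apm_level() {
    sed -n 's/^[[:space:]]*APM_level[[:space:]]*=[[:space:]]*\([0-9][0-9]*\)[[:space:]]*$/\1/p'
}

# Succeeds when the level ($1) is numeric and below the wanted minimum ($2).
# An empty level (APM unsupported) is never treated as too aggressive.
apm_too_aggressive() {
    [ -n "$1" ] && [ "$1" -lt "$2" ]
}

# Hypothetical boot/resume hook (needs root, not executed here):
#   level=$(hdparm -B /dev/sda | apm_level)
#   if apm_too_aggressive "$level" 192; then
#       hdparm -B 192 /dev/sda
#   fi
```

Leaving drives alone when no numeric level is reported avoids the "HDIO_DRIVE_CMD failed" error seen on /dev/sdc in the original report.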
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c21
--- Comment #21 from Wojtek Dziewięcki
Btw maybe someone from SUSE could talk to WD and hdparm upstream guy and arrange that WD helps with making this tweak possible in linux too?
Oh, I forgot that Moritz mentioned idle3tools before (http://idle3-tools.sourceforge.net/), so I take this one back :)
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c22
--- Comment #22 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c23
Wojtek Dziewięcki
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c24
--- Comment #24 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c25
--- Comment #25 from Robert Milasan
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c26
--- Comment #26 from Robert Milasan
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c27
--- Comment #27 from Robert Milasan
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c28
--- Comment #28 from Robert Milasan
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c29
Robert Milasan
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c30
--- Comment #30 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c31
Robert Milasan
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c32
--- Comment #32 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c33
Robert Milasan
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c34
--- Comment #34 from Robert Milasan
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c35
--- Comment #35 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c36
--- Comment #36 from Robert Milasan
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c37
--- Comment #37 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c38
Robert Milasan
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c39
Wojtek Dziewięcki
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c40
Jean Delvare
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c41
--- Comment #41 from Jean Delvare
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c42
--- Comment #42 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c43
--- Comment #43 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c44
--- Comment #44 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c45
--- Comment #45 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c46
--- Comment #46 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c47
--- Comment #47 from Robert Milasan
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c48
--- Comment #48 from Jean Delvare
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c49
--- Comment #49 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c51
--- Comment #51 from Jean Delvare
It looks like hdparm isn't called from anywhere else. The APM_level of 128 after s2ram and after a reboot must be either:
- set by another tool
- set by the kernel
- the HDD's default
It is either the HDD's default (each drive can have its own) or set by the BIOS. The kernel isn't involved.
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c52
--- Comment #52 from Jean Delvare
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c53
--- Comment #53 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c54
--- Comment #54 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c55
--- Comment #55 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c56
--- Comment #56 from Moritz Duge
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c57
--- Comment #57 from Jean Delvare
https://bugzilla.novell.com/show_bug.cgi?id=816388
https://bugzilla.novell.com/show_bug.cgi?id=816388#c58
--- Comment #58 from Jean Delvare
I was using laptop-mode-tools for some time after I bought my notebook. After upgrading to openSUSE 12.3 I noticed that laptop-mode-tools no longer gave me any power savings (on openSUSE 12.2 it still saved a lot). So I uninstalled laptop-mode-tools, and after that the hdd clicking issue began. So on notebooks with laptop-mode-tools this bug may not appear, as long as laptop-mode-tools is running.
Indeed, if you look at /etc/laptop-mode/laptop-mode.conf, you'll see:

CONTROL_HD_POWERMGMT=1
BATT_HD_POWERMGMT=128
LM_AC_HD_POWERMGMT=254
NOLM_AC_HD_POWERMGMT=254

I _do_ have laptop-mode-tools installed on my laptop and it's on AC most of the time, which explains why I had -B value 254. It goes down to 128 when I switch to battery, and back to 254 when I plug the AC adapter back in.

This also explains why /proc/acpi/battery/BAT0/state did not reveal a significant difference: the -B value was 128 all along. Doing the proper comparison now, the difference is about 420 mW.

Thanks for the tip about powertop only showing power consumption when on battery - same here, of course.
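The switching behavior described above can be illustrated with a small sketch. This is not how laptop-mode-tools is implemented. It only mirrors the observed mapping from power source to -B value, using the two settings quoted from laptop-mode.conf. The function name `hd_powermgmt_for` and the sysfs path in the usage comment are assumptions.

```shell
#!/bin/sh
# Values quoted from /etc/laptop-mode/laptop-mode.conf in the comment above.
BATT_HD_POWERMGMT=128
LM_AC_HD_POWERMGMT=254

# Map an AC "online" flag (1 = plugged in, anything else = on battery)
# to the hdparm -B value that was observed in each state.
hd_powermgmt_for() {
    if [ "$1" = "1" ]; then
        echo "$LM_AC_HD_POWERMGMT"
    else
        echo "$BATT_HD_POWERMGMT"
    fi
}

# Hypothetical use (needs root, not executed here). The power supply name
# under /sys/class/power_supply differs between machines:
#   online=$(cat /sys/class/power_supply/AC/online)
#   hdparm -B "$(hd_powermgmt_for "$online")" /dev/sda
```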
http://bugzilla.novell.com/show_bug.cgi?id=816388
http://bugzilla.novell.com/show_bug.cgi?id=816388#c63
Robert Milasan