[opensuse] A single 80mm exhaust fan can make an 18 deg. difference in drive temps
Listmates,

For the avid tinkerer and opensuse devotee: note that the loss of a single exhaust fan can cause internal case temps to skyrocket, and the addition of a single exhaust fan where none existed before can drop internal temps by close to 20 deg. C. An additional note is separating/staggering/spacing hard drives in your hard drive carrier within the case: leaving an empty drive bay open between each hard drive can have the same cooling effect. Both solutions are near zero cost and help fight the worst enemy to the life of your computer/server -- heat.

I routinely check drive temps using the package hddtemp called from a simple script that will check all drives and give a single line readout of the temperature of each drive:

#!/bin/bash
# Print the temperature of each whole disk sda..sdh listed in /proc/partitions.
for i in $(grep -E 'sd[a-h]$' /proc/partitions | sed -e 's/^.*s/s/'); do
    hddtemp "/dev/$i"
done

I just put the script in /usr/local/bin as 'hdtemp' and call it from cron with an email sent to myself.

Yesterday, I noticed the drive temps on a file server had jumped 18 degrees C to:

/dev/sdb: ST3250410AS: 55°C
/dev/sdc: ST3250410AS: 56°C

Drive manufacturers routinely recommend not exceeding 60 deg. C, but a number of independent tests have shown that keeping drive temps below 43 deg. C is more or less the gold standard, with drive lifetime dropping rather sharply above 45 degrees.

In my case, the culprit was the rear exhaust fan on the case. Simply replacing the 80mm fan brought the airflow in the case back up to normal and the drive temps back down to a reasonable level:

/dev/sdb: ST3250410AS: 37°C
/dev/sdc: ST3250410AS: 40°C

You can find hddtemp here:

10.3 - repositories/home:/houghi/openSUSE_10.3
11.0 & 11.1 - packman

Give it a try and see how your cooling measures up. 10 degrees can make a big difference in drive life.

--
David C. Rankin, J.D.,P.E.
Rankin Law Firm, PLLC
510 Ochiltree Street
Nacogdoches, Texas 75961
Telephone: (936) 715-9333
Facsimile: (936) 715-9339
www.rankinlawfirm.com
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
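For anyone who would rather get mail only when a drive runs hot, the cron check described in the post could be narrowed down roughly as follows. This is a minimal sketch, not the poster's script: the function name hot_drives is made up for illustration, and it assumes hddtemp prints lines in the "/dev/sdb: ST3250410AS: 55°C" form shown in the post. It sticks to plain POSIX shell constructs, so it should also run under bash.

```shell
# hot_drives LIMIT
# Reads hddtemp-style lines on stdin and prints only those whose
# temperature is above LIMIT (whole degrees C).
# hot_drives is a hypothetical helper name, not part of hddtemp.
hot_drives() (
    limit=$1
    while IFS= read -r line; do
        # Keep the text after the last ": ", e.g. "55°C".
        temp=${line##*: }
        # Cut everything from the first non-digit on, leaving "55".
        temp=${temp%%[!0-9]*}
        # Lines without a numeric reading leave temp empty and are skipped.
        if [ -n "$temp" ] && [ "$temp" -gt "$limit" ]; then
            printf '%s\n' "$line"
        fi
    done
)
```

Piping the output of the hddtemp loop into "hot_drives 45" would then print nothing while all drives are at or below 45 deg. C. Since cron normally sends mail only when a job produces output, mail would arrive only when a drive is above the limit. The 45 here is just the figure mentioned in the post, not a recommendation from hddtemp.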
On Wednesday 17 June 2009 14:44:21 David C. Rankin wrote:
Listmates,
For the avid tinkerer and opensuse devotee: note that the loss of a single exhaust fan can cause internal case temps to skyrocket, and the addition of a single exhaust fan where none existed before can drop internal temps by close to 20 deg. C. An additional note is separating/staggering/spacing hard drives in your hard drive carrier within the case: leaving an empty drive bay open between each hard drive can have the same cooling effect. Both solutions are near zero cost and help fight the worst enemy to the life of your computer/server -- heat.
I routinely check drive temps using the package hddtemp called from a simple script that will check all drives and give a single line readout of the temperature of each drive:
#!/bin/bash
# Print the temperature of each whole disk sda..sdh listed in /proc/partitions.
for i in $(grep -E 'sd[a-h]$' /proc/partitions | sed -e 's/^.*s/s/'); do
    hddtemp "/dev/$i"
done
I just put the script in /usr/local/bin as 'hdtemp' and call it from cron with an email sent to myself.
Nice one, David. I was reassured to find my drive temperatures ranged from 33 to 39 deg. C - only a couple of degrees warmer than me.

Bob
--
Registered Linux User #463880 FSFE Member #1300
GPG-FP: A6C1 457C 6DBA B13E 5524 F703 D12A FB79 926B 994E
openSUSE 11.1, Kernel 2.6.27.21-0.1-default, KDE 4.2.4
Intel Core2 Quad Q9400 2.66GHz, 4GB DDR RAM, nVidia GeForce 9200GS
On Wednesday 17 June 2009, David C. Rankin wrote:
Listmates,
For the avid tinkerer and opensuse devotee: note that the loss of a single exhaust fan can cause internal case temps to skyrocket, and the addition of a single exhaust fan where none existed before can drop internal temps by close to 20 deg. C. An additional note is separating/staggering/spacing hard drives in your hard drive carrier within the case: leaving an empty drive bay open between each hard drive can have the same cooling effect. Both solutions are near zero cost and help fight the worst enemy to the life of your computer/server -- heat.
I routinely check drive temps using the package hddtemp called from a simple script that will check all drives and give a single line readout of the temperature of each drive:
#!/bin/bash
# Print the temperature of each whole disk sda..sdh listed in /proc/partitions.
for i in $(grep -E 'sd[a-h]$' /proc/partitions | sed -e 's/^.*s/s/'); do
    hddtemp "/dev/$i"
done
I just put the script in /usr/local/bin as 'hdtemp' and call it from cron with an email sent to myself.
Yesterday, I noticed the drive temps on a file server had jumped 18 degrees C to:
/dev/sdb: ST3250410AS: 55°C
/dev/sdc: ST3250410AS: 56°C
Drive manufacturers routinely recommend not exceeding 60 deg. C, but a number of independent tests have shown that keeping drive temps below 43 deg. C is more or less the gold standard, with drive lifetime dropping rather sharply above 45 degrees.
In my case, the culprit was the rear exhaust fan on the case. Simply replacing the 80mm fan brought the airflow in the case back up to normal and the drive temps back down to a reasonable level:
/dev/sdb: ST3250410AS: 37°C
/dev/sdc: ST3250410AS: 40°C
You can find hddtemp here:
10.3 - repositories/home:/houghi/openSUSE_10.3
11.0 & 11.1 - packman
Give it a try and see how your cooling measures up. 10 degrees can make a big difference in drive life.
I've been doing this for years. Both cases have fans that pull in air from outside and blow directly over the drives. In the one case where I can actually touch the drives, they never even get warm to the touch.

Mike
--
Powered by SuSE 11.0 Kernel 2.6.25 KDE 3.5 Kmail 1.9
7:49pm up 1 day 0:18, 3 users, load average: 3.37, 3.34, 3.38
On 09/06/17 7:44 AM, David C. Rankin wrote:
Listmates,
For the avid tinkerer and opensuse devotee: note that the loss of a single exhaust fan can cause internal case temps to skyrocket, and the addition of a single exhaust fan where none existed before can drop internal temps by close to 20 deg. C. An additional note is separating/staggering/spacing hard drives in your hard drive carrier within the case: leaving an empty drive bay open between each hard drive can have the same cooling effect. Both solutions are near zero cost and help fight the worst enemy to the life of your computer/server -- heat.
<snip>
Give it a try and see how your cooling measures up. 10 degrees can make a big difference in drive life.
Actually HD temp has proven to be of little consequence to harddrive lifespan. Google conducted a test a couple years back and found next to no correlation to HD temp and drive failure. Google did this over 5 years by recording every failure they had and took many variables into account to come up with probably one of the most comprehensive real world hd reliability tests.

http://labs.google.com/papers/disk_failures.pdf

Dean Hilkewich
Actually HD temp has proven to be of little consequence to harddrive lifespan. Google conducted a test a couple years back and found next to no correlation to HD temp and drive failure. Google did this over 5 years by recording every failure they had and took many variables into account to come up with probably one of the most comprehensive real world hd reliability tests.
Statistics can be misleading. The fact that most drives didn't fail because they were too hot does not put aside the fact that many of them failed because they were too hot. So, it is a good practice to keep the disks at a relatively low temperature. Good practices should be followed regardless of a particular outcome.
On Wednesday June 17 2009, Miguel Medalha wrote:
...
Good practices should be followed regardless of a particular outcome.
It's only a good practice if it's based in empirical fact, not merely intuition.

Randall Schulz
It's only a good practice if it's based in empirical fact, not merely intuition.
It is an empirical fact that disks *do fail* because they are too hot. Mostly, the electronics fails, some integrated circuit gets burned.

Please don't isolate one of my statements from the whole of my post. The central point was:

"Statistics can be misleading. The fact that most drives didn't fail because they were too hot does not put aside the fact that many of them failed because they were too hot."

We could discuss this forever. *In practice* I am sure you understood what I meant.
It's only a good practice if it's based in empirical fact, not merely intuition.
Somewhat off-topic, it could certainly be said that intuition, too, is an empirical fact.
On Wednesday June 17 2009, Miguel Medalha wrote:
It's only a good practice if it's based in empirical fact, not merely intuition.
Somewhat off-topic, it could certainly be said that intuition, too, is an empirical fact.
Absolutely, positively not.

Randall Schulz
Somewhat off-topic, it could certainly be said that intuition, too, is an empirical fact.
Absolutely, positively not.
For many people, real intuition -- I am not talking about "guesses" here -- definitely is an empirical fact. The brain has two hemispheres, you know. Subjectivity is -- fortunately -- also a fact of life.
On Wednesday June 17 2009, Miguel Medalha wrote:
Somewhat off-topic, it could certainly be said that intuition, too, is an empirical fact.
Absolutely, positively not.
For many people, real intuition -- I am not talking about "guesses" here -- definitely is an empirical fact.
The brain has two hemispheres, you know.
Having two hemispheres is entirely beside the point. Both contribute distinct cognitive abilities and both contribute to intuition. Both are even necessary for science, but that's all irrelevant to what intuition is and the category to which its suggestions belong.
Subjectivity is -- fortunately -- also a fact of life.
The subjective is just that: Existing in the mind of a cognitive agent. It may reflect objective reality, but it need not and often does not.

Intuition is the result of unconscious mental processes. That it occurs is a fact and everyone has it, but what a person's intuition suggests is categorically not in the realm of fact. It _always_ requires empirical validation.

Randall Schulz
On Wednesday June 17 2009, Miguel Medalha wrote:
It's only a good practice if it's based in empirical fact, not merely intuition.
It is an empirical fact that disks *do fail* because they are too hot. Mostly, the electronics fails, some integrated circuit gets burned.
Sure, everything fails at some temperature. The question is what is the actual temperature at which hard drives exhibit elevated failures.
Please don't isolate one of my statements from the whole of my post. The central point was:
"Statistics can be misleading. The fact that most drives didn't fail because they were too hot does not put aside the fact that many of them failed because they were too hot."
That, too, is excessively dependent on semantics. If you believe that "most don't" is consistent with "many do," fine. It strikes me as dubious.
We could discuss this forever. *In practice* I am sure you understood what I meant.
I think I do understand you, and I think you're wrong. Hence my reply.

Just 'cause you think something is good practice 'cause of some general rule you accept doesn't mean it's true. Only empirical evidence establishes truth. If you want to throw out Google's study because it's at odds with your beliefs, that is your right, but it is not truth.

Randall Schulz
That, too, is excessively dependent on semantics. If you believe that "most don't" is consistent with "many do," fine. It strikes me as dubious.
Not dubious at all. Heat IS a factor on HD drive failure. SOME will fail due to excessive heat. I prefer that NONE of the ones in my possession does fail due to the heat factor -- which is SOMEWHAT under my control -- and hence I keep them within reasonable temperature limits. That's all.
On Wednesday 17 June 2009 02:36:53 pm Randall R Schulz wrote:
On Wednesday June 17 2009, Miguel Medalha wrote:
...
Good practices should be followed regardless of a particular outcome.
It's only a good practice if it's based in empirical fact, not merely intuition.
Randall Schulz
Shall we revisit the physical correlation between friction and heat, mechanics of materials, or any of the other physical principles involved? I think the Google study supports exactly what I said:

"keeping drive temps below 43 deg. C is more or less the gold standard with drive lifetime dropping rather sharply above 45 degrees"

http://www.3111skyline.com/download/screenshots/driveTempFailure- ProbabilityDensity.jpg

(from: http://labs.google.com/papers/disk_failures.pdf - page 6)

**Note** Google did not even show results on the graph for drive temps greater than 55 C (but notice the trend line on the right side of the graph at 55 C) -> asymptotic

--
David C. Rankin, J.D.,P.E.
Rankin Law Firm, PLLC
510 Ochiltree Street
Nacogdoches, Texas 75961
Telephone: (936) 715-9333
Facsimile: (936) 715-9339
www.rankinlawfirm.com
David C. Rankin wrote:
http://www.3111skyline.com/download/screenshots/driveTempFailure- ProbabilityDensity.jpg
You were complaining about your mail reader breaking URLs into half. One reason is when people forget to format URLs correctly in emails:

<http://www.3111skyline.com/download/screenshots/driveTempFailure-ProbabilityDensity.jpg>

The angle brackets are there for a reason. This is the reason!

Cheers, Dave

Oh, and the graph of course also shows that decreasing temperature below about 37 results in an increased failure rate. Which does not agree exactly, or even vaguely, with what you said.
In <4A48C2E6.5090907@mrc-lmb.cam.ac.uk>, Dave Howorth wrote:
David C. Rankin wrote:
http://www.3111skyline.com/download/screenshots/driveTempFailure- ProbabilityDensity.jpg
You were complaining about your mail reader breaking URLs into half. One reason is when people forget to format URLs correctly in emails:
How is this:
<http://www.3111skyline.com/download/screenshots/driveTempFailure-Probabil ityDensity.jpg>
Any more "correct" than just writing the URL out? Specifically, what IETF RFC, W3C recommendation, or IEC, ISO, or IEEE standard specifies this?

I've used "<URL: $URL >", "<$URL>", and simply "$URL" and none of them give consistently good results. The second is broken just as often as the other two, and some clients stick the trailing '>' in the URL.

-- Boyd Stephen Smith Jr. ,= ,-_-. =. bss@iguanasuicide.net ((_/)o o(\_)) ICQ: 514984 YM/AIM: DaTwinkDaddy `-'(. .)`-' http://iguanasuicide.net/ \_/
* Boyd Stephen Smith Jr. <bss@iguanasuicide.net> [06-29-09 11:54]:
I've used "<URL: $URL >", "<$URL>", and simply "$URL" and none of them give consistently good results. The second is broken just as often as the other two, and some clients stick the trailing '>' in the URL.
Yes, instead of just clicking on the url to select for paste or klipper action, you have to grab the start and finish inside the "<>". This is a *broken* _semi_-standard.

--
Patrick Shanahan Plainfield, Indiana, USA HOG # US1244711
http://wahoo.no-ip.org Photo Album: http://wahoo.no-ip.org/gallery2
Registered Linux User #207535 @ http://counter.li.org
On Monday, 2009-06-29 at 13:50 -0400, Patrick Shanahan wrote:
I've used "<URL: $URL >", "<$URL>", and simply "$URL" and none of them give consistently good results. The second is broken just as often as the other two, and some clients stick the trailing '>' in the URL.
Yes, instead of just clicking on the url to select for paste or klipper action, you have to grab the start and finish inside the "<>". This is a *broken* _semi_-standard.
Alpine, for one, knows that it should not wrap a <http://someplace> URL. And hitting [enter] or click on it more or less works.

--
Cheers,
Carlos E. R.
Boyd Stephen Smith Jr. wrote:
In <4A48C2E6.5090907@mrc-lmb.cam.ac.uk>, Dave Howorth wrote:
David C. Rankin wrote:
http://www.3111skyline.com/download/screenshots/driveTempFailure- ProbabilityDensity.jpg You were complaining about your mail reader breaking URLs into half. One reason is when people forget to format URLs correctly in emails:
How is this:
<http://www.3111skyline.com/download/screenshots/driveTempFailure-Probabil ityDensity.jpg>
Any more "correct" than just writing the URL out? Specifically, what IETF RFC, W3C recommendation, or IEC, ISO, or IEEE standard specifies this?
RFC 3986, though it was in the predecessor RFC2396 since at least 1998.

<http://www.faqs.org/rfcs/rfc3986.html>

See "Appendix C. Delimiting a URI in Context" and especially

"In such cases, it is important to be able to delimit the URI from the rest of the text, and in particular from punctuation marks that might be mistaken for part of the URI.

In practice, URIs are delimited in a variety of ways, but usually within double-quotes "http://example.com/", angle brackets <http://example.com/>, or just by using whitespace:

http://example.com/

These wrappers do not form part of the URI.

In some cases, extra whitespace (spaces, line-breaks, tabs, etc.) may have to be added to break a long URI across lines. The whitespace should be ignored when the URI is extracted.

No whitespace should be introduced after a hyphen ("-") character. Because some typesetters and printers may (erroneously) introduce a hyphen at the end of line when breaking it, the interpreter of a URI containing a line break immediately after a hyphen should ignore all whitespace around the line break and should be aware that the hyphen may or may not actually be part of the URI.

Using <> angle brackets around each URI is especially recommended as a delimiting style for a reference that contains embedded whitespace"
I've used "<URL: $URL >", "<$URL>", and simply "$URL" and none of them give consistently good results. The second is broken just as often as the other two, and some clients stick the trailing '>' in the URL.
As you can see, it is the interpreting MUA that is broken, not the format. The rules for interpretation are pretty clear, I think. I understand M$ out-of-luck is particularly bad. Carlos wrote:
Alpine, for one, knows that it should not wrap a <http://someplace> URL. And hitting [enter] or click on it more or less works.
Yup, Thunderbird too.

Cheers, Dave
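The extraction rule quoted from Appendix C (the wrappers are not part of the URI, and whitespace inside them is ignored) can be sketched in a few lines of shell. This is only an illustration of the rule, not how any particular MUA does it. The function name extract_uri is invented for the example, and stripping an optional "URL:" prefix is an extra assumption borrowed from the older RFC 1738 style.

```shell
# extract_uri TEXT
# Prints the first <...>-delimited URI found in TEXT with any embedded
# whitespace removed; returns non-zero if TEXT has no <...> pair.
# extract_uri is a hypothetical name used only for this sketch.
extract_uri() (
    text=$1
    case $text in
        *"<"*">"*)
            uri=${text#*<}     # drop everything up to the first '<'
            uri=${uri%%>*}     # drop the first '>' and everything after it
            uri=${uri#URL:}    # drop an optional RFC 1738 style prefix
            # Remove spaces, tabs and line breaks added by wrapping.
            printf '%s' "$uri" | tr -d '[:space:]'
            echo
            ;;
        *)
            false              # no delimited URI in the text
            ;;
    esac
)
```

Fed the wrapped URL from earlier in the thread, this joins "Probabil ityDensity.jpg" back into one word. A hyphen just before a line break is kept as part of the URI, which is the right call for the driveTempFailure-ProbabilityDensity.jpg link but, as the RFC text notes, cannot be known for certain in general.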
In <4A49DA54.40103@mrc-lmb.cam.ac.uk>, Dave Howorth wrote:
Boyd Stephen Smith Jr. wrote:
In <4A48C2E6.5090907@mrc-lmb.cam.ac.uk>, Dave Howorth wrote:
You were complaining about your mail reader breaking URLs into half. One reason is when people forget to format URLs correctly in emails:
How is this:
<http://www.3111skyline.com/download/screenshots/driveTempFailure-Proba bil ityDensity.jpg>
Any more "correct" than just writing the URL out? Specifically, what IETF RFC, W3C recommendation, or IEC, ISO, or IEEE standard specifies this?
RFC 3986, though it was in the predecessor RFC2396 since at least 1998.
<http://www.faqs.org/rfcs/rfc3986.html>
See "Appendix C. Delimiting a URI in Context" and especially
"In such cases, it is important to be able to delimit the URI from the rest of the text, and in particular from punctuation marks that might be mistaken for part of the URI.
In practice, URIs are delimited in a variety of ways, but usually within double-quotes "http://example.com/", angle brackets <http://example.com/>, or just by using whitespace:
I'll continue following this RFC by using whitespace. Specifically ASCII code point 0x20 a.k.a <SP>. If your MUA breaks the URL, I'll just tell you to fix your MUA. :P Thanks for the reference though. -- Boyd Stephen Smith Jr. ,= ,-_-. =. bss@iguanasuicide.net ((_/)o o(\_)) ICQ: 514984 YM/AIM: DaTwinkDaddy `-'(. .)`-' <http://iguanasuicide.net/> \_/
Boyd Stephen Smith Jr. wrote:
I'll continue following this RFC by using whitespace. Specifically ASCII code point 0x20 a.k.a <SP>. If your MUA breaks the URL, I'll just tell you to fix your MUA. :P
Since I now know that you know, I simply won't repair broken URLs in mail from you ;P

Hopefully, that won't ever stop me helping you.

Cheers, Dave
Boyd Stephen Smith Jr. wrote:
In <4A49DA54.40103@mrc-lmb.cam.ac.uk>, Dave Howorth wrote:
Boyd Stephen Smith Jr. wrote:
In <4A48C2E6.5090907@mrc-lmb.cam.ac.uk>, Dave Howorth wrote:
You were complaining about your mail reader breaking URLs into half. One reason is when people forget to format URLs correctly in emails:
How is this:
<http://www.3111skyline.com/download/screenshots/driveTempFailure-Proba bil ityDensity.jpg>
Any more "correct" than just writing the URL out? Specifically, what IETF RFC, W3C recommendation, or IEC, ISO, or IEEE standard specifies this?
RFC 3986, though it was in the predecessor RFC2396 since at least 1998.
<http://www.faqs.org/rfcs/rfc3986.html>
See "Appendix C. Delimiting a URI in Context" and especially
"In such cases, it is important to be able to delimit the URI from the rest of the text, and in particular from punctuation marks that might be mistaken for part of the URI.
In practice, URIs are delimited in a variety of ways, but usually within double-quotes "http://example.com/", angle brackets <http://example.com/>, or just by using whitespace:
I'll continue following this RFC by using whitespace. Specifically ASCII code point 0x20 a.k.a <SP>. If your MUA breaks the URL, I'll just tell you to fix your MUA. :P
Thanks for the reference though.
I would say a space is a poor choice. It's not broken for a mua (or anything else) to be unable to tell whether it should preserve some strings yet wrap others if you don't give it anything to go on. Rather, using space to mean more than one thing, sometimes a mere ordinary word separator, sometimes a special type of string delineator, is a broken spec.

The only reason it's defined at all is probably simply to account for the simplest and oldest situation where for whatever reason there should be no sort of markup at all, to a pathological and frankly counter-productive extent.

--
bkw

Wups, linefeed, dash, dash, space, linefeed. Ahhh! It's a markup! Kill it!
On Tue, 30 Jun 2009, Dave Howorth wrote:-
Boyd Stephen Smith Jr. wrote:
Any more "correct" than just writing the URL out? Specifically, what IETF RFC, W3C recommendation, or IEC, ISO, or IEEE standard specifies this?
RFC 3986, though it was in the predecessor RFC2396 since at least 1998.
It was also in RFC1738[0], from December 1994:

APPENDIX: Recommendations for URLs in Context

URIs, including URLs, are intended to be transmitted through protocols which provide a context for their interpretation.

In some cases, it will be necessary to distinguish URLs from other possible data structures in a syntactic structure. In this case, is recommended that URLs be preceeded with a prefix consisting of the characters "URL:". For example, this prefix may be used to distinguish URLs from other kinds of URIs.

In addition, there are many occasions when URLs are included in other kinds of text; examples include electronic mail, USENET news messages, or printed on paper. In such cases, it is convenient to have a separate syntactic wrapper that delimits the URL and separates it from the rest of the text, and in particular from punctuation marks that might be mistaken for part of the URL. For this purpose, is recommended that angle brackets ("<" and ">"), along with the prefix "URL:", be used to delimit the boundaries of the URL. This wrapper does not form part of the URL and should not be used in contexts in which delimiters are already specified.

Interestingly, RFC2396 removed the recommendation to add URL: to identify a URL.

[0] <URL:http://www.faqs.org/rfcs/rfc1738.html>

Regards, David Bolt
--
Team Acorn: http://www.distributed.net/ OGR-NG @ ~100Mnodes RC5-72 @ ~1Mkeys/s
openSUSE 10.3 32b | openSUSE 11.0 32b | | openSUSE 10.3 64b | openSUSE 11.0 64b | openSUSE 11.1 64b | RISC OS 3.6 | RISC OS 3.11 | openSUSE 11.1 PPC | TOS 4.02
Actually HD temp has proven to be of little consequence to harddrive lifespan. Google conducted a test a couple years back and found next to no correlation to HD temp and drive failure. Google did this over 5 years by recording every failure they had and took many variables into account to come up with probably one of the most comprehensive real world hd reliability tests.
Not "of little consequence". People tend to quickly jump to conclusions but in fact the data needs interpretation. Quoting from the Google study:

"Overall our experiments can confirm previously reported temperature effects only for the high end of our temperature range and especially for older drives. In the lower and middle temperature ranges, higher temperatures are not associated with higher failure rates. This is a fairly surprising result, which could indicate that datacenter or server designers have more freedom than previously thought when setting operating temperatures for equipment that contains disk drives. We can conclude that at moderate temperature ranges it is likely that there are other effects which affect failure rates much more strongly than temperatures do."

Please note:

"In the lower and middle temperature ranges, higher temperatures are not associated with higher failure rates."

" (...) our experiments can confirm previously reported temperature effects only for the high end of our temperature range"

The accompanying graphics also show an increase of failures above 45 degrees Celsius.

"We can conclude that at *moderate temperature ranges* [my emphasis] it is likely that there are other effects which affect failure rates much more strongly than temperatures do."

So, the point here is that we still need to prevent our HDDs from climbing to high temperatures and it is good practice to keep them at "moderate temperature ranges".
Miguel Medalha wrote:
Actually HD temp has proven to be of little consequence to harddrive lifespan. Google conducted a test a couple years back and found next to no correlation to HD temp and drive failure. Google did this over 5 years by recording every failure they had and took many variables into account to come up with probably one of the most comprehensive real world hd reliability tests.
Not "of little consequence". People tend to quickly jump to conclusions but in fact the data needs interpretation.
<snipping much angst>

Hi kids! For the physics of the problem, see http://en.wikipedia.org/wiki/Arrhenius_plot

"From the Google data, calculate the activation energy for temperature-induced hard drive failure"

;-)

--
Tony Alfrey
tonyalfrey@earthlink.net
"I'd Rather Be Sailing"
Tony Alfrey wrote:
Miguel Medalha wrote:
Actually HD temp has proven to be of little consequence to harddrive lifespan. Google conducted a test a couple years back and found next to no correlation to HD temp and drive failure. Google did this over 5 years by recording every failure they had and took many variables into account to come up with probably one of the most comprehensive real world hd reliability tests.
Not "of little consequence". People tend to quickly jump to conclusions but in fact the data needs interpretation.
<snipping much angst>
Hi kids! For the physics of the problem, see http://en.wikipedia.org/wiki/Arrhenius_plot
"From the Google data, calculate the activation energy for temperature-induced hard drive failure"
Interesting idea. Do you have a link to some evidence that disk failures are governed by an Arrhenius relation?

Grain boundaries are physical phenomena rather than chemical. I thought the current evidence pointed to them behaving like a glass.

Cheers, Dave
Dave Howorth wrote:
Tony Alfrey wrote:
Miguel Medalha wrote:
Actually HD temp has proven to be of little consequence to harddrive lifespan. Google conducted a test a couple years back and found next to no correlation to HD temp and drive failure. Google did this over 5 years by recording every failure they had and took many variables into account to come up with probably one of the most comprehensive real world hd reliability tests.
Not "of little consequence". People tend to quickly jump to conclusions but in fact the data needs interpretation.
<snipping much angst>
Hi kids! For the physics of the problem, see http://en.wikipedia.org/wiki/Arrhenius_plot
"From the Google data, calculate the activation energy for temperature-induced hard drive failure"
Interesting idea. Do you have a link to some evidence that disk failures are governed by an Arrhenius relation?
Grain boundaries are physical phenomena rather than chemical. I thought the current evidence pointed to them behaving like a glass.
The Arrhenius relation is often applied to stochastic processes with some ill-defined temperature-related causality. Your grain boundary example is good. While there is a specific amount of energy required for a portion of the boundary to make a "break", and while the average thermal energy at room temperature may be well below that energy, there is still a finite probability that the break occurs because the Boltzmann relation provides some tiny fraction of the whole ensemble of 'boundary segments' at a temperature equal to the 'break temperature'.

It's like a problem I was once given in Statistical Mechanics:
a) Calculate the energy required for a penny sitting on a table to jump up and flip over.
b) Calculate the probability that the penny will do this spontaneously at room temperature.

The Arrhenius relation is a nice fall-back when the failure rate looks logarithmic.

--
Tony Alfrey
tonyalfrey@earthlink.net
"I'd Rather Be Sailing"
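For reference, the "calculate the activation energy" exercise comes down to the standard Arrhenius form written out below. The symbols are introduced here only for illustration: \lambda(T) is a failure rate at absolute temperature T, A a constant prefactor, E_a the activation energy, and k_B Boltzmann's constant. Whether drive failures follow this form at all is exactly the open question in this thread, so treat it as a sketch of the method and not a claim about the Google data.

```latex
% Arrhenius form for a thermally activated failure rate (assumed model)
\lambda(T) = A \exp\!\left(-\frac{E_a}{k_B T}\right)

% Taking logarithms gives a straight line in 1/T (the Arrhenius plot)
\ln \lambda(T) = \ln A - \frac{E_a}{k_B}\,\frac{1}{T}

% From rates \lambda_1, \lambda_2 at temperatures T_1 < T_2 (in kelvin)
\ln\frac{\lambda_2}{\lambda_1}
  = \frac{E_a}{k_B}\left(\frac{1}{T_1} - \frac{1}{T_2}\right)
\quad\Longrightarrow\quad
E_a = \frac{k_B \,\ln(\lambda_2/\lambda_1)}{\dfrac{1}{T_1} - \dfrac{1}{T_2}}
```

The last line is the two-point version of reading the slope off an Arrhenius plot. It only means something if \ln\lambda is actually linear in 1/T over the range used.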
Tony Alfrey wrote:
The Arrhenius relation is a nice fall-back when the failure rate looks logarithmic.
So no actual evidence then :-P
Dave Howorth wrote:
So no actual evidence then :-P
I'm not sure what "evidence" you would feel comfortable with. If I were to take the Google data and fit it to an Arrhenius plot and calculate an activation energy that was roughly consistent with what one might expect for one of the many possible temperature-related failures in a disk drive, would that be appropriate? This is, by the way, how one does science.

If you'd like, I'll do the calc and submit the paper as a letter to Jour. of Phys. C. If the paper gets published, does that then make it "evidence"? If someone else publishes the paper, is that then better evidence than *my* paper? Do you reject the idea that an Arrhenius relationship is appropriate? Do *you* have an alternate proposal? Do you, perhaps, not understand the qualitative analysis I presented above? <sigh>

--
Tony Alfrey
tonyalfrey@earthlink.net
"I'd Rather Be Sailing"
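A minimal sketch of the fit described above, assuming the failure rate follows r = A*exp(-Ea/(kB*T)): a least-squares line through ln(rate) versus 1/T gives Ea from the slope. The function name arrhenius_fit and the "temp_C rate" input format are made up for illustration, and no real failure data is included here.

```shell
#!/bin/bash
# arrhenius_fit (hypothetical helper): read "temp_C failure_rate" pairs on
# stdin, fit ln(rate) = ln(A) - Ea/(kB*T) by least squares, and print the
# activation energy Ea in eV.
arrhenius_fit() {
    awk '
        NF >= 2 && $2 > 0 {
            x = 1 / ($1 + 273.15)      # inverse absolute temperature, 1/K
            y = log($2)                # natural log of the failure rate
            n++; sx += x; sy += y; sxx += x * x; sxy += x * y
        }
        END {
            denom = n * sxx - sx * sx
            if (n < 2 || denom == 0) {
                print "need at least two distinct temperatures" > "/dev/stderr"
                exit 1
            }
            slope = (n * sxy - sx * sy) / denom   # equals -Ea/kB
            kB = 8.617333e-5                      # Boltzmann constant, eV/K
            printf "%.3f\n", -slope * kB
        }'
}
```

Feed it one "temperature failure-rate" pair per line, for example `arrhenius_fit < rates.txt`, where rates.txt is a hypothetical file. A poor straight-line fit, or a negative result, would argue against a simple Arrhenius picture.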
So far what you've said is pure speculation. I'd happily accept normal scientific evidence, which is as you say, a paper published in a peer-reviewed journal. Or even a textbook. Since you say that you haven't yet fitted the data yourself, you clearly don't have any evidence of your own to support the statements you made so you would have to be relying on a pre-existing published paper. And only pre-existing papers have been through the peer-review process that helps to confirm their conclusions (though subsequent replication is even better, of course).

FWIW, I didn't see any data in the Google paper itself that supported the relationship you suggest - perhaps you'd point it out if I missed it. Not that I saw anything that denied the relationship either. But I'm sure that other equations could be made to fit the few data points there and the flying spaghetti monster could perhaps tell us why if we asked nicely.

So I'm firmly of the opinion that additional external evidence is required to support an unequivocal assertion like
Hi kids! For the physics of the problem, see http://en.wikipedia.org/wiki/Arrhenius_plot
Cheers, Dave
Dave Howorth wrote: <snip>
So far what you've said is pure speculation.
<snip> Yes. You're right. I think I'll go have lunch now.

--
Tony Alfrey
tonyalfrey@earthlink.net
"I'd Rather Be Sailing"
FWIW, I didn't see any data in the Google paper itself that supported the relationship you suggest - perhaps you'd point it out
All of which misses the point. Perhaps the correlation [except in the extreme] of heat to drive failure is low; but the COST of an 80mm fan is also low. The chances of getting in an automobile accident are [individually] very low each time I get into my car - but I always put on my seat belt because the inconvenience [cost] of using the seat belt is very low. If a $4 fan makes a very minor improvement in drive failure probability - then it is $4 well spent.

And the "in the extreme" of this does matter. If the cooling at the data center fails you can get extreme temperatures quickly; in that situation the $4 fan helps to mitigate that extreme.
My original assumption about disk failure was that they would fail statistically (versus temperature) like any other electronic component. But this assumption is suspect. While the Google data are messy (we don't know anything about the drive populations, or whether they were purchased in blocks at one time from one manufacturer or another), it is hard not to conclude from the Google data that there is an optimal operating temperature for a drive (about 45 C). If your $4 fan keeps the drive temp at 30 C, the Google data indicate that it may cut drive lifetime in half, roughly equivalent to the failure rate at 50 C.

So it certainly forces *me* to want to reexamine my assumptions about drives.

--
Tony Alfrey
tonyalfrey@earthlink.net
"I'd Rather Be Sailing"
Hello, On Wed, 17 Jun 2009, David C. Rankin wrote:
#!/bin/bash for i in $(cat /proc/partitions | egrep sd[abcdefgh]$ | sed -e 's/^.*s/s/');
Useless use of cat. Useless use of (e)grep. sed can do all that.

sed -n '/sd[a-h]$/{s/^.*s/s/;p;}' /proc/partitions

or even better:

====
for device in \
    $( awk '$4 ~ /^[sh]d[a-z]$/{print $4;}' /proc/partitions ); \
do \
    echo "/dev/${device}"; \
done
==== backslashes and semicolons added for easy copy & paste use ====

which should work with plain old IDE, SATA, and both with libata. This assumes (I won't boot the new box for this now) that the device is still field 4 in /proc/partitions on current 2.6 kernels.
do hddtemp /dev/$i done
I use a little script directly calling smartctl.

====
BEGIN {
    i = 0;
    # collect whole-disk devices (sdX / hdX) from field 4
    while ( (getline < "/proc/partitions") > 0 ) {
        if( $4 ~ /^[sh]d[a-z]$/ ) { devs[i++] = $4; }
    }
    for(d in devs) {
        sc = "smartctl -d ata -A /dev/" devs[d];
        mf = "/proc/ide/" devs[d] "/model";
        model = "unknown";   # do not reuse the model of the previous drive
        getline model < mf;
        close(mf);
        while( (sc | getline) > 0 ) {
            # SMART attribute 194 is the temperature; raw value is field 10
            if( $1 == 194 || $2 ~ /emper/ ) {
                printf "%s: %d °C [%s]\n", devs[d], $10, model;
                break;
            }
        }
        close(sc);
    }
}
====

BTW: it'd be no problem converting the temperature from °C to °F within that one awk -- and even adding an option for that ;)

-dnh, planning to look into what 'hddtemp' actually does internally

--
We all know Linux is great... it does infinite loops in 5 seconds.
        -- Linus Torvalds
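The °C to °F conversion mentioned above could be sketched as a small filter. This is only an illustration, assuming input lines in the hddtemp style shown at the top of the thread (for example "/dev/sdb: ST3250410AS: 55°C"); the function name c_to_f is made up here.

```shell
#!/bin/bash
# c_to_f (hypothetical helper): rewrite the trailing "NN°C" field of each
# hddtemp-style line as Fahrenheit. Lines that do not end in a Celsius
# reading are passed through unchanged.
c_to_f() {
    awk -F': ' -v OFS=': ' '
        $NF ~ /^-?[0-9]+.*C$/ {
            # "$NF + 0" keeps only the leading number of e.g. "55°C"
            $NF = sprintf("%.1f°F", ($NF + 0) * 9 / 5 + 32)
        }
        { print }'
}
```

With the 'hdtemp' script from the original post installed, this would be used as `hdtemp | c_to_f`.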
participants (15)
- Adam Tauno Williams
- Bob Williams
- Boyd Stephen Smith Jr.
- Brian K. White
- Carlos E. R.
- Dave Howorth
- David Bolt
- David C. Rankin
- David Haller
- Dean Hilkewich
- Miguel Medalha
- Mike
- Patrick Shanahan
- Randall R Schulz
- Tony Alfrey