[opensuse] New Generation HDDs in RAID arrays [Was: using "surveillance" disks in a raid set]
Starting over on this topic.

While I wasn't looking, HDDs got a major make-over, and it's a drastic change. This applies only to 5TB and bigger drives from what I can see at this point. And not all of those.

Anyway, I looked further into what makes a surveillance HDD different from other types. To my surprise, in the large disk (6/8/10 TB) market Seagate now offers:

- Archive HDDs
- Surveillance HDDs (Skyhawk)
- NAS HDDs (IronWolf)
- Performance HDDs (Barracuda)

The reason seems to be they are incorporating SMR (shingled magnetic recording) storage tech.

This YouTube video gives what looks like a good explanation of SMR starting 4 minutes in: https://www.youtube.com/watch?v=CR_bfbOTY1o

But he doesn't explain that SMR isn't used on 100% of the platter. If it were, half the entire drive would have to be rewritten with every write. I.e., when an SMR sector is written, it corrupts one of the sectors next to it. When you re-write the corrupted sector, it corrupts the next one, and before you're done fixing the corruptions you've rewritten half the drive just to write one sector in the middle of the drive.

Per http://www.anandtech.com/show/10470/the-evolution-of-hdds-in-the-near-future... the new Seagate drives are interlacing SMR sections of the platter with traditional perpendicular recording. Every time a sector in an SMR region is written, the corruption in that entire SMR zone has to be fixed, but the corruption process ends at the edge of the particular SMR zone.

The "Archive HDDs" have the highest ratio of SMR to perpendicular. In some cases a sector write can apparently take up to 20 seconds to complete, because the SMR zones are so big and a write at the start of a zone forces a very large part of it to be re-written.

For some (most? all?) SMR drives, a portion of the platter is used as a write cache. Say it's 10GB of write cache: when you write the first 10 GB of a data-writing session, it goes straight to that write cache at traditional write speeds. Then, when there is idle time, the disk controller moves the data out to the much slower SMR regions.

So, Seagate is apparently tuning the ratio and configuration of SMR areas and perpendicular recording based on use cases.

The above YouTube video documents someone (like me) trying to use SMR drives not designed for RAID use in a RAID array. The array apparently started kicking the SMR drives out aggressively because of the very slow I/O speeds. He got lucky and didn't have any data loss, but you clearly could if too many drives get kicked out of the array too quickly.

Hope that's informative,
Greg
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
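[Editor's note: the ripple-rewrite behaviour described above can be illustrated with a toy model. This is purely a sketch of the concept, not how any real drive firmware works; the class and numbers are hypothetical.]

```python
# Toy model of a shingled (SMR) zone: writing track i clobbers the
# track shingled on top of it, so a write into the zone forces
# rewriting everything from that point to the end of the zone.
# Purely illustrative, not real firmware.

class SMRZone:
    def __init__(self, tracks):
        self.data = [b""] * tracks
        self.rewrites = 0          # count of tracks physically rewritten

    def write_track(self, i, payload):
        # Read out everything the shingled write would damage...
        tail = self.data[i + 1:]
        self.data[i] = payload
        self.rewrites += 1
        # ...then restore it, track by track, rippling to the zone edge.
        for j, old in enumerate(tail, start=i + 1):
            self.data[j] = old
            self.rewrites += 1

zone = SMRZone(tracks=1000)
zone.write_track(0, b"one sector's worth")
print(zone.rewrites)   # 1000: one logical write, a whole zone rewritten
```

With zone boundaries, the ripple stops at 1000 tracks instead of half the platter; the perpendicular write-cache region exists precisely so the host never has to wait for this rewrite.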
On 12/23/2016 01:03 PM, Greg Freemyer wrote:
Wow. Talk about a fraud against the consumer, marketed as a feature. If a technology isn't ready for prime time, maybe it needs to age on the shelf for a few years before deployment.
--
After all is said and done, more is said than done.
Greg Freemyer wrote:
The above youtube documents someone (like me) trying to use SMR drives not designed for RAID use in a raid array. The raid array apparently started kicking the SMR drives out of the array aggressively because of the very slow i/o speeds.
Thanks for the summary. I'd like to add: don't use desktop drives for a RAID. **If** they work (like under a linux SW RAID, instead of w/a HW RAID), they'll be slower and strained. Someone asks why...

Not knowing about this, some time ago, I got some good desktop drives for a 5-disk SW RAID5. It wasn't more than a few months before it went bad, and performance was barely over a 1-disk write rate (1 disk ~ 100-120 MB/s linear write), while the 4-data-disk RAID couldn't read faster than 200-300 nor write faster than about 150. Very poor performance, IMO.

I accidentally ordered about 24 Hitachi Deskstars instead of Ultrastars (the seller mixed them in w/offers for Ultrastars and might have been hoping no one would notice, not sure). But I decided to try them anyway. Out of 24 drives, 4 were "good" for HW RAID on an LSI card. I don't know *all* of the tests the LSI cards do on disks to determine good/bad, BUT at least one it must do is whether or not the disks spin at the rated speed.

In copying data, I found that the disks varied in spin speed by 20-25% in the worst cases. I.e. instead of 7200 RPM, the slow disks came in at about 6480 and the fast disks came in at about 8280. A single-disk desktop user probably wouldn't have noticed the difference -- maybe in the extremes, but putting them in a RAID... One disk spins almost 28% faster than the other, meaning any writes you do to the disks won't be able to be written out at the same time, since the area to be written won't line up (due to the constant speed differential).

Sent all the disks back (I ate shipping), and ordered new Ultrastars -- more expensive, but worth it (especially now w/5-year replacement warranty). Tested them -- zero were kicked out as non-conforming and all of them ran at the same speed (I assume 7200, but no way to directly measure, really).

If you really want the speed in your RAID -- even RAID1 -- get the more expensive enterprise disks. Imagine a 2-disk RAID1 where the disks have a 25% speed differential. That means they only come back into rotational alignment about every 4 revolutions of the slower disk, or roughly 30 times/second. Compare that ~33ms to listed seek times (<10ms) and you can see the possible loss for writes (reads shouldn't be affected, as only one disk needs to be read). But if you go to a RAID0 or RAID10, RAID speed relies on being able to read & write to each disk in the same amount of time. If disks are off by 25%... this is not good!

Let the buyer beware!
-l
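[Editor's note: a quick back-of-the-envelope check of the alignment argument, using the measured extremes quoted above (6480 and 8280 RPM). An idealized sketch; real drives also drift in speed.]

```python
# Measured extremes from the batch of Deskstars described above.
slow_rpm = 6480
fast_rpm = 8280

slow_rps = slow_rpm / 60     # 108 revolutions/second
fast_rps = fast_rpm / 60     # 138 revolutions/second

# The platters come back into rotational alignment each time the
# fast drive gains one full revolution on the slow one.
alignments_per_second = fast_rps - slow_rps        # 30.0
interval_ms = 1000 / alignments_per_second         # ~33.3 ms

print(alignments_per_second, round(interval_ms, 1))  # prints 30.0 33.3
```

So a mirrored write that must land on matching positions of both platters can face a worst-case wait of tens of milliseconds, several times a typical seek time.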
On 23/12/16 23:23, L.A. Walsh wrote:
Thanks for the summary. I'd like to add. Don't use desktop drives for a RAID. **If** they work (like under a linux SW RAID, instead of w/a HW RAID), they'll be slower and strained. Someone asks why...
There are other good reasons too... Do a "smartctl -x /dev/sdX" on them. Do they support SCT/ERC?

The linux kernel assumes (by default) a drive will respond within 30 seconds. Desktop drives typically take up to 120 seconds if there's a glitch, AND THIS CANNOT BE CHANGED. So the linux kernel times out, the raid code tries to "fix" the disk and fails, and the drive gets kicked for not responding.

If the drive supports SCT/ERC, you can change the time the drive takes to time out - typically it's set at 7 seconds. So SCT/ERC drives keep responding to the kernel, and don't get kicked.

Barracudas have a particularly poor reputation on the raid mailing list because they are nice drives, people seem to buy them a lot, and they do NOT support SCT/ERC, so arrays keep on failing. Made worse by the fact that the 3TB model apparently has a design fault that kills drives... (although this may have been fixed by now).

Cheers,
Wol
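[Editor's note: the checks and settings described above can be done with commands along these lines. Replace sdX with the actual device; values are the commonly suggested ones, not universal defaults.]

```shell
# Check whether the drive supports SCT Error Recovery Control
smartctl -l scterc /dev/sdX

# If supported, cap read/write error recovery at 7 seconds
# (smartctl takes tenths of a second, so 70 = 7.0 s)
smartctl -l scterc,70,70 /dev/sdX

# If the drive does NOT support SCT/ERC, raise the kernel's SCSI
# command timeout instead (default 30 s) so the kernel outlasts
# the drive's internal retries
echo 180 > /sys/block/sdX/device/timeout
```

Note the SCT/ERC setting is typically lost on power cycle, so it is usually reapplied from a boot script or udev rule.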
Sent all the disks back (I ate shipping), and ordered new Ultrastars -- more expensive, but worth it (especially now w/5-year replacement warranty). Tested them -- zero were kicked out as non-conforming and all of them ran at the same speed (I assume 7200, but no way to directly measure, really).
Well, you actually may:

- Just get a not so nice microphone.
- Spin up your HDD - just spin it up, no I/O activity.
- Then put your microphone really close to the HDD and record the sound it makes.

If you see a somewhat constant frequency of 90Hz, then it's a 5400 RPM HDD. If you see a somewhat constant frequency of 120Hz, then it's a 7200 RPM HDD. Basically, just multiply your recorded frequency by 60, and you get your RPM.

Hope it helps you,
Cheers,
--
Rui Santos
Veni, Vidi, Linux
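[Editor's note: the frequency-times-60 conversion above can be automated with an FFT over the recording. A sketch using numpy; the synthetic 120 Hz tone stands in for a real microphone capture.]

```python
import numpy as np

def estimate_rpm(samples, sample_rate):
    """Estimate spindle RPM from audio of an idle drive: find the
    dominant frequency via FFT and multiply by 60 (rev/s -> rev/min)."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    # Ignore DC and rumble below any plausible spindle tone.
    mask = freqs >= 60.0
    dominant = freqs[mask][np.argmax(spectrum[mask])]
    return dominant * 60.0

# Synthetic check: a pure 120 Hz tone should read as a 7200 RPM drive.
rate = 44100
t = np.arange(rate) / rate          # one second of "audio"
tone = np.sin(2 * np.pi * 120 * t)  # stand-in for the recorded hum
print(round(estimate_rpm(tone, rate)))  # prints 7200
```

With a real recording, fan noise adds its own peaks, so in practice you would look for the strongest peak near 90 or 120 Hz rather than the global maximum.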
Rui Santos wrote:
Well, you actually may: - Just get a not so nice microphone
As soon as I saw that, I knew what you were suggesting. Excellent! Not exactly sure how one would determine the recorded frequency, but that's a matter of finding the right tool. Likely, googling for such would make it clearer. Great idea!
-l

p.s. -- add in the various fans, and no wonder I often think I hear music playing in the room where my server is (until I try to isolate the source and then realize it's the computer HW).
participants (6)
- Greg Freemyer
- John Andersen
- L A Walsh
- L.A. Walsh
- Rui Santos
- Wols Lists