[opensuse] bigger disks, bigger risks?
Just thinking out loud: when a drive in a 2x500GB RAID1 array fails and is replaced, how long does it take for the array to recover? I can't help worrying about the length of time in degraded mode - the longer it lasts, the higher the risk of a second drive failure. What do people do to overcome this - run 3- or 4-drive RAID1? /Per -- /Per Jessen, Zürich
Per Jessen wrote:
Just thinking out loud:
when a drive in a 2x500GB RAID1 array fails and is replaced, how long does it take for the array to recover? I can't help worrying about the length of time in degraded mode - the longer it lasts, the higher the risk of a second drive failure. What do people do to overcome this - run 3- or 4-drive RAID1?
How long it takes for the RAID to recover depends a lot on the load of the server. If it is not too busy, it shouldn't take more than 4-8 hours. Granted, within that span of time the disks are under heavy load and the RAID is in degraded mode. Our new mailserver is indeed running on a RAID10 (4 x RAID1); this is to reduce the striping overhead and maximize I/O throughput. Disks are cheap, so why not take advantage of that? And of course: don't mistake RAID for a backup solution. (^-^) -- Sandy
Sandy Drobic wrote:
Per Jessen wrote:
Just thinking out loud:
when a drive in a 2x500GB RAID1 array fails and is replaced, how long does it take for the array to recover? I can't help worrying about the length of time in degraded mode - the longer it lasts, the higher the risk of a second drive failure. What do people do to overcome this - run 3- or 4-drive RAID1?
How long it takes for the RAID to recover depends a lot on the load of the server. If it is not too busy, it shouldn't take more than 4-8 hours. Granted, within that span of time the disks are under heavy load and the RAID is in degraded mode.
I'm just now running a recovery of a 2 x 1TB RAID1 - the estimated time to completion is currently 420 minutes, so 7 hours. That's _much_ too long in degraded mode, IMHO.
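For anyone wanting to watch or nudge a rebuild like this, md exposes the knobs in /proc - a minimal sketch; the value below is only an example:

  # progress and estimated time to completion
  cat /proc/mdstat

  # per-device rebuild rate limits in KB/s; raising the minimum
  # pushes the rebuild along even under normal I/O load
  cat /proc/sys/dev/raid/speed_limit_min
  echo 50000 > /proc/sys/dev/raid/speed_limit_min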
Our new mailserver is indeed running on a RAID10 (4 x RAID1); this is to reduce the striping overhead and maximize I/O throughput. Disks are cheap, so why not take advantage of that?
Absolutely - but a RAID10 is still vulnerable to a dual-drive failure, and that's where I see the bigger risk when it takes 7 hours for a RAID1 to recover. If you don't have a hot spare, add to that whatever time it would take to replace the failed drive.
And of course: don't mistake RAID for a backup solution. (^-^)
Backup is not always an option. Back to the dual-drive failure situation - anyone running 3- or 4-drive RAID1s? -- /Per Jessen, Zürich
Per Jessen wrote:
Sandy Drobic wrote:
Just thinking out loud: when a drive in a 2x500GB RAID1 array fails and is replaced, how long does it take for the array to recover? I can't help worrying about the length of time in degraded mode - the longer it lasts, the higher the risk of a second drive failure. What do people do to overcome this - run 3- or 4-drive RAID1?
How long it takes for the RAID to recover depends a lot on the load of the server. If it is not too busy, it shouldn't take more than 4-8 hours. Granted, within that span of time the disks are under heavy load and the RAID is in degraded mode.
I'm just now running a recovery of a 2 x 1TB RAID1 - the estimated time to completion is currently 420 minutes, so 7 hours. That's _much_ too long in degraded mode, IMHO.
In that case you have to use smaller, faster disks, and more of them. SATA disks don't get a lot faster than 100 MB/s, so the recovery time stretches with the capacity of the disk. Use 250 GB drives of the same series, and the recovery time will be almost a quarter of that of the bigger disk. If that still doesn't meet your requirements, forget about RAID1.
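Rough numbers, assuming a sustained rebuild rate of around 60 MB/s: 500 GB takes about 8,300 seconds, or roughly 2.3 hours; 1 TB about twice that; a 250 GB drive a little over an hour. Real rebuilds on a loaded server will be slower, as the 7-hour estimate above shows.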
Our new mailserver is indeed running on a RAID10 (4 x RAID1); this is to reduce the striping overhead and maximize I/O throughput. Disks are cheap, so why not take advantage of that?
Absolutely - but a RAID10 is still vulnerable to a dual-drive failure, and that's where I see the bigger risk when it takes 7 hours for a RAID1 to recover. If you don't have a hot spare, add to that whatever time it would take to replace the failed drive.
Only if the second drive to fail is the other RAID partner. For such a mission-critical server as you propose, I wouldn't use SATA, only SAS, and I would probably mirror the server as well.
And of course: don't mistake RAID for a backup solution. (^-^)
Backup is not always an option.
??? WTF?? I agree, backup is only the last layer of defense against data loss, but no backup at all is simply not acceptable.
Back to the dual-drive failure situation - anyone running 3- or 4-drive RAID1s?
Why not use RAID6, if you want the additional security? -- Sandy
Sandy Drobic wrote:
And of course: don't mistake RAID for a backup solution. (^-^)
Backup is not always an option.
??? WTF?? I agree, backup is only the last layer of defense against data loss, but no backup at all is simply not acceptable.
No need to swear, Sandy :-)
Back to the dual-drive failure situation - anyone running 3- or 4-drive RAID1s?
Why not use RAID6, if you want the additional security?
Yeah, thanks for reminding me of that. I've been doing a bit of research just now, and RAID6 looks like a likely option. -- /Per Jessen, Zürich
On Wed November 12 2008 2:57:34 am Per Jessen wrote:
Just thinking out loud:
when a drive in a 2x500GB RAID1 array fails and is replaced, how long does it take for the array to recover? I can't help worrying about the length of time in degraded mode - the longer it lasts, the higher the risk of a second drive failure. What do people do to overcome this - run 3- or 4-drive RAID1?
I recently had a drive fail on a 4-drive RAID. Actually (as both an experiment and because it works well), the drive that failed was part of 3 separate md RAID groups. The first was for the root system, the 2nd was for /home, and the last was a 1.3TB RAID 5 that used not only the first two drives common to the / and /home arrays (RAID 1s) but also the 2 extra drives of the 4 devices. As it turned out, /dev/sda crashed so badly that even the BIOS didn't recognize that the drive existed.

This machine was/is primarily used as my file server for my home LAN, so I often don't even look at the machine or directly log into it. One day I did (remotely) log into it; the machine seemed a little sluggish (not much), and I was going to do a system update (from YaST2) of many of the programs. I don't know why, but at some point I decided to reboot it, which is a rarity, as these machines normally run 24x7 for months on end. When it didn't allow me to reconnect remotely, I went into the other room where the machine physically resides, turned on the monitor and saw it had booted into a maintenance/rescue mode instead of the normal boot. That is where I found one of the drives had failed. To my horror, it was one of the drives with the RAID1 / (root) and /boot, and also part of the /home array, and I feared the worst.

I replaced the drive and the system booted quite normally and completely. I decided to check the array status with mdadm and other utilities and found that none of the arrays were using the new drive. I then realized I needed to repartition the new drive the same as the defective drive had been, and mdadm -a the partitions back into the 3 affected arrays. For the root/boot and home arrays, the -a and the subsequent rebuild was almost so fast that it was done before my finger left the <return> key. I thought something was wrong because it was so fast, but it was fine. Then I tackled the 1.3TB RAID 5 and did the same mdadm -a on it, and it immediately came back. Checking the status showed it was being rebuilt (albeit still available for use during the rebuild) and the estimated time to completion was 132 minutes for the 1.3TB of data. It completed totally successfully in just about 2 hours and 15 minutes, the total time to rebuild all three md RAID devices. I don't know how long it had been running in degraded mode, but the 2:15 total repair time was something that made me a believer.

Before my stroke, I taught OS theory, including RAID, to college students, but it was only a theoretical thing about how much RAID was actually worth. Now, it isn't theoretical anymore. RAID is NOT a backup mechanism - good backups are still the only 'protection' for recovery of lost or damaged data - but RAID has convinced me that it is a pretty bulletproof HARDWARE solution for the reliability issue. It often *can* eliminate the need to actually use a backup, because it helps prevent data damage in the first place (due to hardware, not pilot, errors). It is an integrity mechanism, not a backup mechanism, but if the hardware is 'solid' and pilot error is removed, often backups never get used because the data is always available. Backups are primarily 'cockpit' error recovery; RAID does a great job of providing hardware protection against loss. So, it is no longer theoretical; you need RAID both on the primary AND backup hardware devices. That gives you the best fault tolerance, both human and hardware.
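For anyone facing the same situation, the recovery boiled down to something like this (device and partition names here are only illustrative, not necessarily the exact ones from my machine):

  # copy the partition layout from the surviving disk (sdb)
  # to the blank replacement (sda)
  sfdisk -d /dev/sdb | sfdisk /dev/sda

  # hot-add the new partitions back into each degraded array;
  # md rebuilds them in the background
  mdadm /dev/md0 -a /dev/sda1   # / and /boot mirror
  mdadm /dev/md1 -a /dev/sda2   # /home mirror
  mdadm /dev/md2 -a /dev/sda3   # the 1.3TB RAID 5

  # watch the rebuild progress
  cat /proc/mdstat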
That 2:15 lesson is one well learned, and I wish I still taught, because I would now add another 'lab exercise' for my students and provide them with the same valuable lesson this 65-year-old coot just learned. Richard
Richard wrote:
Now, it isn't theoretical anymore. RAID is NOT a backup mechanism - good backups are still the only 'protection' for recovery of lost or damaged data - but RAID has convinced me that it is a pretty bulletproof HARDWARE solution for the reliability issue. It often *can* eliminate the need to actually use a backup, because it helps prevent data damage in the first place (due to hardware, not pilot, errors). It is an integrity mechanism, not a backup mechanism, but if the hardware is 'solid' and pilot error is removed, often backups never get used because the data is always available. Backups are primarily 'cockpit' error recovery; RAID does a great job of providing hardware protection against loss.
It's a slightly different topic, but running complete backups is far from always practical. The enormous amounts of disk-space and 24/7 production requirements make it virtually impossible to run complete backups. Backups are a last resort for when disaster strikes.
So, it is no longer theoretical; you need RAID both on the primary AND backup hardware devices. That gives you the best fault tolerance, both human and hardware.
What do you do about the risk of a dual-drive failure? RAID6 is one possible answer, but AFAIK it requires at least 5 disks, which is too many (for my situation). -- /Per Jessen, Zürich
Per Jessen wrote:
Richard wrote:
Now, it isn't theoretical anymore. RAID is NOT a backup mechanism - good backups are still the only 'protection' for recovery of lost or damaged data - but RAID has convinced me that it is a pretty bulletproof HARDWARE solution for the reliability issue. It often *can* eliminate the need to actually use a backup, because it helps prevent data damage in the first place (due to hardware, not pilot, errors). It is an integrity mechanism, not a backup mechanism, but if the hardware is 'solid' and pilot error is removed, often backups never get used because the data is always available. Backups are primarily 'cockpit' error recovery; RAID does a great job of providing hardware protection against loss.
It's a slightly different topic, but running complete backups is far from always practical. The enormous amounts of disk-space and 24/7 production requirements make it virtually impossible to run complete backups. Backups are a last resort for when disaster strikes.
On my new server (2.7 TB RAID) I use LVM snapshots and rsnapshot on another disk as backup. Very fast and reliable.
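In practice that boils down to something like this (volume, mount point and interval names are only examples; the interval has to match one defined in rsnapshot.conf):

  # freeze a consistent view of the data volume; the snapshot only
  # needs enough space to hold changes made during the backup window
  lvcreate --size 10G --snapshot --name nightly /dev/vg0/data
  mount -o ro /dev/vg0/nightly /mnt/snapshot

  # rsnapshot hard-links unchanged files against the previous run,
  # so only the deltas cost space on the backup disk
  rsnapshot daily

  umount /mnt/snapshot
  lvremove -f /dev/vg0/nightly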
So, it is no longer theoretical; you need RAID both on the primary AND backup hardware devices. That gives you the best fault tolerance, both human and hardware.
What do you do about the risk of a dual-drive failure? RAID6 is one possible answer, but AFAIK it requires at least 5 disks, which is too many (for my situation).
At least 4 disks are required for RAID6. -- Sandy
Sandy Drobic wrote:
Per Jessen wrote:
It's a slightly different topic, but running complete backups is far from always practical. The enormous amounts of disk-space and 24/7 production requirements make it virtually impossible to run complete backups. Backups are a last resort for when disaster strikes.
On my new server (2.7 TB RAID) I use LVM snapshots and rsnapshot on another disk as backup. Very fast and reliable.
Interesting, I may have to look into that. We currently run two storage arrays back-to-back, but that's only realistic as long as we're still on a SAN.
So, it is no longer theoretical; you need RAID both on the primary AND backup hardware devices. That gives you the best fault tolerance, both human and hardware.
What do you do about the risk of a dual-drive failure? RAID6 is one possible answer, but AFAIK it requires at least 5 disks, which is too many (for my situation).
At least 4 disks are required for RAID6.
Also very interesting - I'll have to look at that in detail. I really thought 5 disks was the minimum. -- /Per Jessen, Zürich
On Wed, November 12, 2008 11:23, Per Jessen wrote:
What do you do about the risk of a dual-drive failure? RAID6 is one possible answer, but AFAIK it requires at least 5 disks, which is too many (for my situation).
At least 4 disks are required for RAID6.
Also very interesting - I'll have to look at that in detail. I really thought 5 disks was the minimum.
RAID5 needs one parity disk's worth of capacity, RAID6 needs two. You need at least two data disks to be able to speak of RAID, and of course all data and parity are striped across all disks in the array. So you can do the math:

RAID5: n+1, n >= 2, gives a minimum of 3 disks
RAID6: n+2, n >= 2, gives a minimum of 4 disks

You could use a fifth disk as a hot spare. It's more reliable, but it's not a requirement. -- Amedee
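For the record, a minimal 4-disk RAID6 under Linux md would be created like this (a sketch; device names are placeholders):

  # two disks' worth of data + two of parity: usable capacity is
  # (4 - 2) x disk size, and any two drives may fail
  mdadm --create /dev/md0 --level=6 --raid-devices=4 \
        /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

  # the optional fifth disk as a hot spare
  mdadm /dev/md0 -a /dev/sde1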
On Wed November 12 2008 4:57:29 am Per Jessen wrote:
Richard wrote:
Now, it isn't theoretical anymore. RAID is NOT a backup mechanism - good backups are still the only 'protection' for recovery of lost or damaged data - but RAID has convinced me that it is a pretty bulletproof HARDWARE solution for the reliability issue. It often *can* eliminate the need to actually use a backup, because it helps prevent data damage in the first place (due to hardware, not pilot, errors). It is an integrity mechanism, not a backup mechanism, but if the hardware is 'solid' and pilot error is removed, often backups never get used because the data is always available. Backups are primarily 'cockpit' error recovery; RAID does a great job of providing hardware protection against loss.
It's a slightly different topic, but running complete backups is far from always practical. The enormous amounts of disk-space and 24/7 production requirements make it virtually impossible to run complete backups. Backups are a last resort for when disaster strikes.
Totally agree with you there. OFF-SITE backups are the only positive way to protect data with any degree of certainty, and no version of RAID will provide 'disaster' recovery. If you have a flood or hurricane and your machine is wiped off the face of the earth, it makes no difference *what* version of RAID 'protection' you have. It *is* impractical for most SOHO users to make timely backups of large devices like the new 1+TB drives now hitting the market. If the backup is physically on the same or a co-located machine, disaster recovery is largely a myth, because all copies of the data are often lost; and backing up a few TB of data in real time takes too much time in many cases, making off-machine/off-site backups nearly impossible as the media get larger. So, while RAID isn't the same as a good backup, it is *way* better than nothing at all in terms of practicality and reducing loss caused by *other* than 'cockpit' errors.
So, it is no longer theoretical; you need RAID both on the primary AND backup hardware devices. That gives you the best fault tolerance, both human and hardware.
What do you do about the risk of a dual-drive failure? RAID6 is one possible answer, but AFAIK it requires at least 5 disks, which is too many (for my situation).
Per, you can always toss more and more drives making the RAID more and more bulletproof, but simultaneous multiple drive failures fortunately are very rare. RAID 6 functionally is one of the more inexpensive ways to add some level of protection against multi-drive failure. If the data is so valuable that the 2-10 hours of degraded operation while rebuilding after a drive failure - and the exposure to a 2nd drive failure during that window - is unacceptable, then NO version of RAID (IMO) is going to be the right solution, because it doesn't protect against losses other than local hardware failure. By that I mean storm, theft, physical or malicious damage and any number of other causes outside of simple drive failure(s).

A SOHO should be able to take the risk that multiple drive failure will not occur in any given 24 hour (or so) period. A large corporation, or one like a bank, where *any* loss is potentially catastrophic, needs multiple-machine continuous backups, including at least one off-site machine, all with RAID 6 protection. This is the kind of backup the firms at the WTC on 9/11 used: while everything in the buildings was lost, including many computers with sensitive data, the off-site and transactional backups running continuously allowed little or no loss of data for the datasets so protected.

In my opinion Per, a 4 drive RAID 5 is exposed to 'degraded' operation very infrequently, and when it is, the odds of a 2nd drive failing are almost microscopic, so at a hardware level, a 3 or 4 drive RAID 5 is an acceptable risk. Again, if your data is valuable enough, invest in the extra drive(s) and use RAID 6 or even RAID 6 cascaded with RAID 1 'protection' of the entire array... what is that? RAID 61?
Richard
Richard wrote:
So, it is no longer theoretical; you need RAID both on the primary AND backup hardware devices. That gives you the best fault tolerance, both human and hardware.
What do you do about the risk of a dual-drive failure? RAID6 is one possible answer, but AFAIK it requires at least 5 disks, which is too many (for my situation).
Per, you can always toss more and more drives making the RAID more and more bulletproof, but simultaneous multiple drive failures fortunately are very rare.
The key problem I see is that whilst the risk has always been very low, it is slowly increasing due to the enormous amounts of space per drive.
A SOHO should be able to take the risk that multiple drive failure will not occur in any given 24 hour (or so) period.
Yep, that sounds entirely reasonable.
A large corporation, or one like a bank, where *any* loss is potentially catastrophic, needs multiple-machine continuous backups, including at least one off-site machine, all with RAID 6 protection. This is the kind of backup the firms at the WTC on 9/11 used: while everything in the buildings was lost, including many computers with sensitive data, the off-site and transactional backups running continuously allowed little or no loss of data for the datasets so protected.
There is a very wide gap between your SOHO above and the DR situation of a large corporation, with dual datacentres and all that. In between there are many smaller businesses who can easily afford to take care of the dual drive failure risk whilst they can't afford to protect themselves against a 747 landing on their datacentre :-)
In my opinion Per, a 4 drive RAID 5 is exposed to 'degraded' operation very infrequently,
Very true, but disk-space is so cheap that it's worth looking into. One reason I'm looking into 3-drive RAID1 is the write-performance penalty of RAID5/6.
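At least a 3-drive RAID1 is trivial to set up with md for testing - a sketch, with placeholder device names:

  # three-way mirror: every drive holds a full copy, so any two can
  # fail; reads are spread across drives, writes go to all three
  mdadm --create /dev/md0 --level=1 --raid-devices=3 \
        /dev/sda1 /dev/sdb1 /dev/sdc1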
and when it is, the odds of a 2nd drive failing are almost microscopic, so at a hardware level, a 3 or 4 drive RAID 5 is an acceptable risk. Again, if your data is valuable enough, invest in the extra drive(s) and use RAID 6 or even RAID 6 cascaded with RAID 1 'protection' of the entire array... what is that? RAID 61?
Too many drives involved - we are limited to 4 drives per system. -- /Per Jessen, Zürich
Per Jessen wrote:
One reason I'm looking into 3-drive RAID1 is the write-performance penalty of RAID5/6.
I've just been googling some and came across someone with an 8-drive RAID1. Which is step one - now I can move on to actually testing it. /Per -- /Per Jessen, Zürich
Per Jessen wrote:
Per Jessen wrote:
One reason I'm looking into 3-drive RAID1 is the write-performance penalty of RAID5/6.
I've just been googling some and came across someone with an 8-drive RAID1. Which is step one - now I can move on to actually testing it.
Grin! So, what capacity did he have left of these 8 drives? Was it a proprietary version like RAID 1E? -- Sandy
Sandy Drobic wrote:
Per Jessen wrote:
Per Jessen wrote:
One reason I'm looking into 3-drive RAID1 is the write-performance penalty of RAID5/6.
I've just been googling some and came across someone with an 8-drive RAID1. Which is step one - now I can move on to actually testing it.
Grin! So, what capacity did he have left of these 8 drives? Was it a proprietary version like RAID 1E?
I'm assuming he still only had 1/8th of the total capacity. He had 8 drives, with 2 x 8-drive RAID1 (/boot, /root) plus an 8-drive RAID6. I guess the RAID6 array is why he had that many drives. Here's the link: http://ubuntuforums.org/archive/index.php/t-840106.html -- /Per Jessen, Zürich
Hi all, I am missing one aspect of this discussion, one I learned about not too long ago: large HDDs have a bigger chance of a bad block. The maximum accepted bad-block rate is expressed in % of blocks, but as there are more blocks on a disk, there is a bigger chance of a bad block. I do not have in-depth knowledge of this, but I read (in the same article, which I can't find now) that a rebuild operation cannot continue once it has encountered a bad block. Is this true? Neil -- While working towards the future one should be ensuring that there is a future to work to. ** Hi! I'm a signature virus! Copy me into your signature, please! **
Neil wrote:
I am missing one aspect of this discussion, one I learned about not too long ago: large HDDs have a bigger chance of a bad block. The maximum accepted bad-block rate is expressed in % of blocks, but as there are more blocks on a disk, there is a bigger chance of a bad block.
I do not have in-depth knowledge of this, but I read (in the same article, which I can't find now) that a rebuild operation cannot continue once it has encountered a bad block.
Is this true?
I think that might well be true - ATA/IDE drives, especially the larger ones, have long been doing their own bad block remapping, so if a drive actually reports a bad block back to the OS, it's probably quite serious. -- /Per Jessen, Zürich
Per Jessen wrote:
Neil wrote:
I am missing one aspect of this discussion, one I learned about not too long ago: large HDDs have a bigger chance of a bad block. The maximum accepted bad-block rate is expressed in % of blocks, but as there are more blocks on a disk, there is a bigger chance of a bad block.
I do not have in-depth knowledge of this, but I read (in the same article, which I can't find now) that a rebuild operation cannot continue once it has encountered a bad block.
Is this true?
I think that might well be true - ATA/IDE drives, especially the larger ones, have long been doing their own bad block remapping, so if a drive actually reports a bad block back to the OS, it's probably quite serious.
Not probably - the disk is about to die. (^-^) If all internal sectors allocated for remapping are exhausted, you can bet that drive failure is very close and you should exchange the drive as fast as possible. That is why SMART monitors those remapping parameters. -- Sandy
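For reference, those counters can be read with smartmontools - a one-liner sketch, the device name being a placeholder:

  # non-zero and growing values here are the warning sign
  smartctl -A /dev/sda | egrep -i 'reallocated|pending|uncorrect'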
On Wed, Nov 12, 2008 at 6:08 AM, Sandy Drobic wrote:
Per Jessen wrote:
Neil wrote:
I am missing one aspect of this discussion, one I learned about not too long ago: large HDDs have a bigger chance of a bad block. The maximum accepted bad-block rate is expressed in % of blocks, but as there are more blocks on a disk, there is a bigger chance of a bad block.
I do not have in-depth knowledge of this, but I read (in the same article, which I can't find now) that a rebuild operation cannot continue once it has encountered a bad block.
Is this true?
Yes, and it is one of the reasons RAID 6 was developed (2 disks' worth of rebuild info to allow for a double failure; thus, after a single disk failure you still have 2 ways to recover every sector). So a small number of media errors on one of the drives can still be recovered from the other recovery source. Another solution is to mirror first and stripe parity on top: create a group of 2-disk mirrors, then from those build a RAID 5 (strictly speaking RAID 51 rather than RAID 50). Seems like overkill in most situations, but it will survive any possible 2-disk failure and conceivably up to half your drives + 1. My next array will likely be RAID 6. It is supported in Linux software RAID; not sure how long that has been true.
I think that might well be true - ATA/IDE drives, especially the larger ones, have long been doing their own bad block remapping, so if a drive actually reports a bad block back to the OS, it's probably quite serious.
Not necessarily. Drives only remap bad sectors on write, so sectors that you only read and that go bad will never be repaired by the hard drive itself. There has been discussion that Linux software RAID should scan an array in the background and, when it finds a media error, recreate the data from the rest of the array, then write the known-good data back to the bad sector, thus triggering the remapping logic of the drive itself. I don't know if this is part of current-generation kernels or not. Possibly it was just talk.
Not probably - the disk is about to die. (^-^)
False - drives experience media errors in the real world all the time. It just means the sector needs to be remapped.
If all internal sectors allocated for remapping are exhausted, you can bet that drive failure is very close and you should exchange the drive as fast as possible. That is why SMART monitors those remapping parameters.
The 'if' clause above was not part of the original question (i.e. 'If all internal sectors allocated for remapping are exhausted ...').
Greg
On Wed, Nov 12, 2008 at 11:01 AM, Greg Freemyer wrote:
Not necessarily. Drives only remap bad sectors on write, so sectors that you only read and that go bad will never be repaired by the hard drive itself. There has been discussion that Linux software RAID should scan an array in the background and, when it finds a media error, recreate the data from the rest of the array, then write the known-good data back to the bad sector, thus triggering the remapping logic of the drive itself.
I don't know if this is part of current-generation kernels or not. Possibly it was just talk.
I asked on the mdraid list. With Linux software RAID, any time a media error is experienced and the data can be recreated from the other drives, the original failed sector is re-written, on the assumption that this will trigger a re-mapping event. That was implemented around 2.6.16. So that is great, but the kernel does NOT proactively scan the drives looking for bad media sectors, thus a rarely read sector could go bad and stay that way for an extended time. Apparently you can implement a background scan and invoke it from cron as desired. In my opinion that is a best practice. I have always used 3ware-based arrays in the past, but my investigations into mdraid have been so positive that I am likely to switch in the future. HTH Greg
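For reference, the scan itself can be kicked off through sysfs on kernels of that era - a sketch, md0 being a placeholder:

  # ask md to read and verify every sector of the array;
  # progress shows up in /proc/mdstat
  echo check > /sys/block/md0/md/sync_action

  # afterwards, this counter reports any inconsistencies found
  cat /sys/block/md0/md/mismatch_cnt

  # e.g. in /etc/crontab, every Sunday at 01:00
  0 1 * * 0  root  echo check > /sys/block/md0/md/sync_action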
Per Jessen wrote:
In my opinion Per, a 4 drive RAID 5 is exposed to 'degraded' operation very infrequently,
Very true, but disk-space is so cheap that it's worth looking into. One reason I'm looking into 3-drive RAID1 is the write-performance penalty of RAID5/6.
Don't think too hard about it. The difference between RAID5 and RAID6 on a good hardware controller is not too big, as long as the controller has a BBU so it can use write-back instead of write-through caching. Unfortunately, I don't have a controller to spare for tests at the moment.
and when it is, the odds of a 2nd drive failing are almost microscopic, so at a hardware level, a 3 or 4 drive RAID 5 is an acceptable risk. Again, if your data is valuable enough, invest in the extra drive(s) and use RAID 6 or even RAID 6 cascaded with RAID 1 'protection' of the entire array... what is that? RAID 61?
Too many drives involved - we are limited to 4 drives per system.
In that case use 4 x 15k SAS drives of moderate capacity as RAID6, combined with a good controller (IBM or 3ware). If that doesn't provide a good enough compromise of two-disk redundancy and speed, you need external storage. -- Sandy
Sandy Drobic wrote:
and when it is, the odds of a 2nd drive failing are almost microscopic, so at a hardware level, a 3 or 4 drive RAID 5 is an acceptable risk. Again, if your data is valuable enough, invest in the extra drive(s) and use RAID 6 or even RAID 6 cascaded with RAID 1 'protection' of the entire array... what is that? RAID 61?
Too many drives involved - we are limited to 4 drives per system.
In that case use 4 x 15k SAS drives of moderate capacity as RAID6, combined with a good controller (IBM or 3ware). If that doesn't provide a good enough compromise of two-disk redundancy and speed, you need external storage.
Which is what I'm planning on getting rid of :-) We don't need a lot of space - 1-2TB in total will provide plenty of room for growth for a couple of years. Instead, reliability is paramount - we will quite probably be getting rid of the central SAN storage and moving to a distributed model. A neat little 1U server like an HP DL160 G5 has room for 4 SATA drives (or SAS with the right controller) - we're looking at getting a couple of that type (not necessarily the HP, though). -- /Per Jessen, Zürich
Per Jessen wrote:
Sandy Drobic wrote:
and when it is, the odds of a 2nd drive failing are almost microscopic, so at a hardware level, a 3 or 4 drive RAID 5 is an acceptable risk. Again, if your data is valuable enough, invest in the extra drive(s) and use RAID 6 or even RAID 6 cascaded with RAID 1 'protection' of the entire array... what is that? RAID 61?
Too many drives involved - we are limited to 4 drives per system.
In that case use 4 x 15k SAS drives of moderate capacity as RAID6, combined with a good controller (IBM or 3ware). If that doesn't provide a good enough compromise of two-disk redundancy and speed, you need external storage.
Which is what I'm planning on getting rid off :-)
We don't need a lot of space - 1-2TB in total will provide plenty of room for growth for a couple of years. Instead, reliability is paramount - we will quite probably be getting rid of the central SAN storage and moving to a distributed model. A neat little 1U server like an HP DL160 G5 has room for 4 SATA drives (or SAS with the right controller) - we're looking at getting a couple of that type (not necessarily the HP, though).
Strange - we are currently considering migrating to a SAN environment. What is the reason for your move to single-server storage, and what kind of SAN do you use? -- Sandy
Sandy Drobic wrote:
We don't need a lot of space - 1-2TB in total will provide plenty of room for growth for a couple of years. Instead, reliability is paramount - we will quite probably be getting rid of the central SAN storage and moving to a distributed model. A neat little 1U server like an HP DL160 G5 has room for 4 SATA drives (or SAS with the right controller) - we're looking at getting a couple of that type (not necessarily the HP, though).
Strange - we are currently considering migrating to a SAN environment. What is the reason for your move to single-server storage, and what kind of SAN do you use?
The reasoning for moving (it's not yet decided - we're exploring the options) is primarily that our existing SAN is getting too old. It's a 72-disk array with dual 1Gbit fibre, multi-pathing etc. It works really well, but as the disks (SCA-80) have all but gone out of production, we're forced to either upgrade or change. (Seagate is still making the drives, but I don't know for how long.) Secondary reasons are savings in electricity consumption and cooling needs. The SAN is great for high availability and flexibility, but I think we can achieve the same degree of availability with a distributed setup instead. It would mean a more wasteful use of disk-space, but that's largely irrelevant. -- /Per Jessen, Zürich
Per Jessen wrote:
There is a very wide gap between your SOHO above and the DR situation of a large corporation, with dual datacentres and all that. In between there are many smaller businesses who can easily afford to take care of the dual drive failure risk whilst they can't afford to protect themselves against a 747 landing on their datacentre :-)
Well, in-network storage is getting cheaper. If even that costs too much, an enterprising small business could find a similar business on another continent and do a deal. Each buys extra backup disk/tape but instead of making an extra copy of their own backup, they make a copy of their partner's backup. Cheers, Dave
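Mechanically, such an exchange could be as simple as a nightly rsync over ssh in each direction - a sketch, with invented host and path names:

  # push our backup set to the partner's box; -z compresses over the
  # WAN link, --delete keeps the remote mirror exact
  rsync -az --delete -e ssh /srv/backup/ \
        partner.example.com:/srv/partner-mirror/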
Dave Howorth wrote:
Per Jessen wrote:
There is a very wide gap between your SOHO above and the DR situation of a large corporation, with dual datacentres and all that. In between there are many smaller businesses who can easily afford to take care of the dual drive failure risk whilst they can't afford to protect themselves against a 747 landing on their datacentre :-)
Well, in-network storage is getting cheaper. If even that costs too much, an enterprising small business could find a similar business on another continent and do a deal. Each buys extra backup disk/tape but instead of making an extra copy of their own backup, they make a copy of their partner's backup.
Not a bad point - I think the main issue is the network cost involved. A 10Mbit fibre connection is still some SFr2000/month around here (I haven't actually checked recently), so unless you have a need for that for regular operation (we don't), those SFr2000 will buy you a lot of drives, monthly :-) Nonetheless, googling "peer-to-peer backup" led me to these two:

http://infoscience.epfl.ch/record/85619
http://www.ifi.uzh.ch/pax/web/index.php/publication/show/id/655

I know there is some Swiss start-up out there trying to make a business out of something along those lines, but I couldn't find it. /Per -- /Per Jessen, Zürich
On Wednesday 12 November 2008 03:42, Per Jessen wrote:
...
Nonetheless, googling "peer-to-peer backup" led me to these two:
http://infoscience.epfl.ch/record/85619 http://www.ifi.uzh.ch/pax/web/index.php/publication/show/id/655
I know there is some Swiss start-up out there trying to make a business out of something along those lines, but I couldn't find it.
For several months I've been using CrashPlan, which is a cross-system backup (backup to local drives is a feature that should appear by the end of this year). It can be used between any pair of systems. You pay for a license to back up one (or more, depending on version) system, but not to run a backup server (where backups are stored). I back up my primary development system to two others, one Linux and one Macintosh. All data is encrypted. No license is required to perform restores, only the password used to establish the backup in the first place. They have three functionality tiers: a consumer version with basic features, an extended version and a small-business version. I use the extended version and find it a pretty good balance of control and flexibility vs. complexity. It is available for "all three" OS platforms (it is a Java application). http://www.crashplan.com/
Randall Schulz
On Wed, November 12, 2008 11:33, Richard wrote:
In my opinion Per, a 4 drive RAID 5 is exposed to 'degraded' operation very infrequently, and when it is, the odds of a 2nd drive failing are almost microscopic, so at a hardware level, a 3 or 4 drive RAID 5 is an acceptable risk.
If all drives are of identical make, model, batch and production date, then the risk is no longer microscopic. Also, what *caused* the drive failure? If it's something like overheated hardware that makes one drive fail, the next drive will fail even sooner. -- Amedee
On Thu November 13 2008 6:40:15 am Amedee Van Gasse wrote:
On Wed, November 12, 2008 11:33, Richard wrote:
In my opinion Per, a 4 drive RAID 5 is exposed to 'degraded' operation very infrequently, and when it is, the odds of a 2nd drive failing are almost microscopic, so at a hardware level, a 3 or 4 drive RAID 5 is an acceptable risk.
If all drives are of identical make, model, batch and production date, then the risk is no longer microscopic. Also, what *caused* the drive failure? If it's something like overheated hardware that makes one drive fail, the next drive will fail even sooner.
There is merit in your caution; however, I chose a single manufacturer with a product that is NOT the latest, biggest drive they offer, opting for one whose design has been 'proven' (to me at least, over a period of many years). I use WD 400GB drives because in many years I have had only 1 failure out of well over 20 drives - some retired because I simply needed more space, others still in use. That one failure was handled within 72 hours by them with no questions asked, and our drives 'crossed in the mail', quite literally. The 400GB drives were big enough in a 4-drive RAID 5 array to be useful, and cheap enough to put the 'I' in RAID, which to me has always meant Inexpensive Devices in the Redundant Array acronym.

Proven design, a reliable company and personal experience all pointed to going against your advice when I constructed my first RAID 2 years ago. Since then, I have created a 2nd and 3rd in my computers at home. Shoot, I now have 12 drives in arrays at home, and added to the previous 20 or so WD drives, I have a good feel for WD and that Caviar series of drives. I have recently had my 2nd failure of a drive: I had lost the AC in the room housing the machine with 9 drives in it, and that drive simply got too hot and failed. Even so, I maintain that a true simultaneous multi-drive failure is a microscopic, albeit non-zero, chance and worth the risk for home and most SOHO environments. I believe in good backup technology for the small, non-zero-chance event.

For larger businesses with more critical storage needs and more critical reliability requirements, I would recommend RAID 6 coupled with real-time (or nearly so) backups to an outside/external system. For really critical stuff, separate but real-time co-datacenters give that true warm fuzzy feeling about reliability and security. When I was responding to the OP (Per Jessen), I did not know what his requirements were, so I was forced to generalize, but I think the advice I offered was sound. Your advice/cautions are well taken, especially given that people often buy the cheapest, biggest, least-proven drives they can find. They often pay the higher price in the long run due to the very issues you brought up. Richard
On Wed, Nov 12, 2008 at 2:57 AM, Per Jessen wrote:
Just thinking out loud:
when a drive in a 2x500GB RAID1 array fails and is replaced, how long does it take for the array to recover? I can't help worrying about the length of time in degraded mode - the longer it lasts, the higher the risk of a second drive failure. What do people do to overcome this - run 3- or 4-drive RAID1?
I've seen a lot of relatively new drives failing in the last 6 months, all 500GB or bigger. I even had a double disk failure within the recovery window. Not pretty. The data was not yet on tape, so we had to swap the IDE electronics from a good drive onto one of the failed drives, then copy off the data (raw sectors) to another good drive.

I'd recommend sticking with slightly older technology for critical RAID arrays for now. In particular, Seagate has a firmware issue in their 1.5TB drives that causes them to burp for 30 seconds every now and then. Linux software RAID sees that as a disk failure and boots the drive. If you then get another 30-second burp on one of the remaining drives, you lose your RAID-5 array.

If you have to use 500GB or bigger drives in arrays, be sure to test the models you are using heavily before putting them in service. The issues I have personally seen all took place in the first 100 or so hours of heavy disk I/O. (Not true of those 1.5TB Seagates.) Greg
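A simple burn-in before trusting a new drive might look like this (a sketch; /dev/sdX is a placeholder, and badblocks -w is destructive, so run it before the drive holds data):

  # four full write + read-back passes over every sector
  badblocks -wsv /dev/sdX

  # then a long SMART self-test, and check the error counters after
  smartctl -t long /dev/sdX
  smartctl -A /dev/sdX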
participants (8)
- Amedee Van Gasse
- Dave Howorth
- Greg Freemyer
- Neil
- Per Jessen
- Randall R Schulz
- Richard
- Sandy Drobic