Optimal RAID setup - opinions invited.
All,
probably a little OT here, but I _am_ using SUSE Linux :-)
I'm trying to work out the optimal (or near-optimal at least) RAID configuration of a 24-disk array. The array comes with two redundant RAID controllers each with six SCSI channels. These controllers allow all kinds of RAID0/1/5 configurations, but no RAID6. I want the array to be able to survive a two disk failure, so RAID6 would be the obvious choice, but ...
So I'm sort of looking at choosing between -
- using plain software RAID6 and ignoring the hardware RAID facilities of the array.
- using the hardware controllers to build a combination of RAID0/1/5 that'll give me the two-drive-failure survivability.
I've had a look around the web, but googling for "two drive failure RAID" almost always leads to someone talking about RAID6 ...
So, opinions/suggestions?
/Per Jessen, Zürich
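For the software-RAID6 route mentioned above, the Linux side would look something like this with mdadm - a minimal sketch, assuming the controllers can export the 24 drives as plain disks and that they show up as /dev/sdb through /dev/sdy (device names are hypothetical):

  # RAID6 across 22 drives plus 2 hot spares; needs a 2.6 kernel with the
  # raid6 md personality. Net capacity = 20 drives, survives any two failures.
  mdadm --create /dev/md0 --level=6 --raid-devices=22 --spare-devices=2 /dev/sd[b-y]

The array is usable while the initial background resync runs.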
Per Jessen wrote:
All,
probably a little OT here, but I _am_ using SUSE Linux :-)
I'm trying to work out the optimal (or near-optimal at least) RAID configuration of a 24-disk array. The array comes with two redundant RAID controllers each with six SCSI channels. These controllers allow all kinds of RAID0/1/5 configurations, but no RAID6.
I want the array to be able to survive a two disk failure, so RAID6 would be the obvious choice, but ...
So I'm sort of looking at choosing between -
- using plain software RAID6 and ignoring the hardware RAID facilities of the array.
- using the hardware controllers to build a combination of RAID0/1/5 that'll give me the two-drive-failure survivability.
I've had a look around the web, but googling for "two drive failure RAID" almost always leads to someone talking about RAID6 ...
So, opinions/suggestions?
/Per Jessen, Zürich
It depends on your controller, but hardware RAID should give you better performance anyway. Besides, maybe 24 disks is quite a lot, but what is the probability that a second disk dies before you replace the first one? And what is the probability that a second disk, out of a bunch of six, dies before you can replace the first? I would go with 2 raid5 hw-arrays. L.
Lorenzo Cerini wrote:
It depends on your controller, but hardware RAID should give you better performance anyway. Besides, maybe 24 disks is quite a lot, but what is the probability that a second disk dies before you replace the first one?
Yes, the failure of a 2nd disk while a degraded array is being rebuilt.
And what is the probability that a second disk, out of a bunch of six, dies before you can replace the first?
The array will have hot spares, so when a disk fails, it will automatically pick a spare from a set and start rebuilding. I have no idea of the actual probability of a 2nd drive failing during the rebuild, but the more drives the higher the risk.
I would go with 2 raid5 hw-arrays.
RAID15? /Per Jessen, Zürich
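The hot-spare behaviour described above is also available with Linux software RAID through mdadm's monitor mode - a sketch, assuming the arrays are declared in /etc/mdadm.conf with a shared spare-group (all names hypothetical):

  # /etc/mdadm.conf excerpt - arrays in the same spare-group lend each other
  # their spare drives when a member fails:
  #   ARRAY /dev/md0 UUID=... spare-group=pool1
  #   ARRAY /dev/md1 UUID=... spare-group=pool1
  # Run the monitor as a daemon; it mails on failure events and moves spares:
  mdadm --monitor --scan --daemonise --mail=root@localhost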
Per Jessen wrote:
Lorenzo Cerini wrote:
It depends on your controller, but hardware RAID should give you better performance anyway. Besides, maybe 24 disks is quite a lot, but what is the probability that a second disk dies before you replace the first one?
Yes, the failure of a 2nd disk while a degraded array is being rebuilt.
And what is the probability that a second disk, out of a bunch of six, dies before you can replace the first?
The array will have hot spares, so when a disk fails, it will automatically pick a spare from a set and start rebuilding. I have no idea of the actual probability of a 2nd drive failing during the rebuild, but the more drives the higher the risk.
I would go with 2 raid5 hw-arrays.
RAID15?
/Per Jessen, Zürich
Just to keep on watching probabilities, costs, disk space and overhead: if you RAID-1 the two RAID5 arrays, you lose 7 disks' worth of space. Besides any RAID scheme, it would be nice to have a daily backup. It's more likely that a bomb falls on your server farm and destroys everything than that you lose any sensitive data with 2 distinct RAID5 arrays and a backup policy. What I mean is that the gain of two RAID5s put in RAID1 (with a 7-disk space loss), over 2 simple RAID5 arrays with a 2-disk space loss plus a backup policy, is not significant. L.
-----Original Message-----
From: Per Jessen [mailto:per@computer.org]
Sent: Fri 3/24/2006 9:12 AM
To: suse-linux-e@suse.com
Subject: Re: [SLE] Optimal RAID setup - opinions invited.
I have no idea of the actual probability of a 2nd drive failing during the rebuild, but the more drives the higher the risk.
Well I can tell you from first-hand experience: if all of the drives were purchased at the same time, have all been running in the same machine for the same amount of time, and have all been accessed around the same number of times, your chances of a second drive failing during a rebuild are very high. Especially if you are getting close to the hard drives' MTBF. On two different machines I had a second drive, of an eight-drive array, fail during a rebuild. The SCSI drives in both servers were in their fourth year of five-year MTBF ratings. Both servers were purchased at the same time and both arrays crashed within two months of each other. Ken
Ken Gramm wrote:
-----Original Message-----
From: Per Jessen [mailto:per@computer.org]
Sent: Fri 3/24/2006 9:12 AM
To: suse-linux-e@suse.com
Subject: Re: [SLE] Optimal RAID setup - opinions invited.
I have no idea of the actual probability of a 2nd drive failing during the rebuild, but the more drives the higher the risk.
Well I can tell you from first-hand experience: if all of the drives were purchased at the same time, have all been running in the same machine for the same amount of time, and have all been accessed around the same number of times, your chances of a second drive failing during a rebuild are very high. Especially if you are getting close to the hard drives' MTBF.
On two different machines I had a second drive, of an eight-drive array, fail during a rebuild. The SCSI drives in both servers were in their fourth year of five-year MTBF ratings. Both servers were purchased at the same time and both arrays crashed within two months of each other.
Ken
As I said before, that's not the point. I was talking about the probability of a second disk failure before you can rebuild the first array, say within 24 hours. I think it's more likely that somebody forgets they're root and starts 'rm /var'. One disk failure plus backup is enough. L.
Lorenzo Cerini wrote:
As I said before, that's not the point. I was talking about the probability of a second disk failure before you can rebuild the first array, say within 24 hours.
I am hoping that a rebuild would take a lot less time than that :-) The longer the rebuild, the higher the risk of a second disk failure. And what Ken described would surely increase the risk too.
I think it's more likely that somebody forgets they're root and starts 'rm /var'. One disk failure plus backup is enough.
I have to disagree. Restoring a backup is an emergency recovery measure. The whole idea of being able to survive the two-disk failure is avoiding that particular emergency and being able to maintain operation without restoring backups etc. /Per Jessen, Zürich
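For a rough sense of the rebuild window - a back-of-envelope sketch in which both numbers (72GB drives, a sustained 40MB/s rebuild rate) are assumptions, not measurements:

  # minutes to rebuild one 72GB drive at a sustained 40MB/s
  echo "72 * 1024 / 40 / 60" | bc -l    # roughly 31 minutes

On a busy array the controller throttles the rebuild to leave room for normal I/O, so the real window can easily be several times that.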
Per Jessen wrote:
Lorenzo Cerini wrote:
As I said before, that's not the point. I was talking about the probability of a second disk failure before you can rebuild the first array, say within 24 hours.
I am hoping that a rebuild would take a lot less time than that :-) The longer the rebuild, the higher the risk of a second disk failure. And what Ken described would surely increase the risk too.
I think it's more likely that somebody forgets they're root and starts 'rm /var'. One disk failure plus backup is enough.
I have to disagree. Restoring a backup is an emergency recovery measure. The whole idea of being able to survive the two-disk failure is avoiding that particular emergency and being able to maintain operation without restoring backups etc.
/Per Jessen, Zürich
I have to disagree too. Wasting 7 disks' worth of space without any performance improvement is terrible. So go with RAID10, or buy a RAID6 card if you really believe you are going to have a double failure within a few hours. Otherwise have a good RAID5 and, if a disk fails, rebuild the array. L.
Ken Gramm wrote:
Well I can tell you from first-hand experience: if all of the drives were purchased at the same time, have all been running in the same machine for the same amount of time, and have all been accessed around the same number of times, your chances of a second drive failing during a rebuild are very high. Especially if you are getting close to the hard drives' MTBF.
My thoughts exactly. Which is why I'm concerned. /Per Jessen, Zürich
Per Jessen wrote:
Ken Gramm wrote:
Well I can tell you from first-hand experience: if all of the drives were purchased at the same time, have all been running in the same machine for the same amount of time, and have all been accessed around the same number of times, your chances of a second drive failing during a rebuild are very high. Especially if you are getting close to the hard drives' MTBF.
My thoughts exactly. Which is why I'm concerned.
I've been watching this thread for a while and it definitely seems to me that you are looking for an immediate backup to prevent data loss.
RAID is not a backup method, it is a method of storing data. You are going to have a drive failure, and recovering from it as quickly as possible is the key.
I would suggest that you write all your data onto a RAID 5 array. The data is then mirrored on another disk which is backed up to tape or optical media. The problem with what I have suggested is that all the disks would be bought at the same time and hence have the same MTBF, as they are all used equally. The time taken to write to the mirror disk and then to the optical device could result in data loss, so it does not provide the immediate backup solution you seek.
Trying to find a solution by using RAID as your backup medium is not going to work, as the disks used will all have the same MTBF. A possible trick to thwart this would be to copy onto flash memory, but unless you can have a mass of them connected as a single logical volume it cannot be done.
Given that disks are going to fail, perhaps the only solution is to use a rolling RAID 5+1 array, adding disks into and removing them from the RAID array. I have as an example used a 3-disk RAID 5 array, but you could quite possibly use a RAID array of 200 disks.
Start with 4 disks, 3 in RAID 5 and the 4th as the mirror drive of the data on the RAID array. A month down the line take out one of the RAID drives, as though it had failed, and replace it with a new one; store the 'faulty' disk for use later, the same way that a backup tape is stored before its MTBF point.
The following month buy another disk and again replace one of the RAID disks, i.e. the mirrored one, with the new one. Again store the 'failed' drive as before.
On month three 'fail' one of the drives and replace it with the 'failed' drive you took out in month 1. And so on, etc. etc.
The key is rotation, and to back up from the mirrored drive as it will be the most up-to-date copy of the data sitting on the RAID array. This is the only way I can see that your data will be safe from dual RAID drive failures. If that were to happen you could rebuild the RAID array from the mirrored drive, as it is a copy of the data on the RAID array. If that too fails, as it will without rotation, you will have to resort to your backup media.
HiH
--
Hylton Conacher - Linux user # 229959 at http://counter.li.org
Currently using SuSE 9.2 Professional with KDE
Hylton Conacher(ZR1HPC) wrote:
Per Jessen wrote:
Ken Gramm wrote:
Well I can tell you from first-hand experience: if all of the drives were purchased at the same time, have all been running in the same machine for the same amount of time, and have all been accessed around the same number of times, your chances of a second drive failing during a rebuild are very high. Especially if you are getting close to the hard drives' MTBF.
My thoughts exactly. Which is why I'm concerned.
I've been watching this thread for a while and it definitely seems to me that you are looking for an immediate backup to prevent data loss.
RAID is not a backup method, it is a method of storing data. You are going to have a drive failure, and recovering from it as quickly as possible is the key.
From what I saw he was aiming to set up a raid as robust as possible. The goal was rather to do as much as possible to keep the raid (and thus the services using the raid) up and avoid downtime and data loss.
If you need the backup you lose the data that changed after the backup until the time of the crash. You also have a downtime for your services.
I naturally assume that the data will be backed up, but that will only be the last resort. So the focus of the discussion is rather how to set up a raid as robust as possible with the available hardware.
Sandy
--
List replies only please!
Please address PMs to: news-reply2 (@) japantest (.) homelinux (.) com
----- Original Message -----
From: "Sandy Drobic"
Hylton Conacher(ZR1HPC) wrote:
Per Jessen wrote:
Ken Gramm wrote:
Well I can tell you from first-hand experience: if all of the drives were purchased at the same time, have all been running in the same machine for the same amount of time, and have all been accessed around the same number of times, your chances of a second drive failing during a rebuild are very high. Especially if you are getting close to the hard drives' MTBF.
My thoughts exactly. Which is why I'm concerned.
I've been watching this thread for a while and it definitely seems to me that you are looking for an immediate backup to prevent data loss.
RAID is not a backup method, it is a method of storing data. You are going to have a drive failure, and recovering from it as quickly as possible is the key.
From what I saw he was aiming to set up a raid as robust as possible. The goal was rather to do as much as possible to keep the raid (and thus the services using the raid) up and avoid downtime and data loss.
If you need the backup you lose the data that changed after the backup until the time of the crash. You also have a downtime for your services.
I naturally assume that the data will be backed up, but that will only be the last resort. So the focus of the discussion is rather how to set up a raid as robust as possible with the available hardware.
I only caught the last few emails of this. Did he mention if this was a SCSI setup or fiber setup?
I have been researching this same thing for the last few months. We want to offer the most robust configuration but also keep as much performance as possible. So far in my testing a RAID51 has done just that. It is 2 RAID5's mirrored. Now this does use a lot of disk space, and writes are not all that great, but the reads appear to be pretty darn fast.
I have 2 enclosures with 16 drives in each. I do a RAID5 on each enclosure with 1 hot spare, then RAID1 those.
Brad Dameron
SeaTab Software
www.seatab.com
Brad Dameron wrote:
I only caught the last few emails of this. Did he mention if this was a SCSI setup or fiber setup?
It's a fibre setup. That is, the disks in the array are all SCSI, but the controllers connect over FC-AL. /Per Jessen, Zürich
Sandy Drobic wrote:
I naturally assume that the data will be backed up, but that will only be the last resort. So the focus of the discussion is rather how to set up a raid as robust as possible with the available hardware.
Exactly! I couldn't have summed it up any better. /Per Jessen, Zürich
On 3/29/06 11:29 AM, "Hylton Conacher(ZR1HPC)" wrote:
Start with 4 disks, 3 in RAID 5 and the 4th as the mirror drive of the data on the RAID array. A month down the line take out one of the RAID drives, as though it had failed, and replace it with a new one; store the 'faulty' disk for use later, the same way that a backup tape is stored before its MTBF point.
The following month buy another disk and again replace one of the RAID disks, i.e. the mirrored one, with the new one. Again store the 'failed' drive as before.
On month three 'fail' one of the drives and replace it with the 'failed' drive you took out in month 1.
And so on, etc. etc.
The key is rotation, and to back up from the mirrored drive as it will be the most up-to-date copy of the data sitting on the RAID array.
This is the only way I can see that your data will be safe from dual RAID drive failures. If that were to happen you could rebuild the RAID array from the mirrored drive, as it is a copy of the data on the RAID array. If that too fails, as it will without rotation, you will have to resort to your backup media.
This is similar to the method we use with our main file share...(OSXS) and it works very well. You just have to remember to swap the drives at the specified time. You also have to make sure all your data will fit on one physical drive.
We actually do this with three external "swap" drives: one hanging off the server, one in the firebox, and one at the home office in a firebox. They get rotated or swapped each week on Fridays. Just be prepared for some long "rebuild the raid" times depending on the amount of data. This is why we do it on Fridays. No one wants to work on Friday nights... ;) We have about 200 GB.
We also then dup to another box in the office and stream to tape from that. Three sets of tape, and archive to DVD for individual projects. You can never have too many "backups".
--
Thanks,
George
"They that would give up essential liberty for temporary safety deserve neither liberty nor safety." Benjamin Franklin
On Friday 24 March 2006 8:17 am, Per Jessen wrote:
All,
probably a little OT here, but I _am_ using SUSE Linux :-)
I'm trying to work out the optimal (or near-optimal at least) RAID configuration of a 24-disk array. The array comes with two redundant RAID controllers each with six SCSI channels. These controllers allow all kinds of RAID0/1/5 configurations, but no RAID6.
I want the array to be able to survive a two disk failure, so RAID6 would be the obvious choice, but ...
Depends. Is the data read or write intensive? What performance parameters are you trying to achieve? What budget are you trying to meet? RAID5/6 is slow on writes. RAID10 (RAID 1+0 - mirrored pairs striped together) with hot spares would be my idea of surviving multiple drive failures at any time and maintaining decent performance for reads and writes.
So I'm sort of looking at choosing between -
- using plain software RAID6 and ignoring the hardware RAID facilities of the array.
- using the hardware controllers to build a combination of RAID0/1/5 that'll give me the two-drive-failure survivability.
Will these controllers be able to talk to each other in a redundant fashion? At the hardware/firmware/driver level?
I've had a look around the web, but googling for "two drive failure RAID" almost always leads to someone talking about RAID6 ...
So, opinions/suggestions?
Give us more info about the specific hardware and the 'application' you are trying to support. I look to eliminate complexity from a system like this. What is the easiest way to monitor, replace and rebuild the hardware/drives and data you are trying to protect? Redundant controllers are nice but if you can't hot-swap a failed one in a running machine can the 'application' afford the downtime to do a replace and rebuild?
/Per Jessen, Zürich
Stan
S Glasoe wrote:
I want the array to be able to survive a two disk failure, so RAID6 would be the obvious choice, but ...
Depends. Is the data read or write intensive? What performance parameters are you trying to achieve?
Data is generally write intensive, but reliability is more important than performance. I understand that RAID6 is slow(er) for writes.
What budget are you trying to meet? RAID5/6 is slow on writes. RAID10 (RAID 1+0 - mirrored pairs striped together) with hot spares would be my idea of surviving multiple drive failures at any time and maintaining decent performance for reads and writes.
But can a RAID10 setup take a two drive failure?
Will these controllers be able to talk to each other in a redundant fashion? At the hardware/firmware/driver level?
At the hardware level - the redundancy is transparent to the host-system.
Give us more info about the specific hardware and the 'application' you are trying to support.
This is an HP/Compaq Storageworks array, connected over FC-AL to two servers. The application is primarily email-storage, probably divided into 90% write, 10% reads.
I look to eliminate complexity from a system like this. What is the easiest way to monitor, replace and rebuild the hardware/drives and data you are trying to protect?
The array controllers take care of all that, except the monitoring.
Redundant controllers are nice but if you can't hot-swap a failed one in a running machine can the 'application' afford the downtime to do a replace and rebuild?
A controller can also be hot-swapped - AFAIK. I'd have to double-check, but I'm pretty certain. Any reason why you haven't discussed RAID6 at all? I haven't looked at it much myself, but it sounds like a pretty good option. /Per Jessen, Zürich
On Fri, 2006-03-24 at 15:17 +0100, Per Jessen wrote:
All,
probably a little OT here, but I _am_ using SUSE Linux :-)
I'm trying to work out the optimal (or near-optimal at least) RAID configuration of a 24-disk array. The array comes with two redundant RAID controllers each with six SCSI channels. These controllers allow all kinds of RAID0/1/5 configurations, but no RAID6.
I want the array to be able to survive a two disk failure, so RAID6 would be the obvious choice, but ...
So I'm sort of looking at choosing between -
- using plain software RAID6 and ignoring the hardware RAID facilities of the array.
- using the hardware controllers to build a combination of RAID0/1/5 that'll give me the two-drive-failure survivability.
I've had a look around the web, but googling for "two drive failure RAID" almost always leads to someone talking about RAID6 ...
So, opinions/suggestions?
Search around some more for an array controller that provides raid 6; it is the only way you will get "two drive failure" survivability. And make sure -all- of the drives are hot-swappable. Employ alert notification using SNMP for when a drive fails, and have spares on hand at all times. Also look at: http://www.acnc.com/04_01_06.html for more info.
--
Ken Schneider
UNIX since 1989, linux since 1994, SuSE since 1998
Ken Schneider wrote:
On Fri, 2006-03-24 at 15:17 +0100, Per Jessen wrote:
All,
probably a little OT here, but I _am_ using SUSE Linux :-)
I'm trying to work out the optimal (or near-optimal at least) RAID configuration of a 24-disk array. The array comes with two redundant RAID controllers each with six SCSI channels. These controllers allow all kinds of RAID0/1/5 configurations, but no RAID6.
I want the array to be able to survive a two disk failure, so RAID6 would be the obvious choice, but ...
So I'm sort of looking at choosing between -
- using plain software RAID6 and ignoring the hardware RAID facilities of the array.
- using the hardware controllers to build a combination of RAID0/1/5 that'll give me the two-drive-failure survivability.
I've had a look around the web, but googling for "two drive failure RAID" almost always leads to someone talking about RAID6 ...
So, opinions/suggestions?
Search around some more for an array controller that provides raid 6; it is the only way you will get "two drive failure" survivability. And make sure -all- of the drives are hot-swappable. Employ alert notification using SNMP for when a drive fails, and have spares on hand at all times. Also look at: http://www.acnc.com/04_01_06.html for more info.
Actually, if you set up RAID 10 correctly (striped mirror sets) then you can have half of your drives fail, so long as you don't have two in the same mirror set go at once.
Going with RAID 5 or 6 incurs a write performance penalty which may or may not be tolerated by the app. This can be overcome by a good amount of battery-backed cache on the front end though. Essentially, you have performance, reliability and cost as concerns: pick any two, and those will drive your decisions.
- Herman
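The striped-mirror-set layout Herman describes is a single level in Linux software RAID - a sketch, with hypothetical device names:

  # 20 drives as striped mirror pairs (md's near-2 layout, the default),
  # plus 2 hot spares; survives one failure per mirror pair:
  mdadm --create /dev/md0 --level=10 --raid-devices=20 --spare-devices=2 /dev/sd[b-w]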
Ken Schneider wrote:
Search around some more for an array controller that provides raid 6; it is the only way you will get "two drive failure" survivability. And make sure -all- of the drives are hot-swappable. Employ alert notification using SNMP for when a drive fails, and have spares on hand at all times.
Changing the controller is not an option - and although the firmware is upgradeable and patchable, I don't think HP is likely to add RAID6 at this time. All the drives are hot-swappable, and the array will automatically use hot spares.
Herman Knief wrote:
Actually, if you set up RAID 10 correctly (striped mirror sets) then you can have half of your drives fail, so long as you don't have two in the same mirror set go at once.
Exactly. It's that bit of "so long as" I don't like :-)
Going with RAID 5 or 6 incurs a write performance penalty which may or may not be tolerated by the app. This can be overcome by a good amount of battery-backed cache on the front end though. Essentially, you have performance, reliability and cost as concerns: pick any two, and those will drive your decisions.
Reliability & cost are the two main criteria right now - and in that order. /Per Jessen, Zürich
On Fri, 24 Mar 2006, Per Jessen wrote:
All,
probably a little OT here, but I _am_ using SUSE Linux :-)
I'm trying to work out the optimal (or near-optimal at least) RAID configuration of a 24-disk array. The array comes with two redundant RAID controllers each with six SCSI channels. These controllers allow all kinds of RAID0/1/5 configurations, but no RAID6.
I want the array to be able to survive a two disk failure, so RAID6 would be the obvious choice, but ...
So I'm sort of looking at choosing between -
- using plain software RAID6 and ignoring the hardware RAID facilities of the array.
- using the hardware controllers to build a combination of RAID0/1/5 that'll give me the two-drive-failure survivability.
I've had a look around the web, but googling for "two drive failure RAID" almost always leads to someone talking about RAID6 ...
So, opinions/suggestions?
Go here:
http://en.wikipedia.org/wiki/RAID5
It's really very good.
Remember with raid levels you read them from left-to-right, so raid 10 is "raid 1 and then raid 0", meaning N arrays of raid1 which are combined to form a raid0. So raid 50 would be N raid5 arrays combined together in a (single) raid0.
I prefer Linux's software raid to most hardware raid - I've experienced
several problems with hardware raid but not with software raid, and the
talk about hardware raid being faster isn't backed up with statistics or
facts.
Raid is always a trade-off between space and reliability (and to a lesser extent, speed) - these days a very modest machine can easily do software raid5 calculations faster than the hardware can deliver the data (on my Duron 750 the raid5 checksum speed is 2.7GB/s). Choose your tiering such that you can tolerate multiple drive failures at multiple levels for maximum redundancy.
Raid 55 using only 9 disks, for example, gives you 3 raid5's combined to form another raid5.
You can lose /one drive/ from each of two of the three lower-level raid5's *and* /all/ of the drives from the 3rd lower-level raid5 and still be OK.
Use raid5 with more than 3 disks and you start to raise your risk factor a bit, because 3 disks minimises the number of disks you have and therefore the number that are likely to fail. Adding more disks raises the likelihood of failure /without/ reducing the ability to /tolerate/ failure - adding more disks to a raid5 only adds space.
With 24 disks you could go with a fairly complicated but almost
unbelievably robust setup, and in fact with 24 disks if you can afford
the space you can mix some mirroring up in there.
--
Carpe diem - Seize the day.
Carp in denim - There's a fish in my pants!
Jon Nelson
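Jon's 9-disk raid 55 example maps onto nested md arrays - a sketch, device names assumed:

  # three 3-disk raid5's, each with 2 disks' net capacity ...
  mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sd[bcd]
  mdadm --create /dev/md2 --level=5 --raid-devices=3 /dev/sd[efg]
  mdadm --create /dev/md3 --level=5 --raid-devices=3 /dev/sd[hij]
  # ... then raid5 across those three arrays: 9 disks, 4 disks' net capacity
  mdadm --create /dev/md4 --level=5 --raid-devices=3 /dev/md1 /dev/md2 /dev/md3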
Jon Nelson wrote:
http://en.wikipedia.org/wiki/RAID5 It's really very good.
Hehe, I know - been there already. It was very informative, but I didn't see much on the two drive failure situation.
I prefer Linux's software raid to most hardware raid - I've experienced several problems with hardware raid but not with software raid, and the talk about hardware raid being faster isn't backed up with statistics or facts.
I like Linux' software RAID too - I use RAID1 all over the place. The one thing hardware RAID (with battery backed cache) does is plug that one little hole that software RAID cannot. I can't quite remember what it is, but AFAIR, in case of a power loss, there is a small risk of screwing up a software RAID5 array.
The reason I'm looking at the hardware RAID using this array is simply cost and ease-of-use - these 2nd hand Compaq Storageworks arrays are almost being given away and they're hot-plug everything. Get one of these, and perhaps a 2nd one for spareparts and you're set storagewise for quite some time.
With 24 disks you could go with a fairly complicated but almost unbelievably robust setup, and in fact with 24 disks if you can afford the space you can mix some mirroring up in there.
I feel fairly certain mirroring will be part of it, yes. Space isn't too important - if I can make 1Tb out of it and still be able to survive a two-drive failure, I'll be quite happy. /Per Jessen, Zürich
On Friday 24 March 2006 9:52 am, Per Jessen wrote:
Jon Nelson wrote:
http://en.wikipedia.org/wiki/RAID5 It's really very good.
Hehe, I know - been there already. It was very informative, but I didn't see much on the two drive failure situation.
I prefer Linux's software raid to most hardware raid - I've experienced several problems with hardware raid but not with software raid, and the talk about hardware raid being faster isn't backed up with statistics or facts.
I like Linux' software RAID too - I use RAID1 all over the place. The one thing hardware RAID (with battery backed cache) does is plug that one little hole that software RAID cannot. I can't quite remember what it is, but AFAIR, in case of a power loss, there is a small risk of screwing up a software RAID5 array.
The reason I'm looking at the hardware RAID using this array is simply cost and ease-of-use - these 2nd hand Compaq Storageworks arrays are almost being given away and they're hot-plug everything. Get one of these, and perhaps a 2nd one for spareparts and you're set storagewise for quite some time.
With 24 disks you could go with a fairly complicated but almost unbelievably robust setup, and in fact with 24 disks if you can afford the space you can mix some mirroring up in there.
I feel fairly certain mirroring will be part of it, yes. Space isn't too important - if I can make 1Tb out of it and still be able to survive a two-drive failure, I'll be quite happy.
/Per Jessen, Zürich
Go with Jon's suggestions. The only data loss you'll suffer with software RAID is if all power is lost at once and the data that is "in flight" to the drive disappears. The hardware way, with battery backup on controllers, will eliminate that risk. The best of both worlds is a hardware RAID setup with software RAID layered on top - that minimizes all those risks!
Stan
On Friday 24 March 2006 11:42 am, S Glasoe wrote: <SNIP>
Go with Jon's suggestions. The only data loss you'll suffer with software RAID is if all power is lost at once and the data that is "in flight" to the drive disappears. The hardware way, with battery backup on controllers, will eliminate that risk. The best of both worlds is a hardware RAID setup with software RAID layered on top - that minimizes all those risks!
Stan
Perhaps two hardware RAID 5 arrays with hot spares and a software mirror of the two. -- Louis Richards
Louis Richards wrote:
Perhaps two hardware RAID 5 arrays with hot spares and a software mirror of the two.
Yeah, that seems to be a good option. Unfortunately the Storageworks controller does not support mirrors built on top of RAID5 sets, although it does support striping across multiple RAID5s. Still, software mirroring of two hardware RAID5s sounds like a pretty good way to go. /Per Jessen, Zürich
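The software half of that combination is a one-liner - a sketch, assuming the controller presents the two hardware RAID5 sets as the (hypothetical) logical disks /dev/sda and /dev/sdb:

  # software RAID1 across two hardware RAID5 logical disks ("RAID51")
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb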
On Sunday 26 March 2006 1:13 pm, Per Jessen wrote:
Louis Richards wrote:
Perhaps two hardware RAID 5 arrays with hot spares and a software mirror of the two.
Yeah, that seems to be a good option. Unfortunately the Storageworks controller does not support mirrors built on top of RAID5 sets,
Drop the parity on the RAID5 and you have a RAID10. What's the point of wasting the time calculating parity? You lose space with RAID5 mirrored to another RAID5 versus same number of drives in a RAID10. Same number of points of failure also. RAID10 baby!
although it does support striping across multiple RAID5s.
RAID0 issues too. Any one RAID5 array goes down = no data. Are you thinking a RAID5 array is all on the same controller channel or are you thinking of spreading a RAID5 array across multiple channels on the same controller? Better performance spread across channels. Then if you RAID0 (via software RAID or via controller redundancy?) those RAID5 arrays is there a single point of failure?
Still, software mirroring of two hardware RAID5s sounds like a pretty good way to go.
Across controllers?
/Per Jessen, Zürich
Stan
S Glasoe wrote:
Yeah, that seems to be a good option. Unfortunately the Storageworks controller does not support mirrors built on top of RAID5 sets,
Drop the parity on the RAID5 and you have a RAID10. What's the point of wasting the time calculating parity? You lose space with RAID5 mirrored to another RAID5 versus same number of drives in a RAID10. Same number of points of failure also. RAID10 baby!
But with a RAID15 (two mirrored RAID5s) I can survive a two-disk failure - one in each RAID5 or two in one RAID5. That doesn't quite work with a RAID10 array. It'll take any number of failures in any one stripe, but if two failures occur one in each stripe, I'm dead. Or am I missing something?
RAID0 issues too. Any one RAID5 array goes down = no data. Are you thinking a RAID5 array is all on the same controller channel or are you thinking of spreading a RAID5 array across multiple channels on the same controller?
There are two controllers, each with six SCSI channels, but only one controller will be visible at any one time.
Still, software mirroring of two hardware RAID5s sounds like a pretty good way to go.
Across controllers?
No, it would be on the same controller. /Per Jessen, Zürich
Quoting Per Jessen
S Glasoe wrote:
Yeah, that seems to be a good option. Unfortunately the Storageworks controller does not support mirrors built on top of RAID5 sets,
There are two controllers, each with six SCSI channels, but only one controller will be visible at any one time.
Still, software mirroring of two hardware RAID5s sounds like a pretty good way to go.
Across controllers?
No, it would be on the same controller.
FWIW, we have seen 5-disk RAID5 arrays run with two failed disks.
So, if you want maximum redundancy and have only one controller (with two channels), I'd set up one five-disk RAID5 array on each channel, with two hot spares (or one hot spare for each hardware RAID5 array if the controller does not support "floating" hot spares). Then you can use software RAID1 with the RAID5 logical disks.
Unless you are going to buy an IBM Shark or some other hardware storage device that has nested RAID arrays, that's about as good as you are going to get, IMHO.
Mark
--
L. Mark Stone
Reliable Networks of Maine, LLC
"We manage your network so you can manage your business"
477 Congress Street, Portland, ME 04101
Tel: (207) 772-5678
Web: http://www.rnome.com
On Sun, 26 Mar 2006, L. Mark Stone wrote:
Quoting Per Jessen:
S Glasoe wrote:
Yeah, that seems to be a good option. Unfortunately the Storageworks controller does not support mirrors built on top of RAID5 sets,
There are two controllers, each with six SCSI channels, but only one controller will be visible at any one time.
Still, software mirroring of two hardware RAID5s sounds like a pretty good way to go.
Across controllers?
No, it would be on the same controller.
FWIW, we have seen 5-disk RAID5 arrays run with two failed disks.
So, if you want maximum redundancy and have only one controller (with two channels), I'd set up one five-disk RAID5 array on each channel, with two hot spares (or one hot spare for each hardware RAID5 array if the controller does not support "floating" hot spares). Then you can use software RAID1 with the RAID5 logical disks.
Unless you are going to buy an IBM Shark or some other hardware storage device that has nested RAID arrays, that's about as good as you are going to get, IMHO.
5-disk raid5 arrays with 2 failed disks. I don't believe you.
Either you weren't using 5 "active" disks or it wasn't raid5. Somebody
show me how you can have a 5-disk raid5 with 2 failed disks (and remain
operational).
--
Carpe diem - Seize the day.
Carp in denim - There's a fish in my pants!
Jon Nelson
On Sun, 26 Mar 2006, Per Jessen wrote:
S Glasoe wrote:
Yeah, that seems to be a good option. Unfortunately the Storageworks controller does not support mirrors built on top of RAID5 sets,
Drop the parity on the RAID5 and you have a RAID10. What's the point of wasting the time calculating parity? You lose space with RAID5 mirrored to another RAID5 versus same number of drives in a RAID10. Same number of points of failure also. RAID10 baby!
But with a RAID15 (two mirrored RAID5s) I can survive a two-disk failure - one in each RAID5 or two in one RAID5. That doesn't quite work with a RAID10 array. It'll take any number of failures in any one stripe, but if two failures occur one in each stripe, I'm dead. Or am I missing something?
And that is what is nice about raid 55. Use the hardware (if you like) to build 3 raid5's. Then use software to raid5 those. You can suffer /all/ drives failing in one of the component raid5's *and* one drive in each of the other hardware raid5's. Or, you could make 12 raid1's of two drives each and build software raid 55 on top of that, or whatever.
--
Carpe diem - Seize the day.
Carp in denim - There's a fish in my pants!
Jon Nelson
Go with Jon's suggestions. The only data loss you'll suffer with software RAID is if all power is lost at once and the data that is "in flight" to the drive disappears. The hardware way, with battery backup on controllers, will eliminate that risk. The best of both worlds is a hardware RAID setup with software RAID layered on top - that minimizes all those risks!
Don't bother with RAID5; with today's storage prices it isn't worth the risks, complexity, and crappy performance. http://www.baarf.com/
Adam Tauno Williams wrote:
Go with Jon's suggestions. The only data loss you'll suffer with software RAID is if all power is lost at once and the data that is "in flight" to the drive disappears. The hardware way, with battery backup on controllers, will eliminate that risk. The best of both worlds is a hardware RAID setup with software RAID layered on top - that minimizes all those risks!
Don't bother with RAID5; with today's storage prices it isn't worth the risks, complexity, and crappy performance. http://www.baarf.com/
Yes, I've come across that site too, and also read a couple of the articles. I agree that disk space is now so cheap that mirroring is quite reasonable - except if I want to survive a two-disk failure. That would require three times the amount of disk space (instead of just twice the amount). /Per Jessen, Zürich
On 3/24/06, Per Jessen wrote:
All,
probably a little OT here, but I _am_ using SUSE Linux :-)
I'm trying to work out the optimal (or near-optimal at least) RAID configuration of a 24-disk array. The array comes with two redundant RAID controllers each with six SCSI channels. These controllers allow all kinds of RAID0/1/5 configurations, but no RAID6.
I want the array to be able to survive a two disk failure, so RAID6 would be the obvious choice, but ...
So I'm sort of looking at choosing between -
- using plain software RAID6 and ignoring the hardware RAID facilities of the array.
- using the hardware controllers to build a combination of RAID0/1/5 that'll give me the two-drive-failure survivability.
I've had a look around the web, but googling for "two drive failure RAID" almost always leads to someone talking about RAID6 ...
So, opinions/suggestions?
/Per Jessen, Zürich
Per,
I did not read all of the responses, but somewhere along the line you said it was an HP StorageWorks box. And some of the responses did not seem to understand RAID all that well. I'm certified to design StorageWorks RAID setups, so if you post details such as the array model number (HSZ80? HSG80?), # of controllers, # of shelves, # of disks per shelf, I can give you more feedback.
HP rarely recommends RAID 5 unless capacity/cost are your driving factors. Normally with StorageWorks boxes HP recommends RAID 10 for reliability and speed. So for you that would be 11 2-drive mirror sets all striped together, plus 2 hot spares. That can survive 11 failures, with at most one failure per mirror set. As you said somewhere, if you have 2 failures in the same mirror set, you're dead.
You said somewhere that the more drives, the more likelihood of a dual drive failure. True, but you always have 2 drives per mirror set, so the odds are always the same for a given mirror set. Adding drives simply increases the number of mirror sets you are striping against and causes a small increase in risk: risk = 11 * the odds of a 2-drive mirror set failure; storage capacity = 11 * one drive's capacity.
The next step up in reliability is to move to 3-disk mirror sets, i.e. each of the 3 disks has exactly the same info. This is still called RAID 10, but now you will have 7 3-drive mirror sets all striped together, with 3 hot spare drives. You can obviously survive a 2-drive failure. Risk = 7 * the odds of a 3-drive mirror set failure; storage capacity = 7 * one drive's capacity.
The next issue you have is a shelf failure. Although rare, they do happen. To address this, simply ensure that your mirrored drives are always on different shelves. That way even if you have 6 drives on a shelf and the shelf fails, you've only lost half of 6 mirror sets. That should be fairly straight-forward to accomplish. Also make sure your hot spares are on different shelves.
And for the truly paranoid, HP recommends software RAID 1 between 2 different StorageWorks boxes. Not very many people go to this extreme.
HTH
Greg
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century
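If the same triple-copy layout were ever done in software rather than in the HSG80, md's raid10 can keep three copies of every block directly - a sketch, device names assumed:

  # 7 x 3-way mirror sets striped together (21 drives, 3 copies of each block)
  # plus 3 hot spares; survives any two drive failures:
  mdadm --create /dev/md0 --level=10 --layout=n3 --raid-devices=21 --spare-devices=3 /dev/sd[b-y]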
Greg Freemyer wrote:
I did not read all of the responses, but somewhere along the line you said it was an HP StorageWorks box. And some of the responses did not seem to understand RAID all that well.
Hi Greg, yep, it's a StorageWorks box. Not sure what the exact model# really is.
I'm certified to design StorageWorks RAID setups, so if you post details such as the array model number (HSZ80? HSG80?), # of controllers, # of shelves, # of disks per shelf, I can give you more feedback.
Dual HSG80, 4 shelves, 6 disks per shelf.
The next step up in reliability is to move to 3-disk mirror sets, i.e. each of the 3 disks has exactly the same info. This is still called RAID 10, but now you will have 7 3-drive mirror sets all striped together, with 3 hot spare drives. You can obviously survive a 2-drive failure.
That's one option I've somehow completely managed to avoid ... thanks for mentioning it.
HTH Greg
Yes, it did, thank you very much Greg. Now that you're on-line anyway, here's something I've been pondering - I have a 2nd SWKS box, except this is an older model with dual HSZ70 controllers. I'm fairly certain one box will be plenty, but I can't help wondering whether this 2nd box could be hooked up to the first - sort of as a BA370 extension cabinet? I'm guessing I would need to remove the controllers. /Per Jessen, Zürich
On 3/29/06, Per Jessen wrote:
Greg Freemyer wrote:
I did not read all of the responses, but somewhere along the line you said it was an HP StorageWorks box. And some of the responses did not seem to understand RAID all that well.
Hi Greg,
yep, it's a StorageWorks box. Not sure what the exact model# really is.
I'm certified to design StorageWorks RAID setups, so if you post details such as the array model number (HSZ80? HSG80?), # of controllers, # of shelves, # of disks per shelf, I can give you more feedback.
Dual HSG80, 4 shelves, 6 disks per shelf.
The next step up in reliability is to move to 3-disk mirror sets, i.e. each of the 3 disks has exactly the same info. This is still called RAID 10, but now you will have 7 3-drive mirror sets all striped together, with 3 hot spare drives. You can obviously survive a 2-drive failure.
That's one option I've somehow completely managed to avoid ... thanks for mentioning it.
HTH Greg
Yes, it did, thank you very much Greg. Now that you're on-line anyway, here's something I've been pondering - I have a 2nd SWKS box, except this is an older model with dual HSZ70 controllers. I'm fairly certain one box will be plenty, but I can't help wondering whether this 2nd box could be hooked up to the first - sort of as a BA370 extension cabinet? I'm guessing I would need to remove the controllers.
/Per Jessen, Zürich
I'm not positive, but I'm pretty sure the shelves for the HSZ70 unit should be compatible with the HSG80 unit. You should be able to check the shelves' model number. IIRC, an HSG80 controller can drive 6 shelves. You should just be able to plug the SCSI cables that currently run into the HSZ70 controller shelf into the back of the HSG80 controller shelf.
To be honest it has been a couple of years since I worked with the HSG80 line (they're out of production), but I think each shelf (or half shelf) is a single SCSI bus. Then the drive slots correspond with the SCSI IDs. You said you have 4 shelves that hold 6 drives each; I think these are split-bus shelves, i.e. you have 2 logical 6-slot shelves per physical shelf. If with the HSZ70 shelves you have more than 6 of the split-bus shelves, you should be able to buy a small PC card that connects to the back of the shelves and shorts out the 2 halves. Then each physical shelf holds 12 drives and counts as one of the 6 shelves you're allowed.
The more SCSI busses the better the performance, so you don't want to convert the split shelves to full shelves unless you can productively utilize the extra drive slots you gain. I.e. with split-bus shelves you can have 36 drives max (6 * 6); with full shelves you can have 72 drives max (12 * 6).
Also note that with a shelf failure, both of the split halves of a shelf can fail. So you want to be sure your mirror halves are not in the same physical shelf, not just not in the same logical shelf. Especially if you are considering RAID 5, you need to make sure each of the RAID members is in a physically separate shelf.
HTH
Greg
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century
At 12:17 AM 25/03/2006, Per Jessen wrote:
All,
probably a little OT here, but I _am_ using SUSE Linux :-)
I'm trying to work out the optimal (or near-optimal at least) RAID configuration of a 24-disk array. The array comes with two redundant RAID controllers each with six SCSI channels. These controllers allow all kinds of RAID0/1/5 configurations, but no RAID6.
I want the array to be able to survive a two disk failure, so RAID6 would be the obvious choice, but ...
So I'm sort of looking at choosing between -
- using plain software RAID6 and ignoring the hardware RAID facilities of the array.
- using the hardware controllers to build a combination of RAID0/1/5 that'll give me the two-drive-failure survivability.
I've had a look around the web, but googling for "two drive failure RAID" almost always leads to someone talking about RAID6 ...
So, opinions/suggestions?
/Per Jessen, Zürich
Mirrored hw raid5 with separate cards per array for I/O, and a dual-SCSI-port RAM card in a spare of the mirrored pair for superfast cross transfer. Talk to the Novell 4 folks (if any are still around) as they set some of these up at the last Brainshare we had in Australia. There was a Novell "whitepaper" on them, as the idea was new then.
participants (16)
- Adam Tauno Williams
- Brad Dameron
- Greg Freemyer
- Herman Knief
- Hylton Conacher(ZR1HPC)
- Jon Nelson
- Ken Gramm
- Ken Schneider
- L. Mark Stone
- Lorenzo Cerini
- Louis Richards
- Per Jessen
- S Glasoe
- Sandy Drobic
- scsijon
- suse_gasjr4wd@mac.com