[opensuse] Of software RAID on SUSE Linux
Hi all!

I've been pondering some options for how to set up my next SUSE system, and have come to the conclusion that software RAID would most likely serve my needs best. I've only had non-RAIDed, "regular" desktop systems until now, since the days of SuSE 8.0 or thereabouts, but have lately become fed up with the reliability issues that typical IDE/SATA drives suffer from and the stress test they pose to one's backup policies. So now that I've decided to have my next box assembled for me from components I select, I've also decided to discard the traditional "one or two primary disks + a backup disk" approach: no matter how often or how well I back up, if, or rather when, the disk holding the OS dies, there is an awful lot of trouble and work anyway.

I first looked at SCSI disks as a more reliable alternative -- if a disk can withstand server-level I/O for a few years, it sure as hell will stand up to anything I can throw at it on my humble desktop -- but there still exists a single point of failure unless I RAID-1 two decent-sized SCSI disks. This, with the controllers etc., would be very expensive, at least based on what I've seen thus far. The SCSI drives are also said to be noisy beasts -- a no-no for a desktop at which I will have to spend quite a few hours every day.

Next I read about RAID controllers and the possibility of attaching a few standard SATA disks to one, thus gaining fault tolerance for simple hardware failures (i.e. a disk dies). Of course, a RAID won't help against other kinds of data loss, such as the power supply's failing and killing other components with some mighty spike; or from fire, or flood, or what ever. That is OK. That's why I backup.

Next I learnt that decent RAID controllers are supposedly very expensive. Worse, some people have reported the controllers themselves breaking down unexpectedly, taking all the disks with them or, with luck, simply failing to function correctly and having to be replaced.
The hardware controlled RAID is, so I learnt, often incompatible with other controllers, requiring a failed controller to be replaced with one from the same manufacturer, or even with the exact same model, which may or may not still be readily available, keeping the system down until one is found. This seemed more difficult than I had anticipated.

However, based on what I last read about software RAID, it seems I need not make this more difficult than it has to be. I do not need performance, merely reliability. It seems Linux software RAID would be able to provide this. I'd like to know if I've understood this correctly, and whether the setup proposed below would work as expected on the to-be openSUSE 11.1 system. (I've also considered Enterprise Desktop for its longer support, but the base system is getting a tad old while the next version won't be out anytime soon, I'd think. Luckily the regular 11.1 will still include KDE 3.5.)

So, I would acquire, say, four disks for the system. Three of them would be "online" as a RAID-1 setup at any given time, together representing a single virtual disk with some number of partitions, seen by the system as a single block device. The fourth would sit as an online spare that, when the Linux RAID learns of a disk's failure, would join the array and be reconstructed accordingly. A fifth disk could be kept offline as a replacement for the failed disk once that is removed, thus becoming the new online spare, and so forth.

Or I could have just an offline spare disk that I would insert in the failed one's place during a power-off, after which I would partition it as necessary and have it become part of the RAID array and be reconstructed. And all would be well again and peace would reign.
If this is indeed true, and based on what I read in openSUSE's guide at http://en.opensuse.org/How_to_install_SUSE_Linux_on_software_RAID it should be, will it matter at all what kind of (SATA) disks I use for the array, as long as they all meet the minimum disk size dictated by the virtual disk they present? In other words, may I combine disks from different manufacturers, with different specs, etc. without affecting the RAID somehow negatively? I.e. the disks need not be identical; mixing would even be preferred, to avoid the unlikely-but-not-unheard-of situation where a defect is shared by a batch of drives from the same manufacturer, or where the drives wear down at the exact same speed and fail close to one another, etc.

If one disk is slower than the others, will it impose the speed limit onto the RAID virtual block device? Meaning that all disks will have to return some kind of "I/O done" before the Linux RAID will request further I/O? (Seems most logical for reliability.)

On the other hand, replacing disks later with larger capacity ones will cause no problem as they only need to meet the minimum size criterion. Hence, I could replace them one-by-one such that each new disk, replacing a removed old one, is first let to fully integrate/reconstruct; finally, after all disks have been replaced, I could enlarge a partition or two, for example, as allowed by the new shared minimum, thus increasing the capacity of the virtual device that is being RAIDed.

And there would, of course, be other regular backup routines in place, such as rsyncing the RAIDed block device to an external disk via USB/FireWire etc.

Also, I presume the kernel would not mind a non-RAIDed internal SATA disk(s) being part of the system as well. Would it be possible to have two separate RAID arrays? (as long as all the disk can be connected and there is space and power available...)
Is it really true that the system may still boot gracefully after the disk where the MBR was dies and is replaced during a power down? How exactly will this work -- is the MBR too mirrored on all disks?

To me this kind of setup seems now the most convenient one, especially with the guide available on how to install openSUSE with a software RAID and how to monitor it. I'd presume there would be no other benefits in going for SCSI over this except better performance? I also believe hardware RAID controllers would only improve performance, too, while possessing no other relevant factors affecting typical desktop use?

It seems I could get five or six 320GB SATA drives for the price of a single 146GB SCSI one, and I would need at least two of those for RAID-1 and a controller too, and extra capacity elsewhere. A lot of money, that is, with no improvement in reliability, it would seem, as three or four SATA drives in a RAID with spare disks at hand would seem quite enough, however unlucky I happened to be in picking the disks.

One more question: Will I still be able to access the SMART data of each disk that makes up the RAID with hdparm, even though the system would see them as a single block device? (Or so I presume, as they share the same mount points)

I'd like to hear if the general ideas and arguments made above are more or less correct. I'm not going to order my new system just yet -- it is still a few weeks away at minimum, as is 11.1 too, I think -- but I'd like to think these issues through well beforehand.

Thanks for all comments!

Regards,
Tero Pesonen

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
Tero Pesonen wrote:
I first looked at SCSI disks as a more reliable alternative -- if a disk can withstand server-level I/O for a few years, it sure as hell will stand up to anything I can throw at it on my humble desktop --
From a reliability point of view, the really important difference is that SCSI drives are built for 24-hours-a-day operation, whereas PATA/SATA drives are typically built for 8 hours a day. Look up the MTBF numbers and you'll see that the manufacturers make certain assumptions about the duty cycle.
In other words, may I combine disks from different manufacturers, with different specs, etc. without affecting the RAID somehow negatively?
Yes.
If one disk is slower than the others, will it impose the speed limit onto the RAID virtual block device?
I'm pretty certain it will, yes.
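A rough way to picture why the slowest mirror sets the pace for writes: a mirrored write is only finished once every member has stored it, so the array's write latency is the maximum of the members' latencies. The sketch below is a toy model with made-up latency numbers, not a measurement of Linux md. Reads need only one member, so they are not bound in the same way.

```python
def mirror_write_latency_ms(member_latencies_ms):
    """Toy model: a RAID-1 write completes when the slowest member is done."""
    if not member_latencies_ms:
        raise ValueError("a mirror needs at least one member")
    return max(member_latencies_ms)


def mirror_read_latency_ms(member_latencies_ms):
    """Toy model: a read needs only one member; best case is the fastest disk."""
    if not member_latencies_ms:
        raise ValueError("a mirror needs at least one member")
    return min(member_latencies_ms)


# Hypothetical per-disk latencies in milliseconds (illustrative values only).
latencies = [8.0, 12.5, 9.0]
print(mirror_write_latency_ms(latencies))  # 12.5
print(mirror_read_latency_ms(latencies))   # 8.0
```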
On the other hand, replacing disks later with larger capacity ones will cause no problem as they only need to meet the minimum size criterion. Hence, I could replace them one-by-one such that each new disk, replacing a removed old one, is first let to fully integrate/reconstruct; finally, after all disks have been replaced, I could enlarge a partition or two, for example, as allowed by the new shared minimum, thus increasing the capacity of the virtual device that is being RAIDed.
It might work, but if this is critical functionality, I'd test it first.
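The capacity side of this plan can be sketched in a few lines: the usable size of a mirror is bounded by its smallest member, so nothing is gained until the last old disk has been swapped out. This is only a model of that upper bound, with illustrative sizes. On a real system the array and the filesystem would still have to be grown explicitly afterwards, which is the part worth testing first.

```python
def raid1_capacity_gb(member_sizes_gb):
    """Upper bound on a mirror's usable size: its smallest member."""
    if not member_sizes_gb:
        raise ValueError("a mirror needs at least one member")
    return min(member_sizes_gb)


def replace_one_by_one(member_sizes_gb, new_size_gb):
    """Swap each member for a new disk in turn; record the bound after each swap."""
    sizes = list(member_sizes_gb)
    history = []
    for i in range(len(sizes)):
        sizes[i] = new_size_gb  # the old disk is removed, the new one resynced
        history.append(raid1_capacity_gb(sizes))
    return history


# Illustrative sizes: three 320 GB members replaced by 500 GB disks.
print(replace_one_by_one([320, 320, 320], 500))  # [320, 320, 500]
```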
Also, I presume the kernel would not mind a non-RAIDed internal SATA disk(s) being part of the system as well.
Correct.
Would it be possible to have two separate RAID arrays? (as long as all the disk can be connected and there is space and power available...)
Yes.
Is it really true that the system may still boot gracefully after the disk where the MBR was dies and is replaced during a power down? How exactly will this work -- is the MBR too mirrored on all disks?
Yes.
To me this kind of setup seems now the most convenient one, especially with the guide available on how to install openSUSE with a software RAID and how to monitor it. I'd presume there would be no other benefits in going for SCSI over this except better performance?
Reliability and better concurrency/performance.
I also believe hardware RAID controllers would only improve performance, too, while possessing no other relevant factors affecting typical desktop use?
Correct.
One more question: Will I still be able to access the SMART data of each disk that makes up the RAID with hdparm, even though the system would see them as a single block device? (Or so I presume, as they share the same mount points)
Yes.
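The member disks stay visible as ordinary block devices next to the md device, so they can be queried one by one. A minimal sketch, assuming the smartmontools package is installed: SMART attributes are normally read with smartctl on the whole disk rather than on the partition that belongs to the array. The device names are hypothetical, and the helper only builds the command lines without running them.

```python
import re


def parent_disk(partition):
    """Map a partition such as /dev/sdc5 to its whole disk, /dev/sdc."""
    match = re.fullmatch(r"(/dev/sd[a-z]+)\d*", partition)
    if match is None:
        raise ValueError(f"unexpected device name: {partition}")
    return match.group(1)


def smartctl_commands(member_partitions):
    """Build one 'smartctl -a <disk>' argument list per physical disk."""
    disks = sorted({parent_disk(p) for p in member_partitions})
    return [["smartctl", "-a", disk] for disk in disks]


# Hypothetical array members; print the commands instead of executing them.
for argv in smartctl_commands(["/dev/sdc5", "/dev/sdd5"]):
    print(" ".join(argv))
```

The printed commands would need root privileges when actually run.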
I'd like to hear if the general ideas and arguments made above are more or less correct. I'm not going to order my new system just yet -- it is still a few weeks away at minimum, as is 11.1 too, I think -- but I'd like to think these issues through well beforehand.
I've only got 2 x 300GB SATA drives in RAID1 in my workstation, but otherwise it sounds pretty much like what you're planning.

/Per

--
/Per Jessen, Zürich
On Thu, Nov 27, 2008 at 5:50 PM, Tero Pesonen <mlist-suse@tpesonen.net> wrote:
Of course, a RAID won't help against other kinds of data loss, such as the power supply's failing and killing other components with some mighty spike; or from fire, or flood, or what ever. That is OK. That's why I backup.
Are you using a UPS?
In other words, may I combine disks from different manufacturers, with different specs, etc. without affecting the RAID somehow negatively?
It is OK to mix manufacturers and capacities. It is good to mix discs with different manufacture dates. This makes it less likely that the discs will fail at the same time. But I would not mix buses, spindle speed or cache size because I would not want to thrash one disc while leaving the other one idle. So, while you can mix discs, I bought an identical pair of hard discs for my RAID 1. Using a pair of identical discs takes the guesswork out of building a RAID. And I am satisfied with the result.
On the other hand, replacing disks later with larger capacity ones will cause no problem as they only need to meet the minimum size criterion. Hence, I could replace them one-by-one such that each new disk, replacing a removed old one, is first let to fully integrate/reconstruct; finally, after all disks have been replaced, I could enlarge a partition or two, for example, as allowed by the new shared minimum, thus increasing the capacity of the virtual device that is being RAIDed.
Remember to back up before experimenting. You might want to look at using LVM rather than expanding partitions.

My System

I have a three disc system. I have one disc for the OS. And I have a matched pair of Seagates in a RAID 1. The OS disc is a throwaway disc. In other words, if something happens to it, I can just buy a new disc, build a new install of openSuSE and be on my way. I have no personal data on that disc.

All of my data is on the RAID 1. And I have a separate mount point for my stuff that does not overlap with any standard Linux mount point such as /home or whatever. So the biggest risks to my data are a double hardware failure or operator error. I also have a removable hard drive with the same capacity as my RAID that I use for backing up.

Mike
Tero Pesonen wrote:
Hi all!
< really big snip >
Thanks for all comments!
Regards, Tero Pesonen
Tero,

Sorry for the late post. Software RAID is great. I have 6 openSuSE boxes spinning RAID1 right now and all are 'software' raid. (2 pure software RAID -- 'md raid'; 4 fake RAID [BIOS RAID] -- 'dm raid')(5 using SATA, 1 using ATA)

It is definitely the way to go. With 500GB SATA II 300 MB/s drives going for $50 nowadays, there is no reason not to set up a RAID for the added level of redundancy it provides. Just remember RAID does *not* replace backups.

There is no trick to setting up with raid. It sounds like you are going to do a fresh install, so just put your drives in the computer, put the install DVD in the drive and start the install as normal. When YaST proposes a partitioning scheme, do the following:

(1) choose expert settings;
(2) delete all the partitions that YaST proposed;
(3) on each of the discs you want to mirror, create the partitions and pick the option "[ ] Do Not Format" and set the partition type (file system ID) to "Linux RAID". Do this on all mirrored partitions;
(4) next choose the RAID button and Create. YaST will then show a list of all the partitions that you have created;
(5) next choose Add, and pick a partition from each drive that you will mirror, one at a time. When you choose Add after selecting a partition, you will then assign the filesystem type (Ext3, etc.) and the mount point. You will also notice that the first pair of partitions selected will be designated /dev/md0. Go through the same steps twice before moving on, for example once for /boot on sdc5 and once for /boot on sdd5. Now when you look at the screen full of partitions you will have /dev/md0 up top and, continuing with the example, /boot to the right of sdc5 and to the right of sdd5;
(6) click Finish and go to step (4) for each additional raid set you want to create. You will see the subsequent sets designated as /dev/md1, /dev/md2, etc.; and
(7) when you're done, just say OK or confirm like you normally would in the partitioner and move on to software selection.
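For anyone who prefers the command line, steps (3) to (5) above correspond roughly to a single mdadm invocation per mirror. This is a minimal sketch, assuming mdadm is installed. It only builds the argument list and runs nothing, since creating an array overwrites the member partitions. /dev/md0, sdc5 and sdd5 follow the example in the steps; the spare /dev/sde5 is hypothetical.

```python
def mdadm_create_argv(md_device, members, spares=()):
    """Build the argument list for creating a RAID-1 array with mdadm."""
    if len(members) < 2:
        raise ValueError("a mirror needs at least two active members")
    argv = [
        "mdadm", "--create", md_device,
        "--level=1",
        f"--raid-devices={len(members)}",
    ]
    if spares:
        argv.append(f"--spare-devices={len(spares)}")
    # Active members come first on the command line, then any hot spares.
    return argv + list(members) + list(spares)


# The /boot mirror from the example, without and with a hot spare.
print(" ".join(mdadm_create_argv("/dev/md0", ["/dev/sdc5", "/dev/sdd5"])))
print(" ".join(mdadm_create_argv("/dev/md0", ["/dev/sdc5", "/dev/sdd5"],
                                 spares=["/dev/sde5"])))
```

The filesystem and mount point that YaST assigns in step (5) would still have to be set up separately on the resulting md device.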
The same process applies to adding new drives and raid sets to an existing install. When it is time for the first boot, everything should work fine. However, if it fails to boot and you get a grub error like GRUB ERROR 17, just remember *DO NOT PANIC*. It is usually something simple like a grub menu.lst entry, or for some reason you may need to do a grub-install /dev/(proper device). On the 6 installs I currently have, I have probably installed the raid setups 10 times. Out of the ten, I have had boot failures probably 3-4 times that took adjustments.

Also, if you are using the BIOS raid, search through the BIOS settings and make sure the /boot or / (if you have no /boot) arrays are *bootable*. The setting can be hard to find sometimes, but if you have problems, double-check this.

Do not worry about the 24/7 running of drives. Drives commonly have about 700,000 hours MTBF. That's 79.9 years. My experience has been that drives either fail in the first week, or they last a long time. I had one old IBM Deskstar 40G drive that ran for 7 years 24/7 (it still runs, but I don't use it). During those 7 years I know I didn't boot the machine more than 15 times. (Setup, kernel updates and physically moving the box from one office to the next were the only times it ever got rebooted.)

Good luck, if you get stuck -- write back.

--
David C. Rankin, J.D.,P.E. | openSoftware und SystemEntwicklung
Rankin Law Firm, PLLC | Countdown for openSuSE 11.1
www.rankinlawfirm.com | http://counter.opensuse.org/11.1/small
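A quick check of the MTBF arithmetic above, using 24 x 365 = 8,760 hours per year. The second function goes one step further and is an assumption not made in the thread: it treats failures as arriving at a constant rate (an exponential model), which gives a rough yearly failure probability per drive. Drives that die in their first week clearly do not fit that model, so take it as a ballpark only.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours, leap years ignored


def mtbf_years(mtbf_hours):
    """Convert an MTBF given in hours to years of continuous operation."""
    return mtbf_hours / HOURS_PER_YEAR


def annual_failure_probability(mtbf_hours):
    """Chance that a drive fails within a year of 24/7 use.

    Assumes a constant failure rate (exponential model), which is a
    simplification and not something the MTBF figure itself guarantees.
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


print(round(mtbf_years(700_000), 1))                        # 79.9
print(round(100 * annual_failure_probability(700_000), 2))  # 1.24 (percent)
```

Read this way, the figure says less about one drive lasting 80 years than about roughly one drive in eighty failing per year across a large population.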
On Monday 08 December 2008, you wrote:
< snip >
The approach you presented seems to conform exactly to what has been suggested elsewhere, too, when going with a pure software RAID for a fresh SUSE installation (or when adding new drives). I read that GRUB may need some adjusting, as you also said. That's OK. I'm quite confident this will work for me, too.

Thanks for your nice round-up!

Regards,
Tero Pesonen
participants (4)
- David C. Rankin
- Michael Mientus
- Per Jessen
- Tero Pesonen