[opensuse] Raid, Raid6, what file system for good fault tolerance?
Hi list, speaking about software raid, not hardware controller based.

I am trying to set up a local openSUSE machine and add some storage to it. I was considering RAID 6, and now that I read about it a bit, people left and right are scaremongering that the larger the disks (double-digit terabyte capacities these days), the more likely it is that further errors will occur during a RAID reconstruction.

I would absolutely like to keep my data consistent, and I am not thinking about double-digit terabytes either; I would stick to 2TB or 4TB disks. With RAID 6 that's at least 4 physical drives. Now I am wondering if it is possible to use a good robust file system that can add some more parity or check blocks or redundancy on top of the physical disks, to absolutely always be able to read my data.

I can't add multiple machines or high-availability stuff like clusters and whatnot; I read about DRBD (Distributed Replicated Block Device), but maybe I am just too scared by those technical terms, or I consider myself a simpleton and want to keep it rather simple. My use case is also not constant availability: when a disk needs to be replaced, so be it, but I don't want to lose my data or become unable to ever read certain parts of it again.

The thing that came to my mind was: is there some file system that would add redundancy and robustness on top of the mdraid system of the Linux kernel? Anyone with some useful insights?

Roughly speaking, I was considering a simple PCIe eSATA controller card and an external enclosure with an eSATA port and a port multiplier inside, that can present at least 4 physical disks as JBOD (just a bunch of disks), so that Linux can see them all separately. Speed and rebuild times are not my concern, but data persistence and data integrity are. Not even the number of physical disks; I could live with even one of those 8-bay enclosures that are on the market.

Thanks for any help and hints.
On Wednesday 2017-12-13 15:29, cagsm wrote:
[...]
I have very good experience using RAID 10 for more than 15 years at low cost. Never had data loss. Any journaling filesystem is good; in the past I also used reiserfs, but had the most problems with it. These eSATA enclosures are quite cheap and handy: 4 disks per enclosure, per eSATA connector. The HDDs I use are targeted at video or server appliances (24/7 operation, very high MTBF). Most important: all(!) HDDs should have the same geometry, preferably the same model and the same firmware revision. Keep invoices for RMAs; over the years I have had around 30 RMAs.

RAID 10 is very robust. Additionally I always have a hot spare, so up to 3 out of 5 disks could fail under perfect circumstances. I think the CPU load is much lower than with RAID 5 or 6. An alternative to mdadm software raid could be btrfs raid, but I only have experience with its RAID 1, and it seems to lack hot-spare support. Most SATA port multipliers also allow software SMART monitoring. My most recently bought (internal) multipliers also support real hardware raid, but I have not switched yet (maybe at the next HDD upgrade, to give it a test).

For scalability I also use LVM. Currently I am using ext4 and XFS filesystems. Never had problems at all on the software side. Problematic was cheap SATA cabling (bad shielding, bad connectors). Very important, besides monitoring pending sectors of course, is to monitor UDMA_CRC_Error_Count and Load_Cycle_Count. The first is an indicator of electromechanical problems; the second, of the power supply - those degrade over time, so I am also monitoring voltages via ACPI.

But most important, still: backups. A cheap solution is a consumer-level big HDD. I do daily incremental backups, using two HDDs, which are exchanged regularly (currently weekly); the unused one is stored in a fire-safe place. Best would be a modern LTO-6 or LTO-7 tape drive with loader... but that's unaffordable at the moment.

Just my experience and tips,
Paul
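For illustration, a spot check of the SMART attributes mentioned above could look like the following minimal sketch (the device name /dev/sda is an assumption, and smartmontools must be installed):

  # Print all vendor SMART attributes for one drive
  smartctl -A /dev/sda
  # Narrow the output to the counters discussed above, plus pending sectors
  smartctl -A /dev/sda | grep -E 'UDMA_CRC_Error_Count|Load_Cycle_Count|Current_Pending_Sector'

A rising UDMA_CRC_Error_Count usually points at cabling or connectors rather than the platters themselves.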
On Wednesday, 2017-12-13 at 16:06 +0100, Paul Neuwirth wrote:
[...] These eSATA enclosures are quite cheap and handy: 4 disks per enclosure, per eSATA connector.
A single eSATA connector for a box holding 4 disks? Sounds interesting. Do you have a link to a sample of such enclosure?

--
Cheers,
Carlos E. R. (from openSUSE 42.2 x86_64 "Malachite" at Telcontar)
On Wed, Dec 13, 2017 at 6:34 PM, Carlos E. R. <robin.listas@telefonica.net> wrote:
A single eSATA connector for a box holding 4 disks? Sounds interesting. Do you have a link to a sample of such enclosure?
In or via Europe, for example, there is the Fantec brand. They have e.g. an 8-bay enclosure with USB3 and eSATA via a port-multiplier backplane or something; look for FANTEC QB-X8US3-6G on amazon.com / ebay.com. There is also a 4-bay case by them.

I am not in it for the speed, so software-based mdraid with a single eSATA link and a port multiplier would suffice for me; I am not after high-speed reads or writes, but I am looking for a robust and resilient setup which is not too complicated. Meanwhile I have read about all sorts of problems with the built-in btrfs raid, so I guess I will stay away from that, and ZFS is also kind of exotic for openSUSE? Guess it will be some RAID 10 or RAID 6 then, via mdadm.
On Wednesday 2017-12-13 18:48, cagsm wrote:

[...] I am looking for a robust and resilient setup which is not too complicated.

Robustness / 100% availability: what I also recommend is as much redundancy as possible. E.g. on one server I have 2 SATA controllers and 12 3.5" hard disks in total, using these backplanes to keep the HDDs hot-swappable: http://www.chieftec.com/backplane_CBP.html. Each mirror is on its own controller. In the past I had a controller fail twice: once I think it was bad cabling (of one HDD, but it affected the whole controller), the other time a bug in the kernel module. With the RAID 10 setup half of the disks failed, but the array kept running. A redundant power supply will be the next step.
On Wednesday 2017-12-13 18:34, Carlos E. R. wrote:
On Wednesday, 2017-12-13 at 16:06 +0100, Paul Neuwirth wrote:
[...] These eSATA enclosures are quite cheap and handy: 4 disks per enclosure, per eSATA connector.
A single eSATA connector for a box holding 4 disks? Sounds interesting. Do you have a link to a sample of such enclosure?
This is what I would recommend: https://www.ebay.com/p/Sans-Digital-TowerRAID-Tr5m-b-5-Bay-SATA-to-eSATA-Har... Mine are a bit older, cost around $120 and hold only 4 HDDs; the weak point is the integrated power supply, which fails after 2-3 years of use ($50 OEM replacement). Of course you need a SATA controller with PMP (port multiplier) compatibility. There's a good list here: https://ata.wiki.kernel.org/index.php/SATA_hardware_features - older and cheap chips often lack the support.
On Thursday 2017-12-14 03:20, John Andersen wrote:
On 12/13/2017 09:34 AM, Carlos E. R. wrote:
A single eSATA connector for a box holding 4 disks? Sounds interesting.
Sounds like a bottleneck.

It is indeed. But the OP did not target speed. Speed mostly depends on the controller and the port multiplier, and on whether CBS (command-based switching) or FBS (FIS-based switching) is used.
On Wed, 13 Dec 2017 15:29:54 +0100 cagsm <cumandgets0mem00f@gmail.com> wrote:
[...]
Now I am wondering if it is possible to use a good robust file system that can add some more parity or check blocks or redundancy on top of the hardware disks, to absolutely always be able to read my data.
I would also recommend RAID 10 as an alternative. The key with RAID 6, or any RAID really, is to regularly scan-read (scrub) the disks so errors are found and corrected as soon as possible, and so disk failures don't appear in bunches. Also buy disks from different batches. As regards filesystems, I would choose XFS for integrity. Of course the other main insurance is to keep separate backups offline!

HTH, Dave
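For what a regular scan-read looks like in practice, here is a minimal sketch using the md sysfs interface (the array name md0 is an assumption; many distributions also ship a periodic check job for md arrays, so check yours before wiring up your own):

  # Kick off a consistency check (scrub) of the whole array
  echo check > /sys/block/md0/md/sync_action
  # Watch progress
  cat /proc/mdstat
  # After it finishes, see how many mismatched sectors were found
  cat /sys/block/md0/md/mismatch_cnt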
On Wednesday 2017-12-13 21:23, Dave Howorth wrote:
[...] Also buy disks from different batches.

Good point, thank you. One time I ordered a series of hard disks... which all failed in a row (only weeks, at most a few months apart). They were all replaced, but it was a lot of work. I then also switched brand. Excluding this incident I have seen the same failure rates for both brands I have used (Western Digital and Seagate; both using TDK technology I think, but I am not sure). In future I will order identical disks at different vendors.
On 12/13/2017 12:49 PM, Paul Neuwirth wrote:
In future I will order identical disks at different vendors.
The beauty of mdraid is that you don't have to have identical disks. They don't even have to be of the same interface type, though you'd probably want that for sanity's sake. Expect the array to use only as much space on each disk as the smallest disk in the raid provides.

--
After all is said and done, more is said than done.
On 14/12/17 02:26, John Andersen wrote:
On 12/13/2017 12:49 PM, Paul Neuwirth wrote:
In future I will order identical disks at different vendors.
The beauty of MDRaid is you don't have to have identical disks.
They don't even have to be of the same interface type, but you'd probably want that for sanity sake. You'd expect to use only as much on each disk as the smallest disk in the raid.
Talking about md, yes, that is true unless you're using one of the 0 raids. And it's actually quite common (or so it seems) for people to upgrade disk storage by replacing disks one by one.

Given the OP's scenario, yes, I would recommend raid-6 over four disks. That *will* survive a double disk failure, unlike raid-10, which has a 30% chance of losing the array. You MUST scrub your array regularly - but NEVER run a "repair" scrub! If scrub reports errors, there's a program called (I think) raid6check which will find and fix a broken block. A repair scrub just recalculates parity; although it's usually the parity that's broken, if it's a data block that's broken then your data has just been toasted :-(

As for integrity checking your data, yes, I'd be interested too... most file systems seem more interested in protecting their own integrity: metadata is checksummed but rarely data :-( By all means use btrfs on top of md-raid, but btrfs's own raid seems severely experimental. zfs-raid apparently works (I wouldn't know).

And yes, make sure your eSATA controller supports port multipliers - I've been looking for that sort of thing too, and a LOT of stuff out there doesn't.

Cheers, Wol
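To make the recommendation concrete, creating a four-disk raid-6 with mdadm might look like the sketch below (the device names /dev/sdb1-/dev/sde1 and the array name /dev/md0 are assumptions; XFS is used per Dave's suggestion elsewhere in the thread):

  # Four-disk RAID-6: capacity of two disks, survives any two disk failures
  mdadm --create /dev/md0 --level=6 --raid-devices=4 \
      /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
  # Persist the array definition so it assembles at boot
  mdadm --detail --scan >> /etc/mdadm.conf
  # Put a filesystem on it
  mkfs.xfs /dev/md0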
On Thu, 14 Dec 2017 14:08:20 +0000 Wols Lists <antlists@youngman.org.uk> wrote:
zfs-raid apparently works (I wouldn't know).
Yes, it does, though ZFS needs an SSD cache for metadata, as well as the spinning rust for the data, if you actually want performance.
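As a rough illustration of that layout, a pool with double parity plus an SSD cache device could be created like this (a sketch only; all device names are assumptions, and ZFS on Linux has to be installed separately on openSUSE):

  # RAID-Z2: double parity across four spinning disks,
  # plus an NVMe SSD as L2ARC read cache
  zpool create tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde \
      cache /dev/nvme0n1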
Thanks for all the replies in this thread. One final question, about file systems' resistance to block errors on a single physical disk: is there no mkfs parameter these days, when creating say an ext file system or any other non-complicated file system, that would, in its most simple form, write two bytes instead of one for every byte - or similar foolish ideas I can come up with just now? Two bytes consecutively, or two bytes even randomly placed on the physical disk (but then you would need some kind of look-up map or directory for that again, I guess). You get the idea: a filesystem tweak or fine-tuning that writes redundancy onto the disk for better block-error resiliency. Thanks for any hints and ideas.
On 21/12/17 19:17, cagsm wrote:
[...]
Don't bother?

At the disk level, bear in mind that a disk nowadays is a small computer in its own right. In the old (pre-ATA - that's your old parallel interface) days, the kernel (or rather the driver) would explicitly tell your drive which cylinder, head and sector to use. Now, in the days of LBA, stuff gets moved around to avoid bad spots, and the drive has all sorts of error correction built in. If it gives up, it's likely it has either hit a manufacturing defect, or your platters are beginning to disintegrate.

As an example of that error correction, I remember the company I worked for buying a HUGE (800MB - that's not huge nowadays!) drive. It had the frontage of a full-height 5¼" drive (modern DVD drives are half-height) and was about two feet deep. One thing I picked up was that its error correction involved writing two bytes to disk for every byte the computer asked for. I don't remember the details, but if you had a single-bit-flip error it could work out whether the data byte or the check byte was wrong, and correct it. If you had a double-bit-flip error there was a 90% chance it could work it out. Modern drives almost certainly have something like that.

If you've got raid-6 you can recover from any single-disk corruption/failure - just make sure you run regular scrubs to detect it.

And when you look at filesystems, check whether they protect EVERYTHING, or just the metadata. Most kernel/filesystem developers seem to concentrate on filesystem metadata, reasoning that the most important thing is to get the computer back up and running asap. Imho that's actually arse-about-face - there's no point being able to boot the computer quicker (getting it back to ops staff) if they then have to run a data integrity check before giving it back to the users! Just look for a filesystem that does a checksum or similar on the *data* so it can detect corruption. You'll probably have to switch it on, because it will hurt performance and be disabled by default. I want to make that an option for raid, so that it does an integrity check and returns a read error if there's an integrity failure.

Cheers, Wol
On 12/21/2017 01:22 PM, Wols Lists wrote:
If you've got raid-6 you can recover from any single-disk corruption/failure - just make sure you run regular scrubs to detect it.
One tiny correction: with RAID-6 you can recover from any two failed disks. RAID-5 allows recovery from any single-disk failure.

Regards, Lew
On 21/12/17 22:55, Lew Wolfgang wrote:
On 12/21/2017 01:22 PM, Wols Lists wrote:
If you've got raid-6 you can recover from any single-disk corruption/failure - just make sure you run regular scrubs to detect it.
One tiny correction: with RAID-6 you can recover from any two failed disks. RAID-5 allows recovery from any single-disk failure.
Yes but :-)

Raid 5 will not let you recover from corruption. Raid 6 will, but only one disk ...

Cheers, Wol
On 12/21/2017 03:07 PM, Wols Lists wrote:
On 21/12/17 22:55, Lew Wolfgang wrote:
On 12/21/2017 01:22 PM, Wols Lists wrote:
If you've got raid-6 you can recover from any single-disk corruption/failure - just make sure you run regular scrubs to detect it. One tiny correction: with RAID-6 you can recover from any two failed disks. RAID-5 allows recovery from any single-disk failure.
Yes but :-)
Raid 5 will not let you recover from corruption. Raid 6 will, but only one disk ...
Please explain, you lost me there.

Regards, Lew
On 21/12/17 23:14, Lew Wolfgang wrote:
On 12/21/2017 03:07 PM, Wols Lists wrote:
On 21/12/17 22:55, Lew Wolfgang wrote:
On 12/21/2017 01:22 PM, Wols Lists wrote:
If you've got raid-6 you can recover from any single-disk corruption/failure - just make sure you run regular scrubs to detect it. One tiny correction: with RAID-6 you can recover from any two failed disks. RAID-5 allows recovery from any single-disk failure.
Yes but :-)
Raid 5 will not let you recover from corruption. Raid 6 will, but only one disk ...
Please explain, you lost me there.
Raid 5 has one disk's worth of parity. It will therefore let you reconstruct one piece of missing information: if you know disk 2 has failed, you can reconstruct its contents.

Raid 6 has TWO disks' worth of parity, so you can reconstruct TWO pieces of missing information: if you know disks 2 and 5 have failed, you can reconstruct them both. But if you know ONE drive (you don't know which) has had a hiccup and corrupted your data - maybe a write got lost, maybe (and this apparently does happen) the drive firmware wrote a block in the wrong place - you can also reconstruct that! You can work out which drive has been corrupted (the first piece of missing information) and what the data should have been (the second piece of missing information). There's a program (raid6check, iirc) which will recover that for you.

Cheers, Wol
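raid6check ships with the mdadm sources. As a sketch - the invocation is from memory, so treat the argument order as an assumption and check the man page - examining the first 100 stripes of an array would look something like:

  # raid6check <md device> <first stripe> <number of stripes>
  raid6check /dev/md0 0 100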
On 12/13/2017 08:29 AM, cagsm wrote:
My use case here is also not constant availablity, when a disk needs to be replaced, so be it, but I don't want to lose my data that I can not ever read certain parts of it again or such stuff.
Unless you just have spare disks around and want to play with the additional raid levels, a RAID-1 pair is a great start. With mdraid and RAID-1, unless both drives go at the same time (something I haven't experienced in 17 years), you simply fail/remove the dead one, then add the new one in its place, and it will automatically sync.

Another consideration is 'scrubbing' and disk size. With platters, it takes about an hour per terabyte to scrub (which is recommended weekly). I have 4T (8T total in 4 disks in 2 RAID-1 sets) and it takes about 4.5 hours to scrub. I'm not sure I would want to have to scrub 20T weekly.

And always remember, RAID is just a tool, not a guarantee of data safety. Data not backed up is data lost.

--
David C. Rankin, J.D., P.E.
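The fail/remove/add dance is three mdadm calls. A minimal sketch (the array and partition names /dev/md0, /dev/sdb1 and /dev/sdc1 are assumptions):

  # Mark the dying member failed and pull it from the mirror
  mdadm /dev/md0 --fail /dev/sdb1
  mdadm /dev/md0 --remove /dev/sdb1
  # Add the replacement; the resync starts automatically
  mdadm /dev/md0 --add /dev/sdc1
  # Watch the rebuild
  cat /proc/mdstat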
Revisiting this thread about robust file systems and what to use these days on disk storage: <https://lists.opensuse.org/opensuse/2017-12/msg00433.html>

I recently read again about btrfs and its checksum feature for data too, not just mere metadata. They write that with very recent kernels like 5.5 or even 5.10 there are some hash functions enabled in the kernel that btrfs can use for data checksumming and integrity. Is openSUSE Leap 15.2 currently able to create btrfs on disk with the data checksum feature enabled as well? What would I need to do to make this happen? Current Leap has the older 5.3 Linux kernel. Anyone using the nifty btrfs data checksum features with current openSUSE Leap versions? TY.
On Mon, Dec 21, 2020 at 4:00 PM cagsm <cumandgets0mem00f@gmail.com> wrote:
Revisiting this thread about robust file systems and what to use these days on disk storage: <https://lists.opensuse.org/opensuse/2017-12/msg00433.html>
I recently read again about btrfs and its checksum feature for data too, not just mere metadata. They write that with very recent kernels like 5.5 or even 5.10 there are some hash functions enabled in the kernel that btrfs can use for data checksumming and integrity.
btrfs has had checksums for data from day one. I have no idea what you mean.
Is openSUSE Leap 15.2 currently able to create btrfs on disk with the data checksum feature enabled as well? What would I need to do to make this happen? Current Leap has the older 5.3 Linux kernel. Anyone using the nifty btrfs data checksum features with current openSUSE Leap versions?
TY.
On Mon, Dec 21, 2020 at 4:00 PM cagsm <cumandgets0mem00f@gmail.com> wrote:
Revisiting this thread about robust file systems and what to use these days on disk storage: <https://lists.opensuse.org/opensuse/2017-12/msg00433.html>

btrfs has had checksums for data from day one. I have no idea what you mean.
My understanding was that they first only had metadata checksums, and only crc32 or something they wrote about, and these days they are going for a better checksum algorithm and for the data itself. My bad? So can I happily go for btrfs and restore my data even if my single physical disk has somewhat of an outage - defective blocks? Or what exactly is covered and handled by these checksums for the data bytes themselves? Thanks for the explanation and the thread.
On Mon, Dec 21, 2020 at 4:53 PM cagsm <cumandgets0mem00f@gmail.com> wrote:
On Mon, Dec 21, 2020 at 4:00 PM cagsm <cumandgets0mem00f@gmail.com> wrote:
Revisiting this thread about robust file systems and what to use these days on disk storage: <https://lists.opensuse.org/opensuse/2017-12/msg00433.html>

btrfs has had checksums for data from day one. I have no idea what you mean.
My understanding was that they first only had metadata checksums and
That's incorrect.
only crc32 or something they wrote about, and in these days they go for better checksum algo and for data itself.
Originally only crc32c was implemented; currently btrfs additionally supports xxhash64, sha256 and blake2b. What is "better" or "worse" depends on your criteria. Personally I think that for detecting random corruption crc32c is probably good enough, and no hash function is immune to collisions.
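Picking one of those algorithms happens at mkfs time. A sketch (the device name /dev/sdX1 is a placeholder; --csum needs btrfs-progs >= 5.4 and a kernel new enough for the chosen algorithm, e.g. 5.5+ for xxhash64):

  # Default is crc32c; select xxhash64 instead at creation time
  mkfs.btrfs --csum xxhash64 /dev/sdX1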
My bad? So can I happily go for btrfs and restore my data even if my single physical disk has somewhat of an outage - defective blocks?
To restore defective blocks you need a good copy of those blocks. The btrfs checksum enables btrfs to detect data corruption; it is not a replacement for a second good copy of the data. On a single disk you can use the dup profile for metadata and/or data to have two copies.
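A sketch of that single-disk layout (device name again a placeholder): the dup profile stores two copies of each block on the same device, so it helps against isolated bad sectors but not against losing the whole disk.

  # Two copies of metadata (-m) and of data (-d) on one device
  mkfs.btrfs -m dup -d dup /dev/sdX1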
Or what exactly is covered and handled by these checksums for the data bytes themselves?
Huh? The checksum for data bytes covers the integrity of the data bytes. It allows btrfs to detect data corruption and to avoid silently returning corrupted data to an application. If there is a second (or third) good copy of the same data, btrfs will use it instead and will replace the corrupted copy with a good one (not sure whether that happens automatically on read, though; scrub does it).
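That repair pass is driven by the scrub command; a minimal sketch (the mount point /mnt/data is an assumption):

  # Read every block, verify checksums, repair from a good copy where possible
  btrfs scrub start /mnt/data
  # Check progress and the error counters afterwards
  btrfs scrub status /mnt/data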
participants (9)
- Andrei Borzenkov
- cagsm
- Carlos E. R.
- Dave Howorth
- David C. Rankin
- John Andersen
- Lew Wolfgang
- Paul Neuwirth
- Wols Lists