David Haller said the following on 04/18/2011 03:45 PM:
Hello,
sorry for my delay ...
On Tue, 12 Apr 2011, Anton Aylward wrote:
David Haller said the following on 04/12/2011 09:04 AM:
Let's say I'd use LVM, what if e.g. /dev/sde shows defects (seen in the SMART data). Can LVM show me what dirs and files(!) I have on the PV(s) on that probably soon-failing drive?
I think you are asking the wrong question.
Am I? What'd you do if you see a SMART failure for Sector 1234567 on device /dev/sdc?
Nothing. It's also telling me it's remapped the sector. By the time you see that message the drive firmware has seen soft errors, tried correcting them, seen it happen a few times, become unhappy with that sector and remapped it to one of the 'spare' ones reserved for recovery.

Back in the days of SCO XENIX on the PDP-11 I wrote an error-correcting disk driver and saw it made redundant by on-board controllers for SCSI and later; *sigh* it was a crazy hack, having to run on the kernel stack, deferring the logical disk access while it did CRC calculations between interrupts and, if required, did the remapping. http://www.travelnotes.de/california/silicon/oldpdp.htm

My laptop's smartd does in fact report some bum sectors after all these years. Laptops are not as amenable to RAID as desktops and servers. Does it bother me? Yes and no. No, I expect it after a few years. Every disk has its bathtub curve. Yes, it tells me to get a new laptop :-)
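For what it's worth, the attribute the firmware bumps when it does that remapping is Reallocated_Sector_Ct (ID 5). A small sketch of pulling its raw value out of `smartctl -A` output; the table below is made-up sample text standing in for a real drive, where you would pipe `smartctl -A /dev/sdc` instead:

```shell
# Illustrative sample of a `smartctl -A` attribute table (values are made up).
smart_sample='ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0'

# Pull out the raw count of remapped sectors; a rising number over time is
# the signal worth acting on, not any single remap.
echo "$smart_sample" | awk '$2 == "Reallocated_Sector_Ct" { print $NF }'
```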
Wouldn't you want to replay the backup of the affected file _first_ (to another drive), so you have the file in duplicate again? Your backup _can_ fail too, and, per Murphy, it will. Cf.:
We can play *that* mind game into infinite regression. We can also argue about whether you bother backing up "system software" that can be reinstalled, or about how you run integrity checks.
RAID: One more disk fails than can be recovered by the redundancy. -- Andreas Dau
If that drive is failing, then all files in any and all file systems, in any and all LVs, on any and all PVs on that drive are at risk. You don't need to know the individual file names.
Exactly my point! Which PVs and which LVs? Without RAID and LVM (and with ext2/3/4), you have _only one_ file (the one using that failed sector) affected until the drive actually dies. And (with debugfs) I can find out what file is affected.
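That debugfs round trip (bad block -> inode -> pathname) can be rehearsed safely on a throwaway file-backed ext2 image, no root or mounting needed. A sketch; the image path and `myfile` are made up, and on very old e2fsprogs the `blocks` command may not exist:

```shell
# Build a scratch ext2 image (file-backed, so no root needed).
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1024 count=1024 status=none
mkfs.ext2 -q -F -b 1024 "$img"

# Drop a file into the image and note which blocks it occupies.
echo "some payload" > /tmp/payload.$$
debugfs -w -R "write /tmp/payload.$$ myfile" "$img" >/dev/null 2>&1
blk=$(debugfs -R "blocks myfile" "$img" 2>/dev/null | awk '{print $1}')

# The reverse lookup the thread is about: block -> inode -> pathname.
ino=$(debugfs -R "icheck $blk" "$img" 2>/dev/null | awk 'NR==2 {print $2}')
owner=$(debugfs -R "ncheck $ino" "$img" 2>/dev/null | awk 'NR==2 {print $2}')
echo "block $blk belongs to $owner"
rm -f /tmp/payload.$$
```

On a real disk you would feed `icheck` the LBA that SMART reported, converted into the filesystem's block size and offset by the partition start.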
I think you have an incomplete understanding of how LVM works. You might as well make the same argument about any system of partitioning. All LVM is doing is introducing one more level of indirection. That applies to access as well as debugging.

With 'hard sectors' you'd still have to build a complete map between files and physical sectors. With the 'soft sectors' of LVM you build a complete map between files and logical sectors in just the same way -- then, using the tables that the LVM tools supply, map those to the physical sectors. Personally I think the mapping of the files is the big job :-) Has it been done?

Years ago I did something like this in an attempt to come up with a de-fragmentation application; I had it working for some special cases but became frustrated and showed the backers that a backup-mkfs-restore was quicker and more general.

I hear that some people are working on sector-level deduplication -- finding common sector images between files and 'linking' them, and developing a file system to support that. From your POV having more than one file using a sector that goes bad is a high risk, but their scanning tools might be of interest. Yes, it's all computationally heavy, but you're the one that wants that level of detail.
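To make that extra level of indirection concrete, here is a toy sketch: it assumes 4 MiB extents (8192 512-byte sectors) and a made-up segment table of the kind `pvdisplay --maps` and `lvdisplay --maps` report, and it ignores the PV metadata offset (pe_start). The VG layout and the bad sector number are entirely hypothetical:

```shell
# Toy model of LVM's indirection. Assumed segment table for one PV:
#   PE 0..99    -> LV home, LE 0..99
#   PE 100..299 -> LV data, LE 0..199
pe_of_sector() { echo $(( $1 / 8192 )); }   # 4 MiB PEs = 8192 sectors

lv_of_pe() {
    pe=$1
    if   [ "$pe" -lt 100 ]; then echo "home LE $pe"
    elif [ "$pe" -lt 300 ]; then echo "data LE $(( pe - 100 ))"
    else echo "unallocated"
    fi
}

bad_sector=1234567                      # the sector SMART complained about
pe=$(pe_of_sector "$bad_sector")
echo "sector $bad_sector is in PE $pe: LV $(lv_of_pe "$pe")"
```

From there you are back on familiar ground: the LE tells you the offset inside one LV's filesystem, and debugfs takes over as before.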
And of course, you replace the drive as soon as you can.
In an ideal world. In the real world you don't do a Chicken Little on the first error report. Check your drive specs, and don't forget that unless you have a mainframe-grade environmental enclosure for your drives, the chassis, cabling and power supply are all as unreliable as the drives themselves. My laptop's drive has outlasted two batteries, a screen and a touchpad. On my file/mail server, the one that sees the most use, I've had two motherboards die, and the video of a third, closely followed by its internal Ethernet. The disk drives have not been a problem :-)
But you still might want to reduplicate the (now) single ("current") instance of that file in the backup ASAP, so that there, again, is a redundancy between backup and live system. A file only in the backup is not a backup.
That is one reason I use LVM. It makes backups - snapshots - so much easier.
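The snapshot dance, sketched as a dry-run; the VG/LV names, the snapshot size and the mount point are placeholders, and `run` only echoes what it would do:

```shell
# Dry-run sketch of a snapshot-based backup. Names and sizes are assumptions;
# replace run() with direct execution once the values match your system.
VG=vg0; LV=home; SNAP=${LV}_snap
run() { echo "would run: $*"; }

run lvcreate -s -L 1G -n "$SNAP" "/dev/$VG/$LV"  # frozen, consistent view
run mount -o ro "/dev/$VG/$SNAP" /mnt/snap       # mount the snapshot read-only
run tar -C /mnt/snap -czf /backup/home.tgz .     # back it up at leisure
run umount /mnt/snap
run lvremove -f "/dev/$VG/$SNAP"                 # drop the snapshot when done
```

The live filesystem keeps running the whole time; only the snapshot's copy-on-write area (the 1G here) has to be big enough to absorb writes made during the backup.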
Thinking you have to backup/restore or move those files or file systems is "fixed partition" thinking. LVM thinking is tell the LVM system to stop using /dev/sde and to migrate the LVs (and hence the file systems and hence the files) to another PV.
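That migration is the pvmove/vgreduce sequence. A dry-run sketch, with the device and VG names as placeholders and `run` only echoing:

```shell
# Dry-run sketch of evacuating a failing PV. /dev/sde and vg0 are
# placeholders; swap run() for real execution when the names are right.
FAILING_PV=/dev/sde
VG=vg0
run() { echo "would run: $*"; }

run pvmove "$FAILING_PV"          # migrate every extent to the VG's other PVs
run vgreduce "$VG" "$FAILING_PV"  # drop the now-empty PV from the volume group
run pvremove "$FAILING_PV"        # clear the LVM label from the bad disk
```

pvmove works while the filesystems stay mounted; the LVs never notice that their extents changed disks.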
And how do you know that the affected file will not be "migrated" to that "other" PV incorrectly? Will you get (and see) logs showing that that file caused a disk I/O error and that it is corrupted?
That gets back to the indefinite regression argument ... is it real or is it Emulex? Which is the correct one of two mirrored drives? We could go on forever with this game.
LVM is more KISS than all those fixed size partitions.
Do you _REALLY_ understand LVM (and a possible underlying RAID) to the point to fix problems from a rescue system with
- a hex editor (e.g. vche or debug.com)
- e2fsck [-b SUPERBLOCK]
- debugfs
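As it happens, the e2fsck part is easy to rehearse without a rescue system, on a scratch file-backed image. With 1 KiB blocks the first backup superblock sits at block 8193; the image path is a throwaway:

```shell
# Rehearsal: recover from a trashed primary superblock on a scratch image.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1024 count=16384 status=none
mkfs.ext2 -q -F -b 1024 "$img"

# Wreck the primary superblock (block 1 when the block size is 1 KiB).
dd if=/dev/zero of="$img" bs=1024 seek=1 count=1 conv=notrunc status=none

# Point e2fsck at the first backup superblock instead.
e2fsck -fy -b 8193 -B 1024 "$img" >/dev/null 2>&1
status=$?                        # 0 = clean, 1 = errors found and corrected
echo "e2fsck exit status: $status"
rm -f "$img"
```

On a 4 KiB-block filesystem the first backup lives at 32768 instead; `mke2fs -n` on the device will list the actual locations without touching anything.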
No, because that's not *the* way to fix things. It's only *a* way. You're asking a question that presupposes your answer, that your POV is the correct one, and that things are structured in a way that suits those tools. I recall seeing one kindergarten child ask another: "What would you rather do, eat a box of chalk or drink a jar of paint?"

RAID, you say? Can I pick RAID1 -- mirrors? Can I then put the two drives in another box, mount the file systems and do a side-by-side tree-walk to see where the files differ? No need for hex editors ...
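That tree-walk is just diff. A tiny sketch, with two scratch directories standing in for the two mounted mirror halves (/mnt/a and /mnt/b on a real system):

```shell
# Fake the two mirror halves with temp dirs: one file agrees, one doesn't.
a=$(mktemp -d); b=$(mktemp -d)
echo same > "$a/ok";  echo same > "$b/ok"
echo one  > "$a/bad"; echo two  > "$b/bad"

# -r recurse, -q just name the files that differ -- no hex editor needed.
diffs=$(diff -rq "$a" "$b")
echo "$diffs"
rm -rf "$a" "$b"
```

Adding -c to rsync, or a checksum pass with md5sum, does the same job when the trees are too big to diff byte by byte.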
I DO on my system!
Good for you. I don't.
But for resilience and flexibility, using LVM beats out using fixed location and fixed size partitions.
I'm quite happy with my setup. I know and understand symlinks and 'mount --bind' intrinsically, and DOS partitions down to the bit.
And there's nothing to stop you using that on top of LVM.
And: symlinks work over NFS (once the NFS-stuff is mounted). ~10T of my stuff is in the other box, mounted via NFS.
YMMV, of course!
Mine is driven by two things.

The first is that LVM doesn't care how big the drives are. Well, neither does Btrfs; if you want to have just "/" and 'pretend' to have mounted partitions, that may be a good choice as well. I've been playing with Btrfs on one machine and it seems OK. I'll watch how it evolves; it may do very well for SSDs, particularly on a laptop, in place of LVM. Because LVM doesn't care about drive size I can mirror and stripe in ways I can't with traditional RAID.

The second is backup. LVM lets me take snapshots. It makes disk-to-disk-to-tape, or in my case disk-to-disk-to-DVD, very easy. The easier a backup is, the less likely one is to skip it. And those DVDs have usable file systems, so finding an old file and restoring it is easy.

I can do other things with LVM. I can plug in an external drive, make it part of the group, and snapshot to it. Fast backup! Faster than a file system walk. And the backup is a usable file system.

Obsessing about disk failure when using a system that has so many other failure modes that statistics show are equally or more likely strikes me as foolish. In the 'real world' I've been using AIX since it was first released; I was using LVM and RAID on large AIX multiprocessor rigs long before LVM was available on Linux, and was impressed with it.

-- 
The emphasis should be on "why" we do a job
        - W. Edwards Deming
-- 
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org