On Wed, 04 Sep 2013 19:48:52 -0400, Jeff Mahoney wrote:
I don't think it's "overly conservative" though to be cautious about risking your data. I think it's up to those who believe it's stable to demonstrate that it is, and to assure the users that this is a safe filesystem to use.
If it is, then of course, I'd want to use it. But I don't want to take a bigger risk so there can be more data gathered about unrecoverable problems. Does that make sense?
I agree that it's not overly conservative to want to preserve your data. That's the baseline of what you expect from a file system. What I do object to is people saying "no" without having an actual reason for saying so other than being worried about it. Worry is fine, but depending on worries as a data set is problematic.
That's fair, but not everyone has the depth of knowledge of btrfs that you and the other developers do. So the actual reasons tend to be "I heard that" rather than personal experience. That's why a discussion like this is important to have - it gives you the chance to explain why the perceptions are incorrect now, and a chance for those who aren't into the guts of filesystem development to understand how things are progressing.

I've been around online support communities long enough to know that (as I said earlier) we generally don't get reports of "everything's fine", so the data tends to be a bit skewed and highly anecdotal. Not everyone has that view on online support, though - as evidenced by the number of people who come in and find a thread with their specific problem and decry the testers (not realizing they /are/ part of the testing team in a way) for not finding their specific problem before they did.
What are the next steps after taking the filesystem offline in this instance?
Like every file system, it depends on the error case. This is the same reaction to errors that xfs, ext3, ext4, reiserfs, etc. have when they don't want to risk corrupting your data. If it's a disk failure or media access error, that needs to be corrected. If it's corruption, that also needs to be corrected. If it's an ENOMEM, that gets more difficult; it's an area where I want to invest more effort in avoiding, but it's low priority right now. For the most part, though, the allocation requests are small. I've never seen an ENOMEM in the middle of btrfs actually happen.
In most cases, if the error can be corrected without a reboot, a simple umount <fix> remount cycle will be enough.
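The umount/fix/remount cycle described above might look something like this in practice. This is an illustrative sketch only; the device and mount point are made-up examples, and the appropriate "fix" step depends entirely on what the kernel log says went wrong:

```shell
# Hypothetical recovery after btrfs flips read-only on an error.
# /dev/sdb1 and /mnt/data are example names, not from the thread.

dmesg | tail -n 20            # find out why the filesystem went read-only
umount /mnt/data              # take the filesystem offline
btrfsck /dev/sdb1             # the "<fix>" step - check (read-only by default)
mount /dev/sdb1 /mnt/data     # remount and carry on
```

If the check turns up real damage, repair options should be researched before running anything destructive; the read-only check alone is safe.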
Makes sense, thanks. :)
Yeah, we definitely plan on doing wider stress tests in the coming months.
Great. :)
From where I sit, the expectation isn't to eliminate the possibility of errors - but to look for something that's maybe one step better than "good enough".
Agreed. This is an area where we plan to invest more effort.
That's also good to hear.
Would it make a difference if one used a preallocated disk image rather than a dynamic image?
No. The issue isn't the initial allocation, it's the CoW for writes into the image file.
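Not mentioned in the thread, but a commonly documented workaround for CoW overhead on image files is the nodatacow attribute. A rough sketch, assuming the directory path is hypothetical; note that `+C` only takes effect on files that are still empty, so it is usually set on the directory so new files inherit it:

```shell
# Illustrative workaround (assumption, not from the thread): disable CoW
# for VM image files so writes happen in place instead of being copied.
mkdir -p /var/lib/images
chattr +C /var/lib/images                 # new files here inherit nodatacow
truncate -s 20G /var/lib/images/vm1.img   # created after +C, so it's nodatacow
lsattr /var/lib/images/vm1.img            # the 'C' attribute should be listed
```

The trade-off is that nodatacow also disables data checksumming for those files, so it is a deliberate choice, not a free win.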
Ah, I think I understand now.
That sounds like a cool feature. Has anyone played with this on, say, truecrypt encrypted devices as yet? (I have a very large truecrypt encrypted volume that I know has some duplication of data on it, and scripting to remove the duplicates, while not difficult, is something I haven't taken the time to do yet.)
Not specifically, AFAIK.
OK, maybe if I have some time in the coming weeks, I can create a small volume and try formatting it with btrfs to see if it works (the nice thing about truecrypt is the file-based container option for doing tests like this).
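For a quick throwaway test like the one described above, btrfs doesn't even need a truecrypt container - a plain file works as a backing store. A minimal sketch, with example paths:

```shell
# Create and mount a small file-backed btrfs volume for experimentation.
# Paths are illustrative; nothing here touches a real disk.
truncate -s 1G /tmp/btrfs-test.img
mkfs.btrfs /tmp/btrfs-test.img              # mkfs.btrfs accepts a regular file
mkdir -p /tmp/btrfs-test
mount -o loop /tmp/btrfs-test.img /tmp/btrfs-test
# ... poke at it ...
umount /tmp/btrfs-test
rm /tmp/btrfs-test.img
```

The same approach should work on top of a mounted truecrypt container, since btrfs only sees a block device (or file) either way.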
Most of that is over my head - what's the bottom line/impact on this?
The bottom line is that w/o this enabled, things that use a lot of hard links in a single directory can run into EMLINK. You can fix the issue by enabling the extended inode ref feature, but it means that you can no longer mount the file system on older kernels. "Older" in this case means prior to 3.7 IIRC, so oS 12.3.
Ah, OK.
Some general performance numbers would be good to see - as well as performance on large files/small files.
Once we have something we can publish, I'll be happy to share them.
That sounds good.
The baseline off-the-cuff performance shows that performance is similar to ext3 for some workloads, and way off in others, specifically those that are unlink-heavy.
Interesting - is that because of the snapshots, or something else? (I honestly haven't looked too deeply at the snapshot functionality, so may need to set up a test system to play with this on a bit.)
It's good to hear success stories. I'm curious - do you back this up, or is most of the data available elsewhere in the event of an unrecoverable issue? (Of course, "unrecoverable" for you is probably different than it would be for me, since you know the filesystem well enough to manually work on it if necessary).
I don't have backups for anything but time machine bundles, my mail mirror, and photos. The music and videos can be reproduced in a time-consuming manner. Mostly it's a matter of the price of backup space being more expensive than I want to spend. But, yeah, I do have the luxury of knowing where to start to fix it manually if I must.
Similar situation to what I'm in with my 2 TB external drive, it sounds like.
Given that the work for btrfs in SLE and openSUSE is being handled largely by the same people, I think it makes sense to make the comparison.
SLE doesn't yet default to btrfs, though, does it?
SLE11 defaults to ext3 and we don't change the default in a service pack. I can't comment on what the default in SLE12 will be. I'll refer questions about that to our product manager for SLES, Matthias Eckermann. It should be apparent that SUSE is invested in the success of btrfs, though.
Yep, absolutely. openSUSE is the upstream so it does become a significant part of the testbed for SLE as well, so there's certainly a balance to be achieved.
It's good to have the conversation - thank you for the detailed explanation of things. That really helps put my mind at ease that this isn't (as we see from time to time) a "throw it over the wall and see what breaks" approach. It sounds like you've really done your homework and stand behind what you and your team have done to make btrfs production quality. While I still have reservations (and probably will until it reaches some sort of critical mass), my concerns are largely addressed.
Exactly. This is a file system which we've seen deployed with SLE11 SP2/3 and with which we've seen pretty good results. It's also something that we put significant effort into improving even after the initial release of the service pack.
That's good to hear. :)

Jim
--
Jim Henderson