[opensuse-factory] BtrFS as default fs?
Hi all -

Last month I posted queries to this list (and several other locations, including the forums) asking about people's experiences with btrfs. For the most part it seemed like the experience had improved over time. Most of the concerns were either with interactions with zypper or old perceptions of instability that were based more on old impressions than new testing. With the exception of an ENOSPC issue that had been recently fixed, users actively using the file system seemed pretty satisfied with it.

I posted a followup question a week or two later asking what people thought about limiting the 'supported' feature set in the way we do in SLES so that it's clear to all users which parts of the file system are considered stable.

A quick table of what that looks like:

Supported                     Unsupported
---------                     -----------
Snapshots                     Inode cache
Copy-on-Write                 Auto Defrag
Subvolumes                    RAID
Metadata Integrity            Compression
Data Integrity                Send / Receive
Online metadata scrubbing     Hot add/remove
Manual defrag                 Seeding devices
Manual deduplication (soon)   Multiple devices
                              "Big" Metadata (supported read-only)

Over time this table will change. Items from the Unsupported list will move to the Supported list as they mature.

That proposal was pretty well received except, predictably, by those using the features listed. In practice, all that's required for those users to continue uninterrupted is to add the 'allow_unsupported=1' option to the btrfs module either on the kernel command line or in /etc/modprobe.d. There is nothing inherently limiting to any openSUSE user with this practice. The features are all still in the code and available immediately just by setting a flag. It can even be done safely after module load or even after file systems that don't use the unsupported features have been mounted. I intend to introduce this functionality into openSUSE soon.
One other aspect to consider: Even though they are independent projects, we've been focusing heavily on btrfs support in the SLES product. As a result, the openSUSE kernel will end up getting much of that work 'for free' since most of the same people maintain the kernel for both projects.

So that's the "why it's safe" part of the proposal. I haven't gotten to the "why" yet, but then you probably already know the "whys". Subvolumes. Built-in snapshots that don't corrupt themselves when an exception table runs out of space. Built-in integrity verification via checksums. Built-in proactive metadata semantic checking via scrubbing. Online defrag. Soon we'll see online deduplication of arbitrary combinations of files. The code is written, it just needs to be pulled in. You've seen the rest of the feature set. Once we test more of it under load and ensure that it's mature enough to roll out, you'll get those features for free.

So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.

Thanks for your time.

-Jeff

--
Jeff Mahoney
SUSE Labs
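For readers who want to see what the 'allow_unsupported=1' switch described above looks like in practice, here is a minimal sketch of the two places the flag can be set. The drop-in filename is illustrative, not an official path:

```shell
# Option 1: a modprobe drop-in (filename is illustrative)
#   /etc/modprobe.d/99-btrfs-allow-unsupported.conf
options btrfs allow_unsupported=1

# Option 2: on the kernel command line, where module parameters
# are prefixed with the module name:
#   btrfs.allow_unsupported=1
```

Since Jeff notes the flag can also be flipped after module load, a writable parameter under /sys/module/btrfs/parameters/ would be the likely runtime knob, but that detail is an assumption here.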
On Tuesday 03 September 2013 at 10:32 -0400, Jeff Mahoney wrote:
Hi all -
Last month I posted queries to this list (and several other locations, including the forums) asking about people's experiences with btrfs. For the most part it seemed like the experience had improved over time. Most of the concerns were either with interactions with zypper or old perceptions of instability that were based more on old impressions than new testing. With the exception of an ENOSPC issue that had been recently fixed, users actively using the file system seemed pretty satisfied with it.
I posted a followup question a week or two later asking what people thought about limiting the 'supported' feature set in the way we do in SLES so that it's clear to all users which parts of the file system are considered stable.
A quick table of what that looks like:
Supported                     Unsupported
---------                     -----------
Snapshots                     Inode cache
Copy-on-Write                 Auto Defrag
Subvolumes                    RAID
Metadata Integrity            Compression
Data Integrity                Send / Receive
Online metadata scrubbing     Hot add/remove
Manual defrag                 Seeding devices
Manual deduplication (soon)   Multiple devices
                              "Big" Metadata (supported read-only)
Over time this table will change. Items from the Unsupported list will move to the Supported list as they mature.
That proposal was pretty well received except, predictably, by those using the features listed. In practice, all that's required for those users to continue uninterrupted is to add the 'allow_unsupported=1' option to the btrfs module either on the kernel command line or /etc/modprobe.d. There is nothing inherently limiting to any openSUSE user with this practice. The features are all still in the code and available immediately just by setting a flag. It can even be done safely after module load or even after file systems that don't use the unsupported features have been mounted. I intend to introduce this functionality into openSUSE soon.
My main worry with your proposal is the upgrade path: I would expect some people to have enabled some options (compression, autodefrag) at install time and to forget to disable them (or to set the allow_unsupported flag) when doing a "zypper dup" upgrade, ending up with an unbootable system. (After all, I was using compression until you said it was unstable ;)

Maybe some "trigger" trick should be added when upgrading to the "kernel with unsupported flag": create a drop-in file in /etc/modprobe.d when unsupported flags are detected in /etc/fstab and display a warning (and a link to a webpage giving hints on how to revert to a supported configuration).
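Frederic's "trigger" idea can be sketched as a small POSIX shell function: scan an fstab for btrfs entries carrying options outside the supported set, and write the modprobe drop-in plus a warning if any are found. The unsupported-option list, file name, and paths here are illustrative assumptions, not an official list:

```shell
# Sketch of the proposed upgrade trigger. The option list and the
# drop-in path are illustrative, not official values.
write_unsupported_dropin() {
    fstab="$1"     # fstab to scan (normally /etc/fstab)
    dropin="$2"    # drop-in to create (normally under /etc/modprobe.d)
    found=0
    while read -r dev mnt fstype opts dump pass; do
        # skip blank lines and comments
        case "$dev" in ''|'#'*) continue ;; esac
        [ "$fstype" = "btrfs" ] || continue
        # look for options outside the supported set
        case ",$opts," in
            *,compress,*|*,compress=*|*,compress-force*|*,autodefrag,*)
                found=1 ;;
        esac
    done < "$fstab"
    if [ "$found" -eq 1 ]; then
        printf 'options btrfs allow_unsupported=1\n' > "$dropin"
        printf 'WARNING: unsupported btrfs options in %s; wrote %s\n' \
               "$fstab" "$dropin" >&2
        printf 'See the release notes for reverting to a supported setup.\n' >&2
    fi
}
```

Run against a copy of /etc/fstab, it writes the drop-in only when an unsupported option is actually in use, so systems on the supported feature set are left untouched.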
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
I support this!

--
Frederic Crozat <fcrozat@suse.com>
SUSE
--
To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On 9/3/13 10:41 AM, Frederic Crozat wrote:
On Tuesday 03 September 2013 at 10:32 -0400, Jeff Mahoney wrote:
Hi all -
Last month I posted queries to this list (and several other locations, including the forums) asking about people's experiences with btrfs. For the most part it seemed like the experience had improved over time. Most of the concerns were either with interactions with zypper or old perceptions of instability that were based more on old impressions than new testing. With the exception of an ENOSPC issue that had been recently fixed, users actively using the file system seemed pretty satisfied with it.
I posted a followup question a week or two later asking what people thought about limiting the 'supported' feature set in the way we do in SLES so that it's clear to all users which parts of the file system are considered stable.
A quick table of what that looks like:
Supported                     Unsupported
---------                     -----------
Snapshots                     Inode cache
Copy-on-Write                 Auto Defrag
Subvolumes                    RAID
Metadata Integrity            Compression
Data Integrity                Send / Receive
Online metadata scrubbing     Hot add/remove
Manual defrag                 Seeding devices
Manual deduplication (soon)   Multiple devices
                              "Big" Metadata (supported read-only)
Over time this table will change. Items from the Unsupported list will move to the Supported list as they mature.
That proposal was pretty well received except, predictably, by those using the features listed. In practice, all that's required for those users to continue uninterrupted is to add the 'allow_unsupported=1' option to the btrfs module either on the kernel command line or /etc/modprobe.d. There is nothing inherently limiting to any openSUSE user with this practice. The features are all still in the code and available immediately just by setting a flag. It can even be done safely after module load or even after file systems that don't use the unsupported features have been mounted. I intend to introduce this functionality into openSUSE soon.
My main worry with your proposal is the upgrade path: I would expect some people to have enabled some options (compression, autodefrag) at install time and to forget to disable them (or to set the allow_unsupported flag) when doing a "zypper dup" upgrade, ending up with an unbootable system. (After all, I was using compression until you said it was unstable ;)
Of course. We definitely don't want to break existing users.
Maybe some "trigger" trick should be added when upgrading to the "kernel with unsupported flag": create a drop-in file in /etc/modprobe.d when unsupported flags are detected in /etc/fstab and display a warning (and a link to a webpage giving hints on how to revert to a supported configuration).
Yeah, this is definitely doable. Can we get the YaST team on board with adding that notification and support?
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
I support this !
Thanks!

-Jeff

--
Jeff Mahoney
SUSE Labs
On 03.09.2013 16:44, Jeff Mahoney wrote:
Maybe some "trigger" trick should be added when upgrading to the "kernel with unsupported flag": create a drop-in file in /etc/modprobe.d when unsupported flags are detected in /etc/fstab and display a warning (and a link to a webpage giving hints on how to revert to a supported configuration).
Yeah, this is definitely doable. Can we get the YaST team on board with adding that notification and support?
Wouldn't it be enough to have the kernel panic with a descriptive message when the allow_unsupported flag is needed for the rootfs but not set? Seems less work :-)

And I personally don't like auto-enabling of "dangerous" features without my interaction. But that's just my opinion.

--
Stefan Seyfried

"If your lighter runs out of fluid or flint and stops making fire, and you can't be bothered to figure out about lighter fluid or flint, that is not Zippo's fault." -- bkw
On Tue, Sep 03, 2013 at 04:54:46PM +0200, Stefan Seyfried wrote:
On 03.09.2013 16:44, Jeff Mahoney wrote:
Maybe some "trigger" trick should be added when upgrading to the "kernel with unsupported flag": create a drop-in file in /etc/modprobe.d when unsupported flags are detected in /etc/fstab and display a warning (and a link to a webpage giving hints on how to revert to a supported configuration).
Yeah, this is definitely doable. Can we get the YaST team on board with adding that notification and support?
Wouldn't it be enough to have the kernel panic with a descriptive message when the allow_unsupported flag is needed for the rootfs but not set?
Which might not be the best approach for remotely operated systems.
Seems less work :-)
And I personally don't like auto-enabling of "dangerous" features without my interaction. But that's just my opinion.
Then we need a notification mechanism which is independent of YaST and libzypp, as Arvin suggested in his reply.

Cheers,
Lars

--
Lars Müller [ˈlaː(r)z ˈmʏlɐ]
Samba Team + SUSE Labs
SUSE Linux, Maxfeldstraße 5, 90409 Nürnberg, Germany
On Tue, Sep 03, 2013 at 10:44:14AM -0400, Jeff Mahoney wrote:
On 9/3/13 10:41 AM, Frederic Crozat wrote:
On Tuesday 03 September 2013 at 10:32 -0400, Jeff Mahoney wrote:
That proposal was pretty well received except, predictably, by those using the features listed. In practice, all that's required for those users to continue uninterrupted is to add the 'allow_unsupported=1' option to the btrfs module either on the kernel command line or /etc/modprobe.d. There is nothing inherently limiting to any openSUSE user with this practice. The features are all still in the code and available immediately just by setting a flag. It can even be done safely after module load or even after file systems that don't use the unsupported features have been mounted. I intend to introduce this functionality into openSUSE soon.
My main worry with your proposal is the upgrade path: I would expect some people to have enabled some options (compression, autodefrag) at install time and to forget to disable them (or to set the allow_unsupported flag) when doing a "zypper dup" upgrade, ending up with an unbootable system. (After all, I was using compression until you said it was unstable ;)
Of course. We definitely don't want to break existing users.
Maybe some "trigger" trick should be added when upgrading to the "kernel with unsupported flag": create a drop-in file in /etc/modprobe.d when unsupported flags are detected in /etc/fstab and display a warning (and a link to a webpage giving hints on how to revert to a supported configuration).
Yeah, this is definitely doable. Can we get the YaST team on board with adding that notification and support?
YaST doesn't run when using "zypper dup". The right place for such functionality is an RPM script from my POV.

Regards,
Arvin
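Arvin's point can be sketched as an RPM scriptlet. This is a hypothetical spec-file fragment, not the real openSUSE kernel spec; the detection pattern and drop-in path are assumptions for illustration:

```
# Hypothetical %posttrans scriptlet for the kernel package: runs after
# "zypper dup" has installed the new kernel, keeping existing setups bootable.
%posttrans
if grep -Eqs '^[^#]\S*[[:space:]]+\S+[[:space:]]+btrfs[[:space:]]+\S*(compress|autodefrag)' /etc/fstab; then
    echo "options btrfs allow_unsupported=1" \
        > /etc/modprobe.d/10-btrfs-allow-unsupported.conf
    echo "btrfs: unsupported mount options detected in /etc/fstab;" >&2
    echo "wrote /etc/modprobe.d/10-btrfs-allow-unsupported.conf" >&2
fi
```

%posttrans fits here because it runs once at the end of the whole transaction, after the new kernel and its modules are in place.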
On Tue, 3 Sep 2013, Frederic Crozat wrote:
On Tuesday 03 September 2013 at 10:32 -0400, Jeff Mahoney wrote:
Hi all -
Last month I posted queries to this list (and several other locations, including the forums) asking about people's experiences with btrfs. For the most part it seemed like the experience had improved over time. Most of the concerns were either with interactions with zypper or old perceptions of instability that were based more on old impressions than new testing. With the exception of an ENOSPC issue that had been recently fixed, users actively using the file system seemed pretty satisfied with it.
I posted a followup question a week or two later asking what people thought about limiting the 'supported' feature set in the way we do in SLES so that it's clear to all users which parts of the file system are considered stable.
A quick table of what that looks like:
Supported                     Unsupported
---------                     -----------
Snapshots                     Inode cache
Copy-on-Write                 Auto Defrag
Subvolumes                    RAID
Metadata Integrity            Compression
Data Integrity                Send / Receive
Online metadata scrubbing     Hot add/remove
Manual defrag                 Seeding devices
Manual deduplication (soon)   Multiple devices
                              "Big" Metadata (supported read-only)
Over time this table will change. Items from the Unsupported list will move to the Supported list as they mature.
That proposal was pretty well received except, predictably, by those using the features listed. In practice, all that's required for those users to continue uninterrupted is to add the 'allow_unsupported=1' option to the btrfs module either on the kernel command line or /etc/modprobe.d. There is nothing inherently limiting to any openSUSE user with this practice. The features are all still in the code and available immediately just by setting a flag. It can even be done safely after module load or even after file systems that don't use the unsupported features have been mounted. I intend to introduce this functionality into openSUSE soon.
My main worry with your proposal is the upgrade path: I would expect some people to have enabled some options (compression, autodefrag) at install time and to forget to disable them (or to set the allow_unsupported flag) when doing a "zypper dup" upgrade, ending up with an unbootable system. (After all, I was using compression until you said it was unstable ;)
Maybe some "trigger" trick should be added when upgrading to the "kernel with unsupported flag": create a drop-in file in /etc/modprobe.d when unsupported flags are detected in /etc/fstab and display a warning (and a link to a webpage giving hints on how to revert to a supported configuration).
I really wonder why this is a kernel module parameter and not a mount option. Is it that you consider the unsupported features "running" on one filesystem capable of "corrupting" another filesystem that does not use the unsupported features?

In the table I also miss a checkbox for whether an (un)supported feature manifests itself in the FS metadata in a way that makes the FS unmountable (or mountable only read-only?) with the feature "turned off". That is, can the feature set of a filesystem be autodetected at mount time (thus making even the mount option unnecessary, since you could simply flag a whole filesystem as unsupported)?

As a developer whose main filesystem load is compiling GCC, I'd be interested in a speed comparison to ext3 (which is what I happen to be using). I suppose I cannot rely on the btrfs module for this as we ship it in the 3.0.13-0.27-default SLE11 kernel?

Thanks,
Richard.

--
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend
On Tuesday 03 September 2013 at 16:41 +0200, Frederic Crozat wrote:
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
I support this !
And just some additional advertisement for btrfs / snapper, because it saved my "Factory" day today:
- upgraded my Factory system to today's Factory packages (was running yesterday's Factory)
- rebooted, got issues with Xorg (either not starting or no mouse cursor visible under GNOME)
- there were a lot of packages updated, making it difficult to locate which one was guilty - and with Factory, if a package is broken, you either wait or try to revert to a snapshot, which isn't always easy
- in the end, I just reverted my system to the snapshot taken by snapper right before the update, with "snapper undochanges snapshot_number_before_update..snapshot_number_after_update"
- rebooted
- got back a working system ;)

--
Frederic Crozat <fcrozat@suse.com>
SUSE
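For reference, the revert workflow Frederic describes looks roughly like this as a console session (note the snapper subcommand is spelled `undochange`; the snapshot numbers are illustrative):

```shell
# List snapshots; zypper creates a "pre"/"post" pair around each update
snapper list

# Undo the file changes made between the pre- and post-update snapshots
snapper undochange 412..413
```

This reverts file contents on the running system; it is not a bootloader-level rollback, so a reboot afterwards picks up the restored packages.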
On Thu 12 Sep 2013 09:26:34 AM EDT, Frederic Crozat wrote:
On Tuesday 03 September 2013 at 16:41 +0200, Frederic Crozat wrote:
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
I support this !
And just some additional advertisement for btrfs / snapper, because it saved my "Factory" day today:
- upgraded my Factory system to today's Factory packages (was running yesterday's Factory)
- rebooted, got issues with Xorg (either not starting or no mouse cursor visible under GNOME)
- there were a lot of packages updated, making it difficult to locate which one was guilty - and with Factory, if a package is broken, you either wait or try to revert to a snapshot, which isn't always easy
- in the end, I just reverted my system to the snapshot taken by snapper right before the update, with "snapper undochanges snapshot_number_before_update..snapshot_number_after_update"
- rebooted
- got back a working system
;)
+1

Cheers!
Roman
On Tue, Sep 03, Jeff Mahoney wrote:
Supported                     Unsupported
---------                     -----------
Snapshots                     Inode cache
In the past the snapshots were enabled automatically. I once installed a VM with just a 20G disk, and soon this small disk was filled up after some zypper patch/dup calls. It was not immediately obvious why the root filesystem ran out of space. Google pointed to snapper, and somehow I managed to remove the (unwanted) snapshots.

Unless the snapshot handling has improved to not suddenly fill up the disk during ordinary usage, I suggest disabling them by default. Maybe add a big red button in the installer proposal to easily enable/disable them. Up to now it was easy to install a system/VM by just clicking Next/Next/+ and get a system that continues to work without further interaction. With snapshots enabled, this experience would break.

Olaf
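Olaf's disk-fill scenario is usually addressed through snapper's per-config cleanup settings rather than by disabling snapshots outright. A sketch of /etc/snapper/configs/root; the key names come from snapper's config format, the values are illustrative:

```
# /etc/snapper/configs/root -- keep snapshot growth bounded
NUMBER_CLEANUP="yes"    # clean up the numbered pre/post (zypper) snapshots...
NUMBER_LIMIT="10"       # ...keeping at most this many
TIMELINE_CREATE="no"    # don't also create hourly timeline snapshots
```

Whether the installer should ship such limits by default, or the "big red button" Olaf suggests, is exactly the open question in this thread.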
On 03/09/13 10:32, Jeff Mahoney wrote:

In practice, all that's required for those
users to continue uninterrupted is to add the 'allow_unsupported=1' option to the btrfs module either on the kernel command line or /etc/modprobe.d. There is nothing inherently limiting to any openSUSE user with this practice. The features are all still in the code and available immediately just by setting a flag. It can even be done safely after module load or even after file systems that don't use the unsupported features have been mounted. I intend to introduce this functionality into openSUSE soon.
That's not gonna fly; I will fight that to the death. Please do not add enterprise crippling module parameters to openSUSE, it *really* does not belong there at all. That's absolutely insane.
On Tue, Sep 03, 2013 at 10:32:48AM -0400, Jeff Mahoney wrote:
Hi all -
Last month I posted queries to this list (and several other locations, including the forums) asking about people's experiences with btrfs. For the most part it seemed like the experience had improved over time. Most of the concerns were either with interactions with zypper or old perceptions of instability that were based more on old impressions than new testing. With the exception of an ENOSPC issue that had been recently fixed, users actively using the file system seemed pretty satisfied with it.
I posted a followup question a week or two later asking what people thought about limiting the 'supported' feature set in the way we do in SLES so that it's clear to all users which parts of the file system are considered stable.
A quick table of what that looks like:
Supported                     Unsupported
---------                     -----------
Snapshots                     Inode cache
Copy-on-Write                 Auto Defrag
Subvolumes                    RAID
Metadata Integrity            Compression
Data Integrity                Send / Receive
Online metadata scrubbing     Hot add/remove
Manual defrag                 Seeding devices
Manual deduplication (soon)   Multiple devices
                              "Big" Metadata (supported read-only)
What about quota groups?

Regards,
Arvin
On 9/3/13 11:25 AM, Arvin Schnell wrote:
On Tue, Sep 03, 2013 at 10:32:48AM -0400, Jeff Mahoney wrote:
Hi all -
Last month I posted queries to this list (and several other locations, including the forums) asking about people's experiences with btrfs. For the most part it seemed like the experience had improved over time. Most of the concerns were either with interactions with zypper or old perceptions of instability that were based more on old impressions than new testing. With the exception of an ENOSPC issue that had been recently fixed, users actively using the file system seemed pretty satisfied with it.
I posted a followup question a week or two later asking what people thought about limiting the 'supported' feature set in the way we do in SLES so that it's clear to all users which parts of the file system are considered stable.
A quick table of what that looks like:
Supported                     Unsupported
---------                     -----------
Snapshots                     Inode cache
Copy-on-Write                 Auto Defrag
Subvolumes                    RAID
Metadata Integrity            Compression
Data Integrity                Send / Receive
Online metadata scrubbing     Hot add/remove
Manual defrag                 Seeding devices
Manual deduplication (soon)   Multiple devices
                              "Big" Metadata (supported read-only)
What about quota groups?
Oops, yep. Quota groups are supported too.

-Jeff

--
Jeff Mahoney
SUSE Labs
On Tue, 3 Sep 2013 16:32, Jeff Mahoney <jeffm@...> wrote: [snip]
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
To spare us all a rush of headaches, a couple of points to add:

- Be aware that not every bootloader supports btrfs for /boot (lilo, syslinux, ...)
- For the YaST installer: select the bootloader before partitioning / root-fs selection, and be prepared to 'propose' an extra /boot partition for non-btrfs-aware bootloaders.

Make sure that these points land in the printed manual, and the wiki.

- Yamaban.
On 09/03/2013 12:26 PM, Yamaban wrote:
On Tue, 3 Sep 2013 16:32, Jeff Mahoney <jeffm@...> wrote: [snip]
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
To spare us all a rush of headaches, a couple of points to add:
- Be aware that not every bootloader supports btrfs for /boot (lilo, syslinux, ...)
- For the YaST installer: select the bootloader before partitioning / root-fs selection, and be prepared to 'propose' an extra /boot partition for non-btrfs-aware bootloaders.
Make sure that these points land in the printed manual, and the wiki.
- Yamaban.
Doesn't btrfs support booting the system without a /boot partition?

--
Cheers!
Roman
On 9/3/13 2:22 PM, Roman Bysh wrote:
On 09/03/2013 12:26 PM, Yamaban wrote:
On Tue, 3 Sep 2013 16:32, Jeff Mahoney <jeffm@...> wrote: [snip]
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
To spare us all a rush of headaches, a couple of points to add:
- Be aware that not every bootloader supports btrfs for /boot (lilo, syslinux, ...)
- For the YaST installer: select the bootloader before partitioning / root-fs selection, and be prepared to 'propose' an extra /boot partition for non-btrfs-aware bootloaders.
Make sure that these points land in the printed manual, and the wiki.
- Yamaban.
Doesn't btrfs support booting the system without a /boot partition?
Yes, but his point was that while GRUB2 supports it (and GRUB1 with patches, I forget if we're carrying those), other boot loaders may not, and YaST should be intelligent enough to know which ones don't. That's where the separate /boot comes in.

-Jeff

--
Jeff Mahoney
SUSE Labs
On Tue, 3 Sep 2013 20:22, Roman Bysh <rbtc1@...> wrote:
On 09/03/2013 12:26 PM, Yamaban wrote:
On Tue, 3 Sep 2013 16:32, Jeff Mahoney <jeffm@...> wrote: [snip]
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
To spare us all a rush of headaches, a couple of points to add:
- Be aware that not every bootloader supports btrfs for /boot (lilo, syslinux, ...)
- For the YaST installer: select the bootloader before partitioning / root-fs selection, and be prepared to 'propose' an extra /boot partition for non-btrfs-aware bootloaders.
Make sure that these points land in the printed manual, and the wiki.
- Yamaban.
Doesn't btrfs support booting the system without a /boot partition?
It's a matter of the bootloader. If the bootloader supports btrfs for /boot (e.g. grub2, gummiboot, ...), no extra partition for /boot is needed; all other bootloaders will NEED a /boot partition that is NOT btrfs, but e.g. ext[234].

For at least another 7-12 years, boot via BIOS will matter. And not all BIOS-capable bootloaders support btrfs for /boot.

Personally, I have a well-founded HATE for grub and grub2, and I'm not alone in that.

- Yamaban.
On 03/09/13 15:08, Yamaban wrote:
For at least another 7-12 years, boot via BIOS will matter. And not all BIOS-capable bootloaders support btrfs for /boot.
Did you mean UEFI? It only supports FAT32 AFAIK.
On Tue, 3 Sep 2013 21:15, Cristian Rodríguez <crrodriguez@...> wrote:
On 03/09/13 15:08, Yamaban wrote:
For at least another 7-12 years, boot via BIOS will matter. And not all BIOS-capable bootloaders support btrfs for /boot.
Did you mean UEFI ? it only supports FAT32 AFAIK.
Sorry, but ARGHH! With UEFI, the uefi-parameter-data partition will be mounted INSIDE /boot as /boot/efi, AFAIK, and yes, the format is fat32. The bootloader is a whole other kettle of fish. Documentation (on booting, bootloaders, BIOS, UEFI) is available online.

- Yamaban.
On 09/03/2013 03:08 PM, Yamaban wrote:
On Tue, 3 Sep 2013 20:22, Roman Bysh <rbtc1@...> wrote:
On 09/03/2013 12:26 PM, Yamaban wrote:
On Tue, 3 Sep 2013 16:32, Jeff Mahoney <jeffm@...> wrote: [snip]
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
To spare us all a rush of headaches, a couple of points to add:
- Be aware that not every bootloader supports btrfs for /boot (lilo, syslinux, ...)
- For the YaST installer: select the bootloader before partitioning / root-fs selection, and be prepared to 'propose' an extra /boot partition for non-btrfs-aware bootloaders.
Make sure that these points land in the printed manual, and the wiki.
- Yamaban.
Doesn't btrfs support booting the system without a /boot partition?
It's a matter of the bootloader.
If the bootloader supports btrfs for /boot, e.g grub2, gummiboot, ... no extra partition for /boot needed, all other bootloader will NEED a /boot partiton that is NOT btrfs, but e.g. ext[234]-fs.
For at least another 7-12 years, boot via BIOS will matter. And not all BIOS-capable bootloaders support btrfs for /boot.
Personally, I have a well founded HATE for grub and grub2, and I'm not alone in that.
- Yamaban.
It may be time for a better bootloader than Grub and Grub2.

Cheers!
Roman
On 09/03/2013 10:32 AM, Jeff Mahoney pecked at the keyboard and wrote:
Hi all -
Last month I posted queries to this list (and several other locations, including the forums) asking about people's experiences with btrfs. For the most part it seemed like the experience had improved over time. Most of the concerns were either with interactions with zypper or old perceptions of instability that were based more on old impressions than new testing. With the exception of an ENOSPC issue that had been recently fixed, users actively using the file system seemed pretty satisfied with it.
I posted a followup question a week or two later asking what people thought about limiting the 'supported' feature set in the way we do in SLES so that it's clear to all users which parts of the file system are considered stable.
A quick table of what that looks like:
Supported                     Unsupported
---------                     -----------
Snapshots                     Inode cache
Copy-on-Write                 Auto Defrag
Subvolumes                    RAID
Metadata Integrity            Compression
Data Integrity                Send / Receive
Online metadata scrubbing     Hot add/remove
Manual defrag                 Seeding devices
Manual deduplication (soon)   Multiple devices
                              "Big" Metadata (supported read-only)
Over time this table will change. Items from the Unsupported list will move to the Supported list as they mature.
That proposal was pretty well received except, predictably, by those using the features listed. In practice, all that's required for those users to continue uninterrupted is to add the 'allow_unsupported=1' option to the btrfs module either on the kernel command line or /etc/modprobe.d. There is nothing inherently limiting to any openSUSE user with this practice. The features are all still in the code and available immediately just by setting a flag. It can even be done safely after module load or even after file systems that don't use the unsupported features have been mounted. I intend to introduce this functionality into openSUSE soon.
One other aspect to consider: Even though they are independent projects, we've been focusing heavily on btrfs support in the SLES product. As a result, the openSUSE kernel will end up getting much of that work 'for free' since most of the same people maintain the kernel for both projects.
So that's the "why it's safe" part of the proposal. I haven't gotten to the "why" yet, but then you probably already know the "whys". Subvolumes. Built-in snapshots that don't corrupt themselves when an exception table runs out of space. Built-in integrity verification via checksums. Built-in proactive metadata semantic checking via scrubbing. Online defrag. Soon we'll see online deduplication of arbitrary combinations of files. The code is written, it just needs to be pulled in. You've seen the rest of the feature set. Once we test more of it under load and ensure that it's mature enough to roll out, you'll get those features for free.
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
Thanks for your time.
-Jeff
Not as long as any items are in the unsupported column and as long as there is no tool to repair a broken filesystem. -- Ken Schneider SuSe since Version 5.2, June 1998 -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On 9/3/13 2:54 PM, Ken Schneider - openSUSE wrote:
Not as long as any items are in the unsupported column and as long as
The unsupported features might as well be "unimplemented" for the purposes of this discussion.
there is no tool to repair a broken filesystem.
There is a btrfsck tool. Have you encountered a file system it was unable to repair? Bugzilla IDs? The tool can only improve with the reporting of different types of corruption. Even e2fsck still receives regular updates. -Jeff -- Jeff Mahoney SUSE Labs
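As a rough illustration of the repair workflow under discussion (commands from btrfs-progs; the device path is hypothetical, and repair mode should only be run on an unmounted filesystem, ideally after taking a backup):

```shell
# Read-only check first: reports metadata/extent problems without modifying anything
btrfsck /dev/sdb1

# Only as a last resort, attempt an in-place repair of the unmounted filesystem
btrfsck --repair /dev/sdb1
```

A failure mode btrfsck cannot fix is exactly the kind of report the developers are asking for in Bugzilla.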
On 09/03/2013 03:04 PM, Jeff Mahoney pecked at the keyboard and wrote:
On 9/3/13 2:54 PM, Ken Schneider - openSUSE wrote:
Not as long as any items are in the unsupported column and as long as
The unsupported features might as well be "unimplemented" for the purposes of this discussion.
there is no tool to repair a broken filesystem.
There is a btrfsck tool.
OK, I was not aware of this. I just hate to see an "experimental" filesystem made the default. I also don't want to see the debacle that we saw with making KDE4 the default when it was clearly alpha at best. -- Ken Schneider SuSe since Version 5.2, June 1998
On 9/3/13 4:56 PM, Ken Schneider - openSUSE wrote:
On 09/03/2013 03:04 PM, Jeff Mahoney pecked at the keyboard and wrote:
On 9/3/13 2:54 PM, Ken Schneider - openSUSE wrote:
Not as long as any items are in the unsupported column and as long as
The unsupported features might as well be "unimplemented" for the purposes of this discussion.
there is no tool to repair a broken filesystem.
There is a btrfsck tool.
OK, I was not aware of this. I just hate to see an "experimental" filesystem made the default. I also don't want to see the debacle that we saw with making KDE4 the default when it was clearly alpha at best.
That's why we make the effort of marking some features as immature. The core file system is stable. It's the additional features that need some testing time. They may work fine. We just haven't invested the time to determine which other features are ready and don't want to represent them as such. -Jeff -- Jeff Mahoney SUSE Labs
On 09/03/2013 02:02 PM, Jeff Mahoney wrote:
That's why we make the effort of marking some features as immature. The core file system is stable. It's the additional features that need some testing time. They may work fine. We just haven't invested the time to determine which other features are ready and don't want to represent them as such.
Well, btrfs didn't work for me when trying to configure a single partition of about 18 TB on 12.3. The filesystem could be created, but would crash halfway through a "fill-em-up" timing test. I'll try the test again with Factory if I can within the next few days. Regards, Lew
On 9/3/13 7:57 PM, Lew Wolfgang wrote:
Well, btrfs didn't work for me when trying to configure a single partition of about 18 TB on 12.3. The filesystem could be created, but would crash halfway through a "fill-em-up" timing test. I'll try the test again with Factory if I can within the next few days.
Do you have a bugzilla ID for this? -Jeff -- Jeff Mahoney SUSE Labs
On 09/03/2013 04:59 PM, Jeff Mahoney wrote:
Do you have a bugzilla ID for this?
828229. I filed it on July 12, 2013, but haven't heard a thing since then. Regards, Lew
On 9/3/13 8:25 PM, Lew Wolfgang wrote:
828229. I filed it on July 12 2013, but haven't heard a thing since then.
Ok, that's a spurious ENOSPC. That's an area where there have been many fixes in the past year. I don't suppose you still have this array available to retest with 3.11? -Jeff -- Jeff Mahoney SUSE Labs
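For anyone hitting a similar spurious ENOSPC, the usual first step is to look at how btrfs has allocated its space by chunk type, since data chunks can be exhausted while metadata has room (or vice versa). The mount point below is illustrative, and the balance usage filter is a common reclaim technique rather than a guaranteed fix:

```shell
# Show space allocation by chunk type (Data / Metadata / System)
btrfs filesystem df /mnt/bigarray

# Rebalance mostly-empty data chunks to return over-allocated space to the pool
btrfs balance start -dusage=5 /mnt/bigarray
```

On a 3.11-era kernel with the recent ENOSPC fixes, the fill test above would be a good regression check.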
On 09/04/2013 08:05 AM, Jeff Mahoney wrote:
On 9/3/13 8:25 PM, Lew Wolfgang wrote:
On 9/3/13 7:57 PM, Lew Wolfgang wrote:
On 09/03/2013 02:02 PM, Jeff Mahoney wrote:
On 09/03/2013 03:04 PM, Jeff Mahoney pecked at the keyboard and wrote:
> On 9/3/13 2:54 PM, Ken Schneider - openSUSE wrote:
>> On 09/03/2013 10:32 AM, Jeff Mahoney pecked at the keyboard and wrote:
>>> [original proposal quoted in full; trimmed here]
>>
>> Not as long as any items are in the unsupported column and as long as
>> there is no tool to repair a broken filesystem.
>
> The unsupported features might as well be "unimplemented" for the
> purposes of this discussion.
>
> There is a btrfsck tool.

OK, I was not aware of this. I just hate to see an "experimental" filesystem made the default.
I also don't want to see the debacle that we saw with making KDE4 the default when it was clearly alpha at best.

On 9/3/13 4:56 PM, Ken Schneider - openSUSE wrote:
> OK, I was not aware of this. I just hate to see an "experimental"
> filesystem made the default.

That's why we make the effort of marking some features as immature. The core file system is stable. It's the additional features that need some testing time. They may work fine. We just haven't invested the time to determine which other features are ready and don't want to represent them as such.
Well, btrfs didn't work for me when trying to configure a single partition of about 18 TB on 12.3. The filesystem could be created, but would crash half way through a "fill-em-up" timing test. I'll try the test again with Factory if I can within the next few days.

Do you have a bugzilla ID for this?
On 09/03/2013 04:59 PM, Jeff Mahoney wrote:

828229. I filed it on July 12, 2013, but haven't heard a thing since then.

Ok, that's a spurious ENOSPC. That's an area where there have been many fixes in the past year. I don't suppose you still have this array available to retest with 3.11?
No, the original system is in production. But five more just came in that I can play with. I hope to get some time with them within a week or so.

Regards,
Lew
--
To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
-----Original Message-----
From: Ken Schneider - openSUSE <suse-list3@bout-tyme.net>
To: opensuse-factory@opensuse.org
Subject: Re: [opensuse-factory] BtrFS as default fs?
Date: Tue, 03 Sep 2013 16:56:38 -0400

> OK, I was not aware of this. I just hate to see an "experimental"
> filesystem made the default. I also don't want to see the debacle that
> we saw with making KDE4 the default when it was clearly alpha at best.

Strange, with systemd we heard quite some other opinions. With btrfs and grub2 (which I do favor and have used for quite some time now) you still have alternatives...

hw
Ken Schneider - openSUSE wrote:
On 09/03/2013 03:04 PM, Jeff Mahoney pecked at the keyboard and wrote:
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
OK, I was not aware of this. I just hate to see an "experimental" filesystem made the default. I also don't want to see the debacle that we saw with making KDE4 the default when it was clearly alpha at best.

====

Making a file system that is "experimental" the default seems incredibly risky. Has anyone done any file system benchmarks comparing it against other file systems? (I haven't seen any.)
Going with a file system that hasn't been tuned, or hasn't been compared against existing filesystems, seems to be a serious case of leaping before looking.
On Tue, Sep 3, 2013 at 2:04 PM, Jeff Mahoney <jeffm@suse.com> wrote:
On 9/3/13 2:54 PM, Ken Schneider - openSUSE wrote:
there is no tool to repair a broken filesystem.
There is a btrfsck tool. Have you encountered a file system it was unable to repair? Bugzilla IDs? The tool can only improve with the reporting of different types of corruption. Even e2fsck still receives regular updates.
I have encountered at least *three* filesystems (on three different machines) that btrfsck was unable to repair or made worse. The btrfs mailing list states very explicitly that if you aren't running the latest btrfsck you risk problems. I filed a number of bugs (in the openSUSE bugzilla and in the kernel bugzilla), none of which have been fixed. Last time I checked there were _pages_ of unfixed btrfs bug reports, and at least as of 3.10 I have a filesystem that is still unrepairable. The btrfs folks have been unable to repair it.

Making btrfs the default might work well (insofar as lots and lots of people will find what issues remain), but I can't imagine that it's even remotely close to ext4 in terms of general stability - yet.

I must also chime in to suggest that the 'allow_unsupported' approach is rather flawed. At the very most, I might suggest that using a given 'immature' feature should result in a warning (as has been suggested here). I would not ever condone outright function removal, especially since the stock kernel has no such limitations.

Please do not take my comments here to mean that I don't immensely appreciate the work done by you and many others, but I believe it is not unreasonable to say that we probably disagree as to whether or not btrfs is stable enough to be made the default filesystem.

--
Jon
On Tue, Sep 3, 2013 at 7:41 PM, Jon Nelson <jnelson-suse@jamponi.net> wrote:
Please do not take my comments here to mean that I don't immensely appreciate the work done by you and many others, but I believe it is not unreasonable to say that we probably disagree as to whether or not btrfs is stable enough to be made the default filesystem.
Also, consider the target audience of default filesystems. These aren't for the power users, who will consciously pick the filesystem they like best. Default filesystems are for the granny and the newbie, those who cannot and will not fix low-level filesystem issues.

I haven't used BtrFS recently, but unless it's granny-proof, I'd think twice before making it default.
On Tue, 03 Sep 2013 19:55:49 -0300, Claudio Freire wrote:
Also, consider the target audience of default filesystems.
These aren't for the power users, which will consciously pick the filesystem they like best. Default filesystems are for the granny and the newbie, those that cannot and will not fix low-level filesystem issues.
I haven't used BtrFS recently, but unless it's granny-proof, I'd think twice before making it default.
+1. Changing the default to btrfs is going to increase the number of people posting problems in the forums. It still seems to be considered "unstable" or "experimental", and if so, it shouldn't be selected as the default.

Jim
--
Jim Henderson
Please keep on-topic replies on the list so everyone benefits
On 9/3/13 7:00 PM, Jim Henderson wrote:
On Tue, 03 Sep 2013 19:55:49 -0300, Claudio Freire wrote:
Also, consider the target audience of default filesystems.
These aren't for the power users, which will consciously pick the filesystem they like best. Default filesystems are for the granny and the newbie, those that cannot and will not fix low-level filesystem issues.
I haven't used BtrFS recently, but unless it's granny-proof, I'd think twice before making it default.
+1. Changing the default to btrfs is going to increase the number of people having problems posting in the forums. It still seems to be considered "unstable" or "experimental", and if so, shouldn't be selected as the default.
Well, that's the main thrust behind the "allow unsupported" module option. We have the feature set that we've evaluated to be mature, and that's what we allow by default.

When I cast a wide net across forums and mailing lists last month asking for user experiences, I got a lot of uninformed opinion and very little concrete data. Most of the negative data was in the area of snapper being too aggressive in creating snapshots and not aggressive enough in cleaning them up. There was some negative opinion WRT the file system itself, but most of it was in the realm of "I heard..." or "I don't trust it" based on too much hearsay and too little experience. It's that kind of rumor-response that is unhelpful in making decisions or improving the pain points with the file system. There were a few reports of people having troubles with the file system itself, but they tended to be with compression or RAID enabled -- the features that we don't entirely trust yet and want to disable so the casual user doesn't become an unwitting beta tester.

So whether it's "considered" unstable or experimental largely depends on what features are being tested and who's doing the testing. A lot of times it involves armchair punditry and no testing at all.

-Jeff
--
Jeff Mahoney
SUSE Labs
On Tue, 03 Sep 2013 19:59:06 -0400, Jeff Mahoney wrote:
Well that's the main thrust behind the "allow unsupported" module option. We have the feature set that we've evaluated to be mature and that's what we allow by default.
That seems a little counterintuitive to me. Allowing unsupported features would seem to indicate those features are immature, rather than mature. Am I missing something?
When I cast a wide net across forums and mailing lists last month asking for user experiences, I got a lot of uninformed opinion and very little concrete data.
Concrete data might be hard to come by, but I don't know that the way to get it is to risk new users' data (I assume upgrades wouldn't be affected - but is that a safe assumption?) to gather it. It needs to be an opt-in, not a default that may result in users losing data.
Most of the negative data was in the area of snapper being too aggressive in creating snapshots and not aggressive enough in cleaning them up. There was some negative opinion WRT the file system itself, but most of it was in the realm of "I heard..." or "I don't trust it" based on too much hearsay and too little experience. It's that kind of rumor-response that is unhelpful in making decisions or improving the pain points with the file system. There were a few reports of people having troubles with the file system itself, but they tended to be with compression or RAID enabled -- the features that we don't entirely trust yet and want to disable so the casual user doesn't become an unwitting beta tester.
So whether it's "considered" unstable or experimental largely depends on what features are being tested and who's doing the testing. A lot of times it involves armchair punditry and no testing at all.
So for users to accept that their data is safe (or at least no less safe than it is with current - more mature - filesystems like ext4), don't set the default, but sell us on the idea. Tell us more about how the filesystem has improved, what the current outstanding issues are, and how they're being addressed.

A lot of individuals aren't willing to test an unproven filesystem because of the risk to their data, or of ending up in a situation where the system has to be reinstalled. Myself, my openSUSE systems are my production work environment - so I need to be confident that I'm not going to lose critical data (yes, I do back up the most critical data) and that I'm not going to lose billable hours having to rebuild a system because the filesystem became inconsistent. I can certainly put it in a VM, but it's not going to get a thorough "real-world" workout there.

OSS is all about transparency, so let's hear a little more about how btrfs has improved in the past 12-18 months.

Jim
--
Jim Henderson
Please keep on-topic replies on the list so everyone benefits
On 9/3/13 8:36 PM, Jim Henderson wrote:
On Tue, 03 Sep 2013 19:59:06 -0400, Jeff Mahoney wrote:
Well that's the main thrust behind the "allow unsupported" module option. We have the feature set that we've evaluated to be mature and that's what we allow by default.
That seems a little counterintuitive to me. Allowing unsupported features would seem to indicate those features are immature, rather than mature. Am I missing something?
Yes, I'm proposing the opposite of that. The "allow unsupported" option would be disabled by default and is the "guard" in front of those immature features.
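For readers wondering what lifting that guard would actually look like, here is a minimal sketch. The option name comes from this thread; the config file name is an illustrative assumption, and a temp directory stands in for /etc/modprobe.d (which needs root):

```shell
# Boot-time form: append  btrfs.allow_unsupported=1  to the kernel command line.
# Persistent form: an options line in /etc/modprobe.d, sketched here against
# a temp directory; the file name "90-btrfs-unsupported.conf" is made up.
moddir=$(mktemp -d)
printf 'options btrfs allow_unsupported=1\n' > "$moddir/90-btrfs-unsupported.conf"
cat "$moddir/90-btrfs-unsupported.conf"
```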
When I cast a wide net across forums and mailing lists last month asking for user experiences, I got a lot of uninformed opinion and very little concrete data.
Concrete data might be hard to come by, but I don't know that the way to get it is to risk new users' data (I assume upgrades wouldn't be affected - but is that a safe assumption?) to gather it. It needs to be an opt-in, not a default that may result in users losing data.
My point was more that, for the most part, those users who've actually been using btrfs weren't the ones chiming in and claiming that it's not trustworthy yet. It's the ones who're nervous about trying it at all and are being overly conservative to the point of derailing the conversation without data to back their apprehension.
Most of the negative data was in the area of snapper being too aggressive in creating snapshots and not aggressive enough in cleaning them up. There was some negative opinion WRT the file system itself, but most of it was in the realm of "I heard..." or "I don't trust it" based on too much hearsay and too little experience. It's that kind of rumor-response that is unhelpful in making decisions or improving the pain points with the file system. There were a few reports of people having troubles with the file system itself, but they tended to be with compression or RAID enabled -- the features that we don't entirely trust yet and want to disable so the casual user doesn't become an unwitting beta tester.
So whether it's "considered" unstable or experimental largely depends on what features are being tested and who's doing the testing. A lot of times it involves armchair punditry and no testing at all.
So for users to accept that their data is safe (or at least no less safe than it is with current - more mature - filesystems like ext4), don't set the default, but sell us on the idea. Tell us more about how the filesystem has improved, what the current outstanding issues are, and how they're being addressed.
A lot of individuals aren't willing to test an unproven filesystem because of the risk to their data, or end up in a situation where the system has to be reinstalled. Myself, my openSUSE systems are my production work environment - so I need to be confident that I'm not going to lose critical data (which yes, I do back up the most critical data) and I'm not going to lose billable hours having to rebuild a system because the filesystem became inconsistent. I can certainly put it in a VM, but it's not going to get a thorough "real-world" workout there.
OSS is all about transparency, so let's hear a little more about how btrfs has improved in the past 12-18 months.
Absolutely. That's a completely reasonable request.

The areas in which btrfs has historically been weak include:

- Error handling
  - This is an ongoing effort, mostly completed in v3.4, where certain classes of errors are handled by taking the file system "offline" (read-only).
- ENOSPC
  - Case 1: incorrect calculation of reservation size
    - This is the case which Lew Wolfgang encountered in the bug report he mentioned. These cases /should/ be mostly fixed. I haven't seen one in a while, where "a while" is defined in terms of kernel release versions, not linear clock time.
  - Case 2: unable to free space on a full file system
    - This is probably the most infuriating case. The gist is that in a CoW file system, blocks may need to be allocated in order to free other blocks. If we get into a pathological situation where all of the blocks are in use, then we essentially encounter a deadlocked file system where the shared resource is the free block count. This has been fixed with the introduction of a reserved metadata block pool that can only be used for removal operations when ENOSPC has already been encountered. I fixed a case of this last month so that subvolume removal should succeed when the file system is full. I believe these to be pretty much eliminated now. If I'm wrong about that, the good news is that since we already have the reserved metadata block pool implemented, the fix is only about 5 lines as a fallback case for an ENOSPC handler.
- btrfsck
  - In truth, not as powerful as it should be yet.
  - Realistically, there's only so much that can be done without encountering novel broken file systems and collecting metadata images for analysis. There's a chicken/egg problem here in that there's no way to create a fsck tool that is prepared to encounter actual failure cases without seeing the failure cases. Sure, we can write up a tool that can predict certain cases, but there will *always* be surprises. e2fsck is still evolving as well.
  - The difference between btrfs and ext[234] is that we can 'scrub' the file system online to detect errors before they're actually encountered.
- VM image performance
  - Performance is generally regarded as horrible.
  - This is because CoW on what is essentially a block device backing store means a ton of write amplification for each write that the VM issues.
  - The file system supports a 'nodatacow' file attribute: chattr -C <file>. This attribute changes the CoW behavior of file writes such that a write causes a CoW to be performed only if there is more than one reference held on the data extent.
  - Caveat: There is currently a strange corner case where nodatacow prevents a reflink copy but allows a snapshot of the subvolume to make a snapshot of the file. (They're essentially the same thing on the back end.)
  - Solution(s): 1) Remove the distinction between the reflink copy and the snapshot cloning. 2) Always handle CoW on data extents as overwrites when there is only a single reference on the extent.
  - 1) is probably a bug fix, while 2) may meet with resistance within the file system development community since the CoW behavior also ensures that no parts of the data are overwritten and we already have a way to do this with chattr -C.

New features:

- Deduplication
  - SUSE's Mark Fasheh has added an extension to the clone ioctl that allows us to do an 'offline' (read: out-of-band) deduplication of data extents.
  - "Offline" doesn't mean unmounted - it means that the user makes use of an external tool that implements the deduplication policy.
  - Not "perfect" dedupe like "online" (in-band) implementations, but without the I/O amplification behavior that online dedupe has. I'm happy to discuss this further if there's interest in hearing more.
- Removal of the strange per-directory hard link limit
  - Due to the backreferences to a single inode needing to fit in a single file system block, there was a limit to the number of hard links in a single directory. It could be quite low.
  - Limit removed by adding a new extended inode ref item; not enabled by default yet since it's a disk format change. The extended inode ref is only used when required since it's not as space-efficient as the single inode ref item. There's probably room for discussion within the file system community on whether we'd want to add an "ok to change" bit so that file systems have the ability to use the new extended inode ref items when needed but don't set the incompat bit until they're actually used. The other side of that coin is that it may not be clear to users when/if their file system has become incompatible with older kernels.
- In-place conversion of reiserfs filesystems
  - Similar to the ext[234] converter - converts reiserfs filesystems to btrfs using the free space in the reiserfs filesystem.
  - See home:jeff_mahoney:convert for code and packages (still beta).

Areas that still need work:

- Error handling
  - Not in the handling-failure-cases sense, but in the fsfuzzer sense.
- btrfsck
  - As I mentioned, we need broken file systems to fix in order to improve the tool.
- General performance
  - For a root file system with general user activity, it performs reasonably well. I've asked one of my team to come up with solid performance numbers so that we can 1) demonstrate where the file system is performing relative to the usual suspects, and 2) identify where we need to focus our efforts.
  - Historically, fsync() was a problem spot but that's been mitigated with the introduction of a "tree log" that is similar to a journal but is really just used to accelerate fsync.

FWIW, I've had a 4 TB btrfs file system with multiple subvolumes running for several years now. It sits on top of a 3-disk MDRAID5 volume. I've never encountered a file system corruption issue with it, though I have seen a few crashes. The last one was well over a year ago. There have been a few power outages as well without ill result.
This is my "production" file system as far as that goes for me. It's not high throughput, but it does serve as my "everything" volume, serving the local copies of my git trees for work, music, videos, and hosting time machine backups for my wife, etc.

I know people don't really want to compare SLES with openSUSE, but here's a case in which the story matters. We've been offering official support for btrfs since SLE11 SP2. SP3 was released a few months ago. Many people thought we were insane to do so because OMG BTRFS IS STILL EXPERIMENTAL, but we've crafted a file system implementation that *is* supportable. Between limiting the feature set for which we offer support and our kernel teams aggressively identifying and backporting fixes that may not have been pulled into the mainline kernel yet (more a factor of the maintainers being busy than the patches not being fully baked), we've created a pretty solid file system implementation.

_THAT_ is why I do things like suggest that we have a similar "supported feature set" for openSUSE. It's not about limiting choice, though I suppose that's a side effect. It's about making it clear which parts of the file system are mature enough to be trusted, and not just assuming that paid enterprise users are the only ones who care about things like that.

Someone asked in another thread about who gets to determine whether or not a feature is mature enough. I'll be honest here. I lead the SUSE Labs Storage and File Systems team. We do, with significant informed feedback from our users. We perform testing on the file system and cooperate with our QA staff to perform more testing on it. If we encounter bugs, we fix them. If we perform a ton of testing and don't encounter bugs, we start to lean to the "mature" phase. We depend on experienced users who don't mind playing with untested tech to file bug reports.
If being in a Linux support environment for the past 14 years has taught me anything, it's that users of software are much more creative about breaking things than the developers are. [This was also true when I was on the other side of the fence in my previous life as a big-UNIX sysadmin. Real support response when encountering a pretty funny error code on a large disk array's RAID controller: "You saw what? You should never see that."]

All that said, it's possible that some of the things listed in the "unsupported" list work fine and could be considered mature already. That's a matter of testing and confirming that they are, and there are only so many hours in the day to do it. That's also why the guard against unsupported features can be lifted by the user pretty easily.

To be fair, though, most of this conversation from my perspective is about the stability of the file system itself, and that's not the whole experience. We still need to focus on things like whether or not snapper is too aggressive in saving snapshots, and I believe there's been work in that area recently in response to user complaints. It's about the whole picture, and I don't think two months to release is too late to have that conversation.

-Jeff
--
Jeff Mahoney
SUSE Labs
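The online scrubbing idea mentioned above - checksums recorded at write time, then verified proactively so corruption is caught before a read trips over it - can be illustrated with an ordinary userspace sketch. This is an analogy only: real btrfs scrubbing runs inside the kernel against per-block checksums, not manifest files.

```shell
# Record checksums when data is written, then re-verify the whole set later
# to detect silent corruption before any consumer actually reads the file.
data=$(mktemp -d)
echo "important bits" > "$data/file1"
( cd "$data" && sha256sum file1 > MANIFEST )   # "write time" checksums
printf 'bitrot' >> "$data/file1"               # simulate silent corruption
( cd "$data" && sha256sum -c MANIFEST 2>/dev/null ) \
  || echo "scrub: corruption detected proactively"
```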
On Wed, 04 Sep 2013 12:04:58 -0400, Jeff Mahoney wrote:
On 9/3/13 8:36 PM, Jim Henderson wrote:
On Tue, 03 Sep 2013 19:59:06 -0400, Jeff Mahoney wrote:
Well that's the main thrust behind the "allow unsupported" module option. We have the feature set that we've evaluated to be mature and that's what we allow by default.
That seems a little counterintuitive to me. Allowing unsupported features would seem to indicate those features are immature, rather than mature. Am I missing something?
Yes, I'm proposing the opposite of that. The "allow unsupported" option would be disabled by default and is the "guard" in front of those immature features.
OK, that makes sense, thanks for the clarification.
My point was more that, for the most part, those users who've actually been using btrfs weren't the ones chiming in and claiming that it's not trustworthy yet. It's the ones who're nervous about trying it at all and are being overly conservative to the point of derailing the conversation without data to back their apprehension.
It's difficult to have hard data when you're apprehensive about running it because you've heard that there are still problems with it, or that there's a lack of effective/mature tools for fixing problems - even bearing in mind that support venues don't tend to get people saying "everything's just fine, no problems here at all".

I don't think it's "overly conservative", though, to be cautious about risking your data. I think it's up to those who believe it's stable to demonstrate that it is, and to assure the users that this is a safe filesystem to use. If it is, then of course I'd want to use it. But I don't want to take a bigger risk so there can be more data gathered about unrecoverable problems. Does that make sense?
OSS is all about transparency, so let's hear a little more about how btrfs has improved in the past 12-18 months.
Absolutely. That's a completely reasonable request.
Thank you. :)
The areas in which btrfs has historically been weak include:
- error handling - This is an ongoing effort, mostly completed in v3.4, where certain classes of errors are handled by taking the file system "offline" (read-only).
What are the next steps after taking the filesystem offline in this instance?
- ENOSPC - Case 1: incorrect calculation of reservation size - This is the case which Lew Wolfgang encountered in the bug report he mentioned. These cases /should/ be mostly fixed. I haven't seen one in a while. Where "a while" is defined in terms of kernel release versions, not linear clock time. - Case 2: unable to free space on a full file system - This is probably the most infuriating case. The gist is that in a CoW file system, blocks may need to be allocated in order to free other blocks. If we get into a pathological situation where all of the blocks are in use, then we essentially encounter a deadlocked file system where the shared resource is the free block count. This has been fixed with the introduction of a reserved metadata block pool that can only be used for removal operations when ENOSPC has already been encountered. I fixed a case of this last month so that subvolume removal should succeed when the file system is full. I believe these to be pretty much eliminated now. If I'm wrong about that, the good news is that since we already have the reserved metadata block pool implemented, the fix is only about 5 lines as a fallback case for an ENOSPC handler.
Sounds like some good progress here.
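The reserved-pool fix quoted above can be illustrated with a toy model. Shell arithmetic only - the pool sizes and the blocks-returned figure are made up, and none of this is btrfs code; it just shows why freeing space can itself need an allocation, and how a removal-only reserve breaks the deadlock.

```shell
# Toy CoW allocator: freeing space consumes a metadata block, so a completely
# full pool would deadlock without a reserve that only removals may draw from.
main=3; reserve=2
alloc() { if [ "$main" -gt 0 ]; then main=$((main - 1)); echo ok; else echo ENOSPC; fi; }
remove() {
  if [ "$main" -gt 0 ]; then main=$((main - 1))             # normal path
  elif [ "$reserve" -gt 0 ]; then reserve=$((reserve - 1))  # ENOSPC fallback
  else echo deadlock; return; fi
  main=$((main + 2))   # freed blocks go back to the main pool
  echo freed
}
alloc; alloc; alloc; alloc   # the fourth allocation hits ENOSPC
remove                       # ...but removal still succeeds via the reserve
alloc                        # and the pool is usable again
```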
- btrfsck - In truth, not as powerful as it should be yet. - Realistically, there's only so much that can be done without encountering novel broken file systems and collecting metadata images for analysis. There's a chicken/egg problem here in that there's no way to create a fsck tool that is prepared to encounter actual failures cases without seeing the failure cases. Sure, we can write up a tool that can predict certain cases, but there will *always* be surprises. e2fsck is still evolving as well. - The difference between btrfs and ext[234] is that we can 'scrub' the file system online to detect errors before they're actually encountered.
Sounds like some room for work here, but I understand what you're saying about predicting the unpredictable, too. Since this is a SUSE effort, have you looked at how you might test this in, say, superlab in Provo? Getting time on the schedule might be difficult (I don't know how busy they are these days), but if you could push out an image to 100-200 machines and run an automated test suite to read/write data and try to stress the filesystems on a relatively large number of machines, that might give you some testing that doesn't involve risking real users' data.
From where I sit, the expectation isn't to eliminate the possibility of errors - but to look for something that's maybe one step better than "good enough".
The scrubbing capability sounds interesting.
- VM image performance - Performance is generally regarded as horrible. - This is because CoW on what is essentially a block device backing store means a ton of write amplification for each write that the VM issues. - The file system supports a 'nodatacow' file attribute: chattr -C <file>. This attribute changes the CoW behavior of file writes such that the write only causes a CoW to be performed only if there is more than one reference held on the data extent. - Caveat: There is currently a strange corner case where nodatacow prevents a reflink copy but allows a snapshot of the subvolume to make a snapshot of the file. (They're essentially the same thing on the back end.) - Solution(s): 1) Remove the distinction between the reflink copy and the snapshot cloning. 2) Always handle CoW on data extents as overwrites when there is only a single reference on the extent. - 1) is probably a bug fix, while 2) may meet with resistance within the file system development community since the CoW behavior also ensures that no parts of the data are overwritten and we already have a way to do this with chattr -C.
Would it make a difference if one used a preallocated disk image rather than a dynamic image?
New features: - Deduplication - SUSE's Mark Fasheh has added an extension to the clone ioctl that allows us to do an 'offline' (read: out-of-band) deduplication of data extents. - "Offline" doesn't mean unmounted - it means that the user makes use of an external tool that implements the deduplication policy. - Not "perfect" dedupe like "online" (in-band) implementations, but without the I/O amplification behavior that online dedupe has. I'm happy to discuss this further if there's interest in hearing more.
That sounds like a cool feature. Has anyone played with this on, say, truecrypt encrypted devices as yet? (I have a very large truecrypt encrypted volume that I know has some duplication of data on it, and scripting to remove the duplicates, while not difficult, is something I haven't taken the time to do yet.)
- Removal of the strange per-directory hard link limit
  - Because all of the backreferences to a single inode had to fit in a single file system block, there was a limit on the number of hard links in a single directory. It could be quite low.
  - The limit was removed by adding a new extended inode ref item. It's not enabled by default yet since it's a disk format change, and the extended inode ref is only used when required since it's not as space-efficient as the regular inode ref item.
  - There's probably room for discussion within the file system community on whether we'd want to add an "ok to change" bit so that a file system can use the new extended inode ref items when needed but doesn't set the incompat bit until they're actually used. The other side of that coin is that it may not be clear to users when/if their file system has become incompatible with older kernels.
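The workload that used to trip the limit looks like the loop below. The old limit depended on name length (all the backrefs had to share one block), so long link names hit it fastest; the count here is kept small, so this succeeds on any modern filesystem:

```shell
mkdir -p linkdemo && cd linkdemo
touch target
# Many hard links to one inode in a single directory; on pre-3.7 btrfs with
# long link names a loop like this could fail with EMLINK surprisingly early.
for i in $(seq 1 100); do
    ln target "a_fairly_long_hard_link_name_number_$i"
done
ls | wc -l    # 101 entries: the target plus 100 links
cd .. && rm -rf linkdemo
```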
Most of that is over my head - what's the bottom line/impact on this?
- In-place conversion of reiserfs filesystems
  - Similar to the ext[234] converter - converts a reiserfs filesystem to btrfs using the free space in the reiserfs filesystem.
  - See home:jeff_mahoney:convert for code and packages (still beta).
Nice. I can see that being useful for those who have upgraded through several versions.
Areas that still need work: - Error handling - Not in the handling failure cases sense, but in the fsfuzzer sense.
- btrfsck - As I mentioned, we need broken file systems to fix in order to improve the tool.
- General performance
  - For a root file system with general user activity, it performs reasonably well. I've asked one of my team to come up with solid performance numbers so that we can 1) demonstrate where the file system performs relative to the usual suspects, and 2) identify where we need to focus our efforts.
  - Historically, fsync() was a problem spot, but that's been mitigated by the introduction of a "tree log" that is similar to a journal but is really just used to accelerate fsync.
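A crude way to see the cost that fsync-heavy workloads pay, and hence why the tree log matters: dd's oflag=sync forces a flush on every write. Numbers vary wildly by hardware and filesystem, so this is only a sketch, not a benchmark:

```shell
# Buffered writes vs. a flush after every write (throughput line only):
dd if=/dev/zero of=plain.dat  bs=4k count=256 2>&1 | tail -n1
dd if=/dev/zero of=synced.dat bs=4k count=256 oflag=sync 2>&1 | tail -n1
rm -f plain.dat synced.dat
```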
Some general performance numbers would be good to see - as well as performance on large files/small files.
FWIW, I've had a 4 TB btrfs file system with multiple subvolumes running for several years now. It sits on top of a 3-disk MDRAID5 volume. I've never encountered a file system corruption issue with it, though I have seen a few crashes. The last one was well over a year ago. There have been a few power outages as well without ill result. This is my "production" file system as far as that goes for me. It's not high throughput, but it does serve as my "everything" volume, holding the local copies of my git trees for work, music, and videos, and hosting Time Machine backups for my wife, etc.
It's good to hear success stories. I'm curious - do you back this up, or is most of the data available elsewhere in the event of an unrecoverable issue? (Of course, "unrecoverable" for you is probably different than it would be for me, since you know the filesystem well enough to manually work on it if necessary).
I know people don't really want to compare SLES with openSUSE, but here's a case in which the story matters. We've been offering official support for btrfs since SLE11 SP2. SP3 was released a few months ago. Many people thought we were insane to do so because OMG BTRFS IS STILL EXPERIMENTAL, but we've crafted a file system implementation that *is* supportable. Between limiting the feature set for which we offer support and our kernel teams aggressively identifying and backporting fixes that may not have been pulled into the mainline kernel yet (more a factor of the maintainers being busy than the patches not being fully baked), we've created a pretty solid file system implementation.
Given that the work for btrfs in SLE and openSUSE is being handled largely by the same people, I think it makes sense to make the comparison. SLE doesn't yet default to btrfs, though, does it?
_THAT_ is why I do things like suggest that we have a similar "supported feature set" for openSUSE. It's not about limiting choice, though I suppose that's a side effect. It's about making it clear which parts of the file system are mature enough to be trusted and not just assuming that paid enterprise users are the only ones who care about things like that.
That's sensible.
Someone asked in another thread about who gets to determine whether or not a feature is mature enough. I'll be honest here. I lead the SUSE Labs Storage and File Systems team. We do, with significant informed feedback from our users. We perform testing on the file system and cooperate with our QA staff to perform more testing on it. If we encounter bugs, we fix them. If we perform a ton of testing and don't encounter bugs, we start to lean toward the "mature" phase. We depend on experienced users who don't mind playing with untested tech to file bug reports. If being in a Linux support environment for the past 14 years has taught me anything, it's that users of software are much more creative about breaking things than the developers are. [This was also true when I was on the other side of the fence in my previous life as a big-UNIX sysadmin. Real support response when encountering a pretty funny error code on a large disk array's RAID controller: "You saw what? You should never see that."]
Heh, yeah, I've had plenty of circumstances myself where I've seen weirdness never before encountered in the lab (from both sides of the conversation) - users do come up with very creative ways to break things.
All that said, it's possible that some of the things listed in the "unsupported" list work fine and could be considered mature already. That's a matter of testing and confirming that they are and there are only so many hours in the day to do it. That's also why the guard against unsupported features can be lifted by the user pretty easily.
To be fair, though, most of this conversation from my perspective is about the stability of the file system itself and that's not the whole experience. We still need to focus on things like whether or not snapper is too aggressive in saving snapshots, and I believe there's been work in that area recently in response to user complaints. It's about the whole picture and I don't think two months to release is too late to have that conversation.
It's good to have the conversation - thank you for the detailed explanation of things. That really helps put my mind at ease that this isn't (as we see from time to time) a "throw it over the wall and see what breaks" approach. It sounds like you've really done your homework and stand behind what you and your team have done to make btrfs production quality. While I still have reservations (and probably will until it reaches some sort of critical mass), my concerns are largely addressed. Jim -- Jim Henderson Please keep on-topic replies on the list so everyone benefits -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On 09/04/2013 11:22 AM, Jim Henderson wrote:
On Wed, 04 Sep 2013 12:04:58 -0400, Jeff Mahoney wrote:
On 9/3/13 8:36 PM, Jim Henderson wrote:
On Tue, 03 Sep 2013 19:59:06 -0400, Jeff Mahoney wrote:
Well that's the main thrust behind the "allow unsupported" module option. We have the feature set that we've evaluated to be mature and that's what we allow by default.
That seems a little counterintuitive to me. Allowing unsupported features would seem to indicate those features are immature, rather than mature. Am I missing something?
Yes, I'm proposing the opposite of that. The "allow unsupported" option would be disabled by default and is the "guard" in front of those immature features.
OK, that makes sense, thanks for the clarification.
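For anyone who does want the guarded features, the module option described above could be set persistently with a modprobe.d drop-in. A sketch, assuming the option lands with the name 'allow_unsupported' as proposed in this thread (the file name is arbitrary):

```text
# /etc/modprobe.d/10-btrfs-unsupported.conf  (example file name)
# Opt in to the not-yet-supported btrfs features discussed in this thread:
options btrfs allow_unsupported=1
```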
My point was more that, for the most part, those users who've actually been using btrfs weren't the ones chiming in and claiming that it's not trustworthy yet. It's the ones who're nervous about trying it at all and are being overly conservative to the point of derailing the conversation without data to back their apprehension.
It's difficult to have hard data when you're apprehensive about running it because you've heard that there are still problems with it, or that there's a lack of effective/mature tools for fixing problems.
Even bearing in mind that support venues don't tend to get people saying "everything's just fine, no problems here at all".
I don't think it's "overly conservative" though to be cautious about risking your data. I think it's up to those who believe it's stable to demonstrate that it is, and to assure the users that this is a safe filesystem to use.
If it is, then of course, I'd want to use it. But I don't want to take a bigger risk so there can be more data gathered about unrecoverable problems. Does that make sense?
OSS is all about transparency, so let's hear a little more about how btrfs has improved in the past 12-18 months.
Absolutely. That's a completely reasonable request.
Thank you. :)
The areas in which btrfs has historically been weak include:
- error handling - This is an ongoing effort, mostly completed in v3.4, where certain classes of errors are handled by taking the file system "offline" (read-only).
What are the next steps after taking the filesystem offline in this instance?
- ENOSPC
  - Case 1: incorrect calculation of reservation size. This is the case which Lew Wolfgang encountered in the bug report he mentioned. These cases /should/ be mostly fixed; I haven't seen one in a while, where "a while" is defined in terms of kernel release versions, not linear clock time.
  - Case 2: unable to free space on a full file system. This is probably the most infuriating case. The gist is that in a CoW file system, blocks may need to be allocated in order to free other blocks. If we get into a pathological situation where all of the blocks are in use, then we essentially have a deadlocked file system where the shared resource is the free block count. This has been fixed with the introduction of a reserved metadata block pool that can only be used for removal operations once ENOSPC has already been encountered. I fixed a case of this last month so that subvolume removal succeeds when the file system is full. I believe these cases to be pretty much eliminated now. If I'm wrong about that, the good news is that since we already have the reserved metadata block pool implemented, the fix is only about five lines as a fallback case for the ENOSPC handler.
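Not part of the btrfs internals above, but handy for anyone who wants to poke at ENOSPC handling in their own tools: the kernel's /dev/full device fails every write with ENOSPC, so full-disk error paths can be exercised without actually filling a filesystem:

```shell
# Every write to /dev/full fails with ENOSPC:
dd if=/dev/zero of=/dev/full bs=4k count=1 2> enospc.log || true
grep "No space left" enospc.log    # dd reports the ENOSPC error text
rm -f enospc.log
```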
Sounds like some good progress here.
- btrfsck
  - In truth, not as powerful as it should be yet.
  - Realistically, there's only so much that can be done without encountering novel broken file systems and collecting metadata images for analysis. There's a chicken-and-egg problem here: there's no way to create an fsck tool that is prepared for actual failure cases without seeing the failure cases. Sure, we can write up a tool that can predict certain cases, but there will *always* be surprises. e2fsck is still evolving as well.
  - The difference between btrfs and ext[234] is that we can 'scrub' the file system online to detect errors before they're actually encountered.
Oddly enough, I just went through this with an aborted attempt to move to SLED. In the attempt, I found that my existing ext4 filesystems could not be mounted RW, and I spent a day chasing down how to get them RW so a btrfs conversion could proceed. It's not the "change the default" that was so objectionable; it was the idea that my system was broken by someone's decision not to give me the choice. This is the same problem that seems to be percolating through certain other not-to-be-named system decisions. Change the default... fine. Allow me to choose what I will use, or fall through to the new default. But DO NOT take away the choice to make use of something other than your choice. Forcing people to hunt for a solution ("let's make the other paths difficult and ours easy") doesn't count as allowing choice either.
On 9/4/13 2:43 PM, Bruce Ferrell wrote:
Oddly enough, I just went through this with an aborted attempt to move to SLED.
In the attempt, I found that my existing ext4 filesystems could not be mounted RW and I spent a day chasing down how to get them RW so a btrfs conversion could proceed.
I'm not sure how these are connected. The in-place conversion doesn't need a read-write ext4 kernel implementation to complete.
It's not the "change the default" that was so objectionable, it was the idea that my system was broken by someone decision to not give me the choice.
A read-only ext4 for migration purposes is well documented for SLE11. Forcing it read-only was a reaction to users thinking that because they could mount it read-write, we'd support it. Yes, it's annoying for users who want to do it anyway, but then you're willfully putting your system out of a supported state. With SP3, we've made it easier to ignore the support status of a read-write ext4 by adding a "rw=1" module option to the ext4 module that we ship with the official kernel release, rather than having an unsupported ext4-writeable KMP. This is documented in the release notes.
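For completeness, the "rw=1" option described above would typically be made persistent via modprobe.d as well. A sketch based on the option name given in this message (the file name is illustrative; the SLE11 SP3 release notes are the authoritative reference):

```text
# /etc/modprobe.d/ext4-writeable.conf  (example file name)
# Allow read-write ext4 mounts despite the unsupported status:
options ext4 rw=1
```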
This is the same problem that seems to percolating through certain other not to be named system decisions.
Change the default... Fine. Allow me to choose what I will you or fall though to the new default. But DO NOT take away the choice to make use of something other than your choice. Forcing people to hunt for a solution ("let's make the other paths difficult and ours easy") doesn't count as allowing choice either.
There have been comments on this in other places in this thread. The goal is to make it optional in YaST, and automatic when features that require the unsupported set are already in use. -Jeff -- Jeff Mahoney SUSE Labs
On 9/4/13 2:22 PM, Jim Henderson wrote:
On Wed, 04 Sep 2013 12:04:58 -0400, Jeff Mahoney wrote:
On 9/3/13 8:36 PM, Jim Henderson wrote:
On Tue, 03 Sep 2013 19:59:06 -0400, Jeff Mahoney wrote:
My point was more that, for the most part, those users who've actually been using btrfs weren't the ones chiming in and claiming that it's not trustworthy yet. It's the ones who're nervous about trying it at all and are being overly conservative to the point of derailing the conversation without data to back their apprehension.
It's difficult to have hard data when you're apprehensive about running it because you've heard that there are still problems with it, or that there's a lack of effective/mature tools for fixing problems.
Even bearing in mind that support venues don't tend to get people saying "everything's just fine, no problems here at all".
I don't think it's "overly conservative" though to be cautious about risking your data. I think it's up to those who believe it's stable to demonstrate that it is, and to assure the users that this is a safe filesystem to use.
If it is, then of course, I'd want to use it. But I don't want to take a bigger risk so there can be more data gathered about unrecoverable problems. Does that make sense?
I agree that it's not overly conservative to want to preserve your data. That's the baseline of what you expect from a file system. What I do object to is people saying "no" without having an actual reason for saying so other than being worried about it. Worry is fine, but depending on worries as a data set is problematic.
The areas in which btrfs has historically been weak include:
- error handling - This is an ongoing effort, mostly completed in v3.4, where certain classes of errors are handled by taking the file system "offline" (read-only).
What are the next steps after taking the filesystem offline in this instance?
Like every file system, it depends on the error case. This is the same reaction to errors that xfs, ext3, ext4, reiserfs, etc. have when they don't want to risk corrupting your data. If it's a disk failure or media access error, that needs to be corrected. If it's corruption, that also needs to be corrected. If it's an ENOMEM, that gets more difficult; avoiding it is an area in which I want to invest more effort, but it's low priority right now. For the most part, though, the allocation requests are small, and I've never seen an ENOMEM in the middle of btrfs actually happen. In most cases, if the error can be corrected without a reboot, a simple umount / <fix> / remount cycle will be enough.
- btrfsck
  - In truth, not as powerful as it should be yet.
  - Realistically, there's only so much that can be done without encountering novel broken file systems and collecting metadata images for analysis. There's a chicken-and-egg problem here: there's no way to create an fsck tool that is prepared for actual failure cases without seeing the failure cases. Sure, we can write up a tool that can predict certain cases, but there will *always* be surprises. e2fsck is still evolving as well.
  - The difference between btrfs and ext[234] is that we can 'scrub' the file system online to detect errors before they're actually encountered.
Sounds like some room for work here, but I understand what you're saying about predicting the unpredictable, too. Since this is a SUSE effort, have you looked at how you might test this in, say, superlab in Provo? Getting time on the schedule might be difficult (I don't know how busy they are these days), but if you could push out an image to 100-200 machines and run an automated test suite to read/write data and try to stress the filesystems on a relatively large number of machines, that might give you some testing that doesn't involve risking real users' data.
Yeah, we definitely plan on doing wider stress tests in the coming months.
From where I sit, the expectation isn't to eliminate the possibility of errors - but to look for something that's maybe one step better than "good enough".
Agreed. This is an area where we plan to invest more effort.
The scrubbing capability sounds interesting.
- VM image performance
  - Performance is generally regarded as horrible. This is because CoW on what is essentially a block-device backing store means a ton of write amplification for every write the VM issues.
  - The file system supports a 'nodatacow' file attribute: chattr +C <file>. This attribute changes the CoW behavior of file writes so that a write causes a CoW only if more than one reference is held on the data extent.
  - Caveat: There is currently a strange corner case where nodatacow prevents a reflink copy but allows a snapshot of the subvolume to make a snapshot of the file. (They're essentially the same thing on the back end.)
  - Solution(s): 1) Remove the distinction between the reflink copy and the snapshot cloning. 2) Always handle CoW on data extents as overwrites when there is only a single reference on the extent.
  - 1) is probably a bug fix, while 2) may meet with resistance within the file system development community, since the CoW behavior also ensures that no parts of the data are overwritten and we already have a way to do this with chattr +C.
Would it make a difference if one used a preallocated disk image rather than a dynamic image?
No. The issue isn't the initial allocation, it's the CoW for writes into the image file.
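To make the "preallocated vs dynamic" distinction concrete (tiny sizes used here; as noted above, neither style changes the CoW-on-overwrite behavior that causes the slowdown):

```shell
truncate -s 8M sparse.img                                  # "dynamic": apparent size only
dd if=/dev/zero of=prealloc.img bs=1M count=8 2>/dev/null  # fully written up front
du -k sparse.img prealloc.img   # blocks actually used: ~0 vs ~8192
rm -f sparse.img prealloc.img
```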
New features: - Deduplication - SUSE's Mark Fasheh has added an extension to the clone ioctl that allows us to do an 'offline' (read: out-of-band) deduplication of data extents. - "Offline" doesn't mean unmounted - it means that the user makes use of an external tool that implements the deduplication policy. - Not "perfect" dedupe like "online" (in-band) implementations, but without the I/O amplification behavior that online dedupe has. I'm happy to discuss this further if there's interest in hearing more.
That sounds like a cool feature. Has anyone played with this on, say, truecrypt encrypted devices as yet? (I have a very large truecrypt encrypted volume that I know has some duplication of data on it, and scripting to remove the duplicates, while not difficult, is something I haven't taken the time to do yet.)
Not specifically, AFAIK.
- Removal of the strange per-directory hard link limit
  - Because all of the backreferences to a single inode had to fit in a single file system block, there was a limit on the number of hard links in a single directory. It could be quite low.
  - The limit was removed by adding a new extended inode ref item. It's not enabled by default yet since it's a disk format change, and the extended inode ref is only used when required since it's not as space-efficient as the regular inode ref item.
  - There's probably room for discussion within the file system community on whether we'd want to add an "ok to change" bit so that a file system can use the new extended inode ref items when needed but doesn't set the incompat bit until they're actually used. The other side of that coin is that it may not be clear to users when/if their file system has become incompatible with older kernels.
Most of that is over my head - what's the bottom line/impact on this?
The bottom line is that without this enabled, things that use a lot of hard links in a single directory can run into EMLINK. You can fix the issue by enabling the extended inode ref feature, but it means that you can no longer mount the file system on older kernels. "Older" in this case means prior to 3.7 IIRC, so before openSUSE 12.3.
Areas that still need work: - Error handling - Not in the handling failure cases sense, but in the fsfuzzer sense.
- btrfsck - As I mentioned, we need broken file systems to fix in order to improve the tool.
- General performance - For a root file system with general user activity, it performs reasonably well. I've asked one of my team to come up with solid performance numbers so that we can 1) demonstrate where the file system is performing relative to the usual suspects, and 2) identify where we need to focus our efforts. - Historically, fsync() was a problem spot but that's been mitigated with the introduction of a "tree log" that is similar to a journal but is really just used to accelerate fsync.
Some general performance numbers would be good to see - as well as performance on large files/small files.
Once we have something we can publish, I'll be happy to share them. The baseline off-the-cuff performance shows that performance is similar to ext3 for some workloads, and way off in others, specifically those that are unlink-heavy.
FWIW, I've had a 4 TB btrfs file system with multiple subvolumes running for several years now. It sits on top of a 3 disk MDRAID5 volume. I've never encountered a file system corruption issue with it, though I have seen a few crashes. The last one was well over a year ago. There have been a few power outages as well without ill result. This is my "production" file system as far as that goes for me. It's not high throughput but it does serve as my "everything" volume, serving the local copies of my git trees for work, music, videos, hosts time machine backups for my wife, etc.
It's good to hear success stories. I'm curious - do you back this up, or is most of the data available elsewhere in the event of an unrecoverable issue? (Of course, "unrecoverable" for you is probably different than it would be for me, since you know the filesystem well enough to manually work on it if necessary).
I don't have backups for anything but time machine bundles, my mail mirror, and photos. The music and videos can be reproduced in a time-consuming manner. Mostly it's a matter of the price of backup space being more expensive than I want to spend. But, yeah, I do have the luxury of knowing where to start to fix it manually if I must.
I know people don't really want to compare SLES with openSUSE, but here's a case in which the story matters. We've been offering official support for btrfs since SLE11 SP2. SP3 was released a few months ago. Many people thought we were insane to do so because OMG BTRFS IS STILL EXPERIMENTAL, but we've crafted a file system implementation that *is* supportable. Between limiting the feature set for which we offer support and our kernel teams aggressively identifying and backporting fixes that may not have been pulled into the mainline kernel yet (more a factor of the maintainers being busy than the patches not being fully baked), we've created a pretty solid file system implementation.
Given that the work for btrfs in SLE and openSUSE is being handled largely by the same people, I think it makes sense to make the comparison.
SLE doesn't yet default to btrfs, though, does it?
SLE11 defaults to ext3 and we don't change the default in a service pack. I can't comment on what the default in SLE12 will be. I'll refer questions about that to our product manager for SLES, Matthias Eckermann. It should be apparent that SUSE is invested in the success of btrfs, though.
_THAT_ is why I do things like suggest that we have a similar "supported feature set" for openSUSE. It's not about limiting choice, though I suppose that's a side effect. It's about making it clear which parts of the file system are mature enough to be trusted and not just assuming that paid enterprise users are the only ones who care about things like that.
That's sensible.
All that said, it's possible that some of the things listed in the "unsupported" list work fine and could be considered mature already. That's a matter of testing to confirm it, and there are only so many hours in the day to do it. That's also why the guard against unsupported features can be lifted by the user pretty easily.
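For reference, lifting the guard would look something like this (the file name is illustrative; the option name is the one mentioned earlier in the thread):

```
# /etc/modprobe.d/99-btrfs-unsupported.conf
options btrfs allow_unsupported=1
```

The same module parameter can also be passed on the kernel command line in the usual `btrfs.allow_unsupported=1` form.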
To be fair, though, most of this conversation from my perspective is about the stability of the file system itself and that's not the whole experience. We still need to focus on things like whether or not snapper is too aggressive in saving snapshots, and I believe there's been work in that area recently in response to user complaints. It's about the whole picture and I don't think two months to release is too late to have that conversation.
It's good to have the conversation - thank you for the detailed explanation of things. That really helps put my mind at ease that this isn't (as we see from time to time) a "throw it over the wall and see what breaks" approach. It sounds like you've really done your homework and stand behind what you and your team have done to make btrfs production quality. While I still have reservations (and probably will until it reaches some sort of critical mass), my concerns are largely addressed.
Exactly. This is a file system which we've seen deployed with SLE11 SP2/3 and with which we've seen pretty good results. It's also something that we put significant effort into improving even after the initial release of the service pack. -Jeff -- Jeff Mahoney SUSE Labs
Hello all, On 2013-09-04 T 19:48 -0400 Jeff Mahoney wrote:
On 9/4/13 2:22 PM, Jim Henderson wrote:
On Wed, 04 Sep 2013 12:04:58 -0400, Jeff Mahoney wrote:
I know people don't really want to compare SLES with openSUSE, but here's a case in which the story matters. We've been offering official support for btrfs since SLE11 SP2. SP3 was released a few months ago. Many people thought we were insane to do so because OMG BTRFS IS STILL EXPERIMENTAL, but we've crafted a file system implementation that *is* supportable. Between limiting the feature set for which we offer support and our kernel teams aggressively identifying and backporting fixes that may not have been pulled into the mainline kernel yet (more a factor of the maintainers being busy than the patches not being fully baked), we've created a pretty solid file system implementation.
Given that the work for btrfs in SLE and openSUSE is being handled largely by the same people, I think it makes sense to make the comparison.
SLE doesn't yet default to btrfs, though, does it?
SLE11 defaults to ext3 and we don't change the default in a service pack. I can't comment on what the default in SLE12 will be. I'll refer questions about that to our product manager for SLES, Matthias Eckermann. It should be apparent that SUSE is invested in the success of btrfs, though.
Indeed, the decision to focus on btrfs was made more than three years ago. The primary driver for this was the Copy on Write functionality, and the benefits you can get out of it for the operating system: snapshots for package installation and administrative changes. Many years ago I visited a datacenter in the financial sector in Germany, and they had implemented a mechanism to prevent "surprises" due to operating system updates. Basically they had multiple LVM LVs for "/" and "/boot" and rsync-ed the currently active volumes before updating the kernel and other critical parts, so they always had a "well known state". I know that many people and companies have implemented this in one way or another. With btrfs, snapshots, zypper integration, and the ability to boot off btrfs snapshots (not yet, but soon, hopefully), this will be available for everybody, in a consistent and integrated way -- and without extra effort on the user's side. Accordingly, the plan is indeed to make btrfs the default filesystem for the operating system in SUSE Linux Enterprise 12. <advertisement> A discussion about this will also be part of my presentation at this year's LinuxCon/US, see: http://linuxconcloudopenna2013.sched.org/event/0e707c607eb3fd1cb06664517724e... </advertisement> so long - MgE, proudly using btrfs as root and home. -- Matthias G. Eckermann Senior Product Manager SUSE® Linux Enterprise SUSE LINUX Products GmbH Maxfeldstraße 5 90409 Nürnberg Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)
Hi Jeff,
On September 5, 2013 at 1:48 AM Jeff Mahoney <jeffm@suse.com> wrote: [...] What I do object to is people saying "no" without having an actual reason for saying so other than being worried about it.
You know, file systems have always been almost a religious thing, and mankind behaves like a flock of sheep: if one individual comes up with something new, it must be (much) better than what the masses are using before they will move from their standpoint. They would even accept known shortcomings of the current situation rather than head into uncharted waters. I mean, when someone is happy with EXT3/4, XFS or whatever in their current installation, why should they change to something new? With something new, they will always feel a bit uncomfortable. Humans are creatures of habit. Thus, you'll have to convince them that BTRFS is faster, more solid, better maintainable, ... or whatsoever; in short - and in analogy to the sheep above - why BTRFS would be better than EXT4 as the default. The above is not intended to judge pro / con BTRFS, BTW. Have a nice day, Berny -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On Wed, 04 Sep 2013 19:48:52 -0400, Jeff Mahoney wrote:
I don't think it's "overly conservative" though to be cautious about risking your data. I think it's up to those who believe it's stable to demonstrate that it is, and to assure the users that this is a safe filesystem to use.
If it is, then of course, I'd want to use it. But I don't want to take a bigger risk so there can be more data gathered about unrecoverable problems. Does that make sense?
I agree that it's not overly conservative to want to preserve your data. That's the baseline of what you expect from a file system. What I do object to is people saying "no" without having an actual reason for saying so other than being worried about it. Worry is fine, but depending on worries as a data set is problematic.
That's fair, but not everyone has the depth of knowledge of btrfs that you and the other developers do. So the actual reasons tend to be "I heard that" rather than personal experience. That's why a discussion like this is important to have - it gives you the chance to explain why the perceptions are incorrect now, and a chance for those who aren't into the guts of filesystem development to understand how things are progressing. I've been around online support communities long enough to know that (as I said earlier) we generally don't get reports of "everything's fine", so the data tends to be a bit skewed and highly anecdotal. Not everyone has that view on online support, though - as evidenced by the number of people who come in and find a thread with their specific problem and decry the testers (not realizing they /are/ part of the testing team in a way) for not finding their specific problem before they did.
What are the next steps after taking the filesystem offline in this instance?
Like every file system, it depends on the error case. This is the same reaction to errors that xfs, ext3, ext4, reiserfs, etc. have when they don't want to risk corrupting your data. If it's a disk failure or media access error, that needs to be corrected. If it's corruption, that also needs to be corrected. If it's an ENOMEM, that gets more difficult; that's an area where I want to invest more effort in avoiding it, but it's low priority right now. For the most part, though, the allocation requests are small. I've never seen an ENOMEM in the middle of btrfs actually happen.
In most cases, if the error can be corrected without a reboot, a simple umount <fix> remount cycle will be enough.
Makes sense, thanks. :)
Yeah, we definitely plan on doing wider stress tests in the coming months.
Great. :)
From where I sit, the expectation isn't to eliminate the possibility of errors - but to look for something that's maybe one step better than "good enough".
Agreed. This is an area where we plan to invest more effort.
That's also good to hear.
Would it make a difference if one used a preallocated disk image rather than a dynamic image?
No. The issue isn't the initial allocation, it's the CoW for writes into the image file.
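The usual workaround, for what it's worth, is to opt the image directory out of CoW with the NOCOW file attribute - a sketch with illustrative paths; note that `chattr +C` only affects files created after the attribute is set, and that it also disables data checksumming for those files:

```shell
# Create the images directory and mark it NOCOW *before* any images exist;
# new files created inside it inherit the attribute.
mkdir -p images
chattr +C images 2>/dev/null || echo "not on btrfs, +C skipped"

# Files created afterwards are overwritten in place instead of CoW'd.
truncate -s 1G images/disk0.raw
lsattr -d images 2>/dev/null || true
```

The trade-off is deliberate: you give up snapshots-with-checksums for that directory in exchange for overwrite-in-place behavior the VM workload wants.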
Ah, I think I understand now.
That sounds like a cool feature. Has anyone played with this on, say, truecrypt encrypted devices as yet? (I have a very large truecrypt encrypted volume that I know has some duplication of data on it, and scripting to remove the duplicates, while not difficult, is something I haven't taken the time to do yet.)
Not specifically, AFAIK.
OK, maybe if I have some time in the coming weeks, I can create a small volume and try formatting it with btrfs to see if it works (the nice thing about truecrypt is the file-based container option for doing tests like this).
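A plain file-backed volume works the same way without TrueCrypt in the picture - a sketch; the mkfs and loop-mount steps need root and btrfs-progs, so they're left as comments here:

```shell
# Sparse backing file; a few hundred MB is roughly the practical
# minimum for a btrfs test volume.
truncate -s 512M btrfs-test.img

# These two steps need root and btrfs-progs:
#   mkfs.btrfs btrfs-test.img
#   mount -o loop btrfs-test.img /mnt/test
```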
Most of that is over my head - what's the bottom line/impact on this?
The bottom line is that without this enabled, things that use a lot of hard links in a single directory can run into EMLINK. You can fix the issue by enabling the extended inode ref feature, but it means that you can no longer mount the file system on older kernels. "Older" in this case means prior to 3.7 IIRC, so oS 12.3.
Ah, OK.
Some general performance numbers would be good to see - as well as performance on large files/small files.
Once we have something we can publish, I'll be happy to share them.
That sounds good.
The baseline off-the-cuff performance shows that performance is similar to ext3 for some workloads, and way off in others, specifically those that are unlink-heavy.
Interesting - is that because of the snapshots, or something else (I honestly haven't looked too deeply at the snapshot functionality, so may need to set up a test system to play with this on a bit).
It's good to hear success stories. I'm curious - do you back this up, or is most of the data available elsewhere in the event of an unrecoverable issue? (Of course, "unrecoverable" for you is probably different than it would be for me, since you know the filesystem well enough to manually work on it if necessary).
I don't have backups for anything but time machine bundles, my mail mirror, and photos. The music and videos can be reproduced in a time-consuming manner. Mostly it's a matter of the price of backup space being more expensive than I want to spend. But, yeah, I do have the luxury of knowing where to start to fix it manually if I must.
Similar situation to what I'm in with my 2 TB external drive, it sounds like.
Given that the work for btrfs in SLE and openSUSE is being handled largely by the same people, I think it makes sense to make the comparison.
SLE doesn't yet default to btrfs, though, does it?
SLE11 defaults to ext3 and we don't change the default in a service pack. I can't comment on what the default in SLE12 will be. I'll refer questions about that to our product manager for SLES, Matthias Eckermann. It should be apparent that SUSE is invested in the success of btrfs, though.
Yep, absolutely. openSUSE is the upstream, so it does become a significant part of the testbed for SLE as well; there's certainly a balance to be achieved.
It's good to have the conversation - thank you for the detailed explanation of things. That really helps put my mind at ease that this isn't (as we see from time to time) a "throw it over the wall and see what breaks" approach. It sounds like you've really done your homework and stand behind what you and your team have done to make btrfs production quality. While I still have reservations (and probably will until it reaches some sort of critical mass), my concerns are largely addressed.
Exactly. This is a file system which we've seen deployed with SLE11 SP2/3 and with which we've seen pretty good results. It's also something that we put significant effort into improving even after the initial release of the service pack.
That's good to hear. :) Jim -- Jim Henderson Please keep on-topic replies on the list so everyone benefits -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On Wed, Sep 4, 2013 at 3:22 PM, Jim Henderson <hendersj@gmail.com> wrote:
Areas that still need work: - Error handling - Not in the handling failure cases sense, but in the fsfuzzer sense.
- btrfsck - As I mentioned, we need broken file systems to fix in order to improve the tool.
- General performance - For a root file system with general user activity, it performs reasonably well. I've asked one of my team to come up with solid performance numbers so that we can 1) demonstrate where the file system is performing relative to the usual suspects, and 2) identify where we need to focus our efforts. - Historically, fsync() was a problem spot but that's been mitigated with the introduction of a "tree log" that is similar to a journal but is really just used to accelerate fsync.
Some general performance numbers would be good to see - as well as performance on large files/small files.
I can probably make room for a BtrFS partition to run my toy (100G) database in, and test performance. But I'll need the newest kernel for this, and I never run the newest kernel on my work machine (upgrading regularly being low priority). Since Stephan pretty much decreed 13.1 won't use it as default, I guess I'll have 13.1's whole lifetime to do the upgrading and testing. I'll make sure to test with fsync=on (postgres here).
On Sep 04, 13 18:22:02 +0000, Jim Henderson wrote: [...]
SLE doesn't yet default to btrfs, though, does it?
[...] That's because SLE has not seen a major new code stream for 3 years now. And with SLE 11 GA btrfs was not in a state where it was fit as default filesystem ;) Meanwhile, this has changed, and we are currently planning to switch the default filesystem with SLE 12. As I'm one of the guys who will have to cope with that, I was carefully looking at the choices. And I don't think btrfs is a bad choice. (I agree that for some usage scenarios other filesystems are better suited. But that is to be expected.) I've read a lot about "choice" and "free to choose" in this thread. For me, as a user, without the ability to fix filesystems as Jeff does, stability and reliability are important. If I get this by not activating all features of an application, or filesystem, then I'm happy. In my opinion, you can either regard it as "restricting" the filesystem when a parameter is needed to activate the immature features, or you can see it as "you get a stable, reliable filesystem, and we give you the choice to use the developing parts, too, if you want!" All a matter of perspective, isn't it? ciao, Stefan -- Stefan Behlert, SUSE LINUX Project Manager Maxfeldstr. 5, D-90409 Nuernberg, Germany Phone +49-911-74053-173 SUSE LINUX Products GmbH, Nuernberg; GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer, HRB 16746 (AG Nuernberg)
В Wed, 04 Sep 2013 12:04:58 -0400 Jeff Mahoney <jeffm@suse.com> пишет:
- VM image performance - Performance is generally regarded as horrible. - This is because CoW on what is essentially a block device backing store means a ton of write amplification for each write that the VM issues.
Why is the VM case special here? What makes it so different from any normal write activity? Did anyone look at what Solaris does? They must have hit the same problem with LDOM on ZFS years ago.
On 9/4/13 10:49 PM, Andrey Borzenkov wrote:
В Wed, 04 Sep 2013 12:04:58 -0400 Jeff Mahoney <jeffm@suse.com> пишет:
- VM image performance - Performance is generally regarded as horrible. - This is because CoW on what is essentially a block device backing store means a ton of write amplification for each write that the VM issues.
Why is the VM case special here? What makes it so different from any normal write activity?
That it's significant overwrite activity with a strong desire for snapshot functionality. The typical database workload wants /just/ overwrite. The typical VM workload wants both.
Did anyone look at what Solaris does? They must have hit the same problem with LDOM on ZFS years ago.
Nope. We haven't examined ZFS behavior. That gets tricky WRT legality. -Jeff -- Jeff Mahoney SUSE Labs
On Tue, Sep 3, 2013 at 8:59 PM, Jeff Mahoney <jeffm@suse.com> wrote:
On 9/3/13 7:00 PM, Jim Henderson wrote:
On Tue, 03 Sep 2013 19:55:49 -0300, Claudio Freire wrote:
Also, consider the target audience of default filesystems.
These aren't for the power users, which will consciously pick the filesystem they like best. Default filesystems are for the granny and the newbie, those that cannot and will not fix low-level filesystem issues.
I haven't used BtrFS recently, but unless it's granny-proof, I'd think twice before making it default.
+1. Changing the default to btrfs is going to increase the number of people having problems posting in the forums. It still seems to be considered "unstable" or "experimental", and if so, shouldn't be selected as the default.
Well that's the main thrust behind the "allow unsupported" module option. We have the feature set that we've evaluated to be mature and that's what we allow by default.
When I cast a wide net across forums and mailing lists last month asking for user experiences, I got a lot of uninformed opinion and very little concrete data. Most of the negative data was in the area of snapper being too aggressive in creating snapshots and not aggressive enough in cleaning them up. There was some negative opinion WRT the file system itself, but most of it was in the realm of "I heard..." or "I don't trust it" based on too much hearsay and too little experience. It's that kind of rumor-response that is unhelpful in making decisions or improving the pain points with the file system. There were a few reports of people having troubles with the file system itself, but they tended to be with compression or RAID enabled -- the features that we don't entirely trust yet and want to disable so the casual user doesn't become an unwitting beta tester.
Well, while I have no concrete data, I do have an idea of how granny-type users treat their systems. One of the most important things to test in BtrFS before considering it default-able is its crash resilience. If it can withstand power outages (which users will consciously induce and abuse, like my dad, who thought the quickest way to turn off his computer was holding the power button for 5 seconds until I screamed at him), then that's certainly a good sign. If it can't, not. I'm sure there are ways to test this with OpenQA too - just launch a VM, make it work intensively on disk, stop it forcefully without giving it a chance to do anything, and re-launch. See what happens. Is this doable with OpenQA?
Hi, I know I am a bit late, I was away.... On 09/03/2013 07:59 PM, Jeff Mahoney wrote:
On 9/3/13 7:00 PM, Jim Henderson wrote:
On Tue, 03 Sep 2013 19:55:49 -0300, Claudio Freire wrote:
Also, consider the target audience of default filesystems.
These aren't for the power users, which will consciously pick the filesystem they like best. Default filesystems are for the granny and the newbie, those that cannot and will not fix low-level filesystem issues.
I haven't used BtrFS recently, but unless it's granny-proof, I'd think twice before making it default.
+1. Changing the default to btrfs is going to increase the number of people having problems posting in the forums. It still seems to be considered "unstable" or "experimental", and if so, shouldn't be selected as the default.
Well that's the main thrust behind the "allow unsupported" module option. We have the feature set that we've evaluated to be mature and that's what we allow by default.
When I cast a wide net across forums and mailing lists last month asking for user experiences, I got a lot of uninformed opinion and very little concrete data. Most of the negative data was in the area of snapper being too aggressive in creating snapshots and not aggressive enough in cleaning them up. There was some negative opinion WRT the file system itself, but most of it was in the realm of "I heard..." or "I don't trust it" based on too much hearsay and too little experience. It's that kind of rumor-response that is unhelpful in making decisions or improving the pain points with the file system.
I agree with you that the "rumor and armchair" comments, perceptions and responses are not useful for improving the state of btrfs. However, the "rumor and armchair" perceptions are exactly what we have to overcome. If everyone thinks it's "unstable", whether true or not, and we announce "openSUSE 13.1 has Btrfs as default" we will have a big perception problem and for better or worse "perception is reality." In principle separating the features into "rock solid" and "immature" and enabling only the "rock solid" features when choosing btrfs as default will address usage issues, but will do little for the perception issue. Action to address usage in and of itself is most often insufficient and appropriate messaging has to go with it.
There were a few reports of people having troubles with the file system itself, but they tended to be with compression or RAID enabled -- the features that we don't entirely trust yet and want to disable so the casual user doesn't become an unwitting beta tester.
So whether it's "considered" unstable or experimental largely depends on what features are being tested and who's doing the testing. A lot of times it involves armchair punditry and no testing at all.
Unfortunately there is no getting away from "Monday morning quarterbacks" and we will have to overcome the perception of "unstable" and/or "not for granny". This, for better or worse, also includes the tools around btrfs, such as snapper. See the comment in this thread about the 20GB VM that runs out of disk space when running "zypper up" (I had that same experience, by the way, and it is very annoying). Things like the snapper problem reflect on the filesystem and thus give it a probably undeserved bad rep. Therefore, focusing on the filesystem alone is probably insufficient. In the end I think the "granny question" is valid. If we make btrfs the default for 13.1, will granny run into any issues? Does the drive fill up because of created snapshots? Does she run out of space although there should be plenty of space on the drive? It's the more "silly" stuff we have to worry about. My $0.02 Robert -- Robert Schweikert MAY THE SOURCE BE WITH YOU SUSE-IBM Software Integration Center LINUX Tech Lead rjschwei@suse.com rschweik@ca.ibm.com 781-464-8147
On Tuesday, 2013-09-10 at 12:16 -0400, Robert Schweikert wrote:
In the end I think the "granny question" is valid.
If we make btrfs the default for 13.1, will granny run into any issues?
Does the drive fill up because of created snapshots?
It does.
Does she run out of space although there should be plenty of space on the drive?
Yes, she does. The issue has come up several times in the forums: people complain that their filesystem is full even though tools report it only 1/3 used. The users were sometimes unaware they were using btrfs. Thus YaST should suggest bigger sizes, or the tasks that clear out old snapshots should be modified to avoid that situation. -- Cheers, Carlos E. R. (from 12.3 x86_64 "Dartmouth" at Telcontar)
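For what it's worth, the cleanup side is tunable in snapper's per-configuration file - a sketch; the key names follow snapper's config format, the values are illustrative, not recommended defaults:

```
# /etc/snapper/configs/root (excerpt)
NUMBER_CLEANUP="yes"     # prune number-based (zypper/YaST) snapshots
NUMBER_LIMIT="10"        # keep at most this many of them
TIMELINE_CLEANUP="yes"   # prune hourly/daily timeline snapshots too
TIMELINE_LIMIT_DAILY="7"
```

Tightening these limits is the knob that keeps snapshots from quietly eating the partition on small root filesystems.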
On Tue 10 Sep 2013 11:58:52 PM CDT, Carlos E. R. wrote:
On Tuesday, 2013-09-10 at 12:16 -0400, Robert Schweikert wrote:
In the end I think the "granny question" is valid.
If we make btrfs the default for 13.1, will granny run into any issues?
Does the drive fill up because of created snapshots?
It does.
Does she run out of space although there should be plenty of space on the drive?
Yes, she does.
The issue has happened several times in the forums, people that complain that their filesystem is full despite reporting 1/3 used.
The users were sometimes unaware they were using btrfs.
Thus YaST should suggest bigger sizes, or the tasks that clear out old snapshots be modified to avoid that situation.
Hi In the recent case the user had a smaller partition than the recommended size @ 11GB, so if a user overrides, there's not a lot you can do? So on this system running SLED 11 SP3/btrfs my / is 19.5 GB with currently 86 snapshots and disk usage at 42%; on the openSUSE system I used 30GB for / and it's at 25% with 39 snapshots. -- Cheers Malcolm °¿° SUSE Knowledge Partner (Linux Counter #276890) SUSE Linux Enterprise Desktop 11 (x86_64) GNOME 2.28.0 Kernel 3.0.82-0.7-default up 6:49, 4 users, load average: 0.30, 0.26, 0.30 CPU AMD E2-1800@1.7GHz | GPU ATI Wrestler [Radeon HD 7340]
On Tuesday, 2013-09-10 at 17:43 -0500, Malcolm wrote:
On Tue 10 Sep 2013 11:58:52 PM CDT, Carlos E. R. wrote:
Hi In the recent case the user had a smaller partition than the recommended size @11GB, so if a user over rides, not a lot you can do?
That's a particular case. The thing is that, during installation, first you choose a size, then a filesystem. But if the choice is btrfs, the size should be about doubled, or at least checked. Or some system should be devised to adjust auto-pruning of snapshots when the filesystem is filling up, or the user should somehow get a warning about what to do, and not be taken by surprise when the system complains of a full partition while the tools say it is not. -- Cheers, Carlos E. R. (from 12.3 x86_64 "Dartmouth" at Telcontar)
On Tue, Sep 10, 2013 at 6:16 PM, Robert Schweikert <rjschwei@suse.com> wrote:
I agree with you that the "rumor and armchair" comments, perceptions and responses are not useful for improving the state of btrfs. However, the "rumor and armchair" perceptions are exactly what we have to overcome. If everyone thinks it's "unstable", whether true or not, and we announce "openSUSE 13.1 has Btrfs as default" we will have a big perception problem and for better or worse "perception is reality."
We have time to sort things out. According to this http://lists.opensuse.org/opensuse-factory/2013-09/msg00084.html openSUSE will not have BTRFS as default in 13.1 regardless - it is way too late in the release cycle to be setting an untested/new default filesystem. We have until 13.2 to sort out the available/used features, stability, perception, and the lingering usability bugs. C. -- openSUSE 12.3 x86_64, KDE 4.11
Robert Schweikert wrote:
Hi,
I know I am a bit late, I was away.... [big snip]
In the end I think the "granny question" is valid.
If we make btrfs the default for 13.1, will granny run into any issues?
I was only just yesterday reading through this lengthy thread, and I have to agree with Robert, the "granny question" is valid. -- Per Jessen, Zürich (12.4°C) http://www.dns24.ch/ - free DNS hosting, made in Switzerland.
On Tuesday 10 of September 2013 12:16EN, Robert Schweikert wrote:
In the end I think the "granny question" is valid.
If we make btrfs the default for 13.1, will granny run into any issues?
Does the drive fill up because of created snapshots?
As I have written in the beginning, we should distinguish between two completely different questions: 1. should btrfs be the default filesystem? 2. should the default configuration automatically create snapshots (periodically or on every update)? I still think yes to 1 doesn't automatically mean yes to 2. And while I'm not strictly against 1 (even if I'm still probably not going to use btrfs for my data for some time), I would consider 2 a bad idea - about as bad as when, some time ago, "zypper ar" used --keep-packages by default. Michal Kubeček
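Question 2 is a configuration knob rather than a filesystem property: snapper separates timeline snapshots from the pre/post snapshots taken around package operations. A sketch of the relevant keys in /etc/snapper/configs/root (key names are snapper's own; the exact values shown are illustrative assumptions, not shipped defaults):

```shell
# /etc/snapper/configs/root -- sketch of the knobs behind "question 2"
TIMELINE_CREATE="no"      # disable periodic (hourly) timeline snapshots
NUMBER_CLEANUP="yes"      # prune numbered (pre/post) snapshots automatically
NUMBER_LIMIT="10"         # keep at most 10 of them
```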
CAN WE PLEASE KEEP ALL REPLIES ON THE LIST ONLY (NO PM REPLIES). MULTIPLE COPIES ARE NOT NEEDED! On 09/03/2013 06:55 PM, Claudio Freire pecked at the keyboard and wrote:
On Tue, Sep 3, 2013 at 7:41 PM, Jon Nelson <jnelson-suse@jamponi.net> wrote:
Please do not take my comments here to mean that I don't immensely appreciate the work done by you and many others, but I believe it is not unreasonable to say that we probably disagree as to whether or not btrfs is stable enough to be made the default filesystem.
Also, consider the target audience of default filesystems.
These aren't for the power users, which will consciously pick the filesystem they like best. Default filesystems are for the granny and the newbie, those that cannot and will not fix low-level filesystem issues.
I haven't used BtrFS recently, but unless it's granny-proof, I'd think twice before making it default.
-- Ken Schneider SuSe since Version 5.2, June 1998
On Tue, Sep 3, 2013 at 8:04 PM, Ken Schneider - openSUSE <suse-list3@bout-tyme.net> wrote:
CAN WE PLEASE KEEP ALL REPLIES ON THE LIST ONLY (NO PM REPLIES). MULTIPLE COPIES ARE NOT NEEDED!
Sorry, I've got reply-all configured by default, and it's rather difficult remembering which lists prefer reply-all and which to-list only. And which set up the headers to force reply to-list, and which... you get my point. Whatever client doesn't filter duplicates needs an upgrade. But my apologies anyway.
On Tue, Sep 3, 2013 at 2:54 PM, Ken Schneider - openSUSE <suse-list3@bout-tyme.net> wrote:
On 09/03/2013 10:32 AM, Jeff Mahoney pecked at the keyboard and wrote:
Hi all -
Last month I posted queries to this list (and several other locations, including the forums) asking about people's experiences with btrfs. For the most part it seemed like the experience had improved over time. Most of the concerns were either with interactions with zypper or old perceptions of instability that were based more on old impressions than new testing. With the exception of an ENOSPC issue that had been recently fixed, users actively using the file system seemed pretty satisfied with it.
I posted a followup question a week or two later asking what people thought about limiting the 'supported' feature set in the way we do in SLES so that it's clear to all users which parts of the file system are considered stable.
A quick table of what that looks like:
Supported                     Unsupported
---------                     -----------
Snapshots                     Inode cache
Copy-on-Write                 Auto Defrag
Subvolumes                    RAID
Metadata Integrity            Compression
Data Integrity                Send / Receive
Online metadata scrubbing     Hot add/remove
Manual defrag                 Seeding devices
Manual deduplication (soon)   Multiple devices
                              "Big" Metadata (supported read-only)
Over time this table will change. Items from the Unsupported list will move to the Supported list as they mature.
That proposal was pretty well received except, predictably, by those using the features listed. In practice, all that's required for those users to continue uninterrupted is to add the 'allow_unsupported=1' option to the btrfs module, either on the kernel command line or in /etc/modprobe.d. There is nothing inherently limiting to any openSUSE user with this practice. The features are all still in the code and available immediately just by setting a flag. It can even be done safely after module load, or even after file systems that don't use the unsupported features have been mounted. I intend to introduce this functionality into openSUSE soon.
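For reference, a sketch of the two ways to set that flag as described above (the modprobe.d file name is an arbitrary choice of mine, not something the proposal specifies):

```shell
# Option 1: kernel command line, for a built-in or modular btrfs:
#     btrfs.allow_unsupported=1
# Option 2: a modprobe.d fragment (file name is arbitrary):
echo "options btrfs allow_unsupported=1" | sudo tee /etc/modprobe.d/99-btrfs-unsupported.conf
```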
One other aspect to consider: Even though they are independent projects, we've been focusing heavily on btrfs support in the SLES product. As a result, the openSUSE kernel will end up getting much of that work 'for free' since most of the same people maintain the kernel for both projects.
So that's the "why it's safe" part of the proposal. I haven't gotten to the "why" yet, but then you probably already know the "whys". Subvolumes. Built-in snapshots that don't corrupt themselves when an exception table runs out of space. Built-in integrity verification via checksums. Built-in proactive metadata semantic checking via scrubbing. Online defrag. Soon we'll see online deduplication of arbitrary combinations of files. The code is written, it just needs to be pulled in. You've seen the rest of the feature set. Once we test more of it under load and ensure that it's mature enough to roll out, you'll get those features for free.
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
Thanks for your time.
-Jeff
Not as long as any items are in the unsupported column, and as long as there is no tool to repair a broken filesystem.
Ken, I might agree with "No, not as long as there are items in the right-hand column that are functional with the current default filesystem (ext4)". As far as I know, none of the right-hand column items are in ext4, so their stability / usability is beside the point. As to the quality of btrfsck, I don't know, but it was first released two years ago: http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg11836.html I keep reading that btrfs doesn't have one, but never with any specifics (though I haven't gone looking either). Greg
On 03.09.2013 16:32, Jeff Mahoney wrote:
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
Sorry, but if you read the roadmap more carefully, you will see that we're way past the time to change important details like that. For convenience I include a pointer: http://en.opensuse.org/openSUSE:Roadmap You're free to offer your help to the marketing team to promote btrfs as the perfect choice for experienced users, and we can switch early for 13.2. That's the best I'm willing to offer. Greetings, Stephan
On 04.09.2013 09:59, Stephan Kulow wrote:
On 03.09.2013 16:32, Jeff Mahoney wrote:
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
Sorry, but if you read the roadmap more carefully, you see that we're way past the time to change important details as that. For convenience I include a pointer: http://en.opensuse.org/openSUSE:Roadmap
You're free to offer your help to the marketing team to promote btrfs as perfect choice for experienced users and we switch early for 13.2. That's the best I'm willing to offer.
At the moment you can't even install Factory with btrfs as default because of bugs like bnc#835695. Btrfs is really young ;( Greetings, Stephan
On Thu, Sep 05, 2013 at 11:42:08AM +0200, Stephan Kulow wrote:
On 04.09.2013 09:59, Stephan Kulow wrote:
On 03.09.2013 16:32, Jeff Mahoney wrote:
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
Sorry, but if you read the roadmap more carefully, you see that we're way past the time to change important details as that. For convenience I include a pointer: http://en.opensuse.org/openSUSE:Roadmap
You're free to offer your help to the marketing team to promote btrfs as perfect choice for experienced users and we switch early for 13.2. That's the best I'm willing to offer.
At the moment you can't even install Factory with btrfs as default because of bugs like bnc#835695
Btrfs is really young ;(
I thought hardlinks were now working; apparently not. Ciao, Marcus
On 9/5/13 5:42 AM, Stephan Kulow wrote:
On 04.09.2013 09:59, Stephan Kulow wrote:
On 03.09.2013 16:32, Jeff Mahoney wrote:
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
Sorry, but if you read the roadmap more carefully, you see that we're way past the time to change important details as that. For convenience I include a pointer: http://en.opensuse.org/openSUSE:Roadmap
You're free to offer your help to the marketing team to promote btrfs as perfect choice for experienced users and we switch early for 13.2. That's the best I'm willing to offer.
At the moment you can't even install Factory with btrfs as default because of bugs like bnc#835695
Btrfs is really young ;(
Maybe you'd care to read the rest of the thread before commenting?
- Removal of the strange per-directory hard link limit
  - Because the backreferences to a single inode need to fit in a single file system block, there was a limit to the number of hard links in a single directory. It could be quite low.
  - The limit was removed by adding a new extended inode ref item, not enabled by default yet since it's a disk format change. The extended inode ref is only used when required, since it's not as space-efficient as the single-node item.
There's probably room for discussion within the file system community on whether we'd want to add an "ok to change" bit so that file systems have the ability to use the new extended inode ref items when needed but don't set the incompat bit until they're actually used. The other side of that coin is that it may not be clear to users when/if their file system has become incompatible with older kernels.
We've been discussing enabling this by default and I have a patch set that will allow the installer to enable it dynamically on a mounted file system. -Jeff -- Jeff Mahoney SUSE Labs
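The old per-directory limit was easy to hit with packages that hard-link aggressively, which is what bnc#835695 ran into. A quick, filesystem-agnostic sketch for probing how many hard links to one inode a directory will accept (the long filename and the 1000-link cap are arbitrary choices for illustration; old-format btrfs could fail after a few hundred long-named links, while ext4's per-inode cap is 65000):

```shell
# Probe how many extra hard links to a single inode one directory will take.
dir=$(mktemp -d)
touch "$dir/averylongfilenametoeatbackrefspace-0"
i=1
while [ "$i" -le 1000 ]; do
    ln "$dir/averylongfilenametoeatbackrefspace-0" \
       "$dir/averylongfilenametoeatbackrefspace-$i" || break
    i=$((i + 1))
done
echo "created $((i - 1)) extra links"
rm -rf "$dir"
```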
On 05.09.2013 15:54, Jeff Mahoney wrote:
On 9/5/13 5:42 AM, Stephan Kulow wrote:
On 04.09.2013 09:59, Stephan Kulow wrote:
On 03.09.2013 16:32, Jeff Mahoney wrote:
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
Sorry, but if you read the roadmap more carefully, you see that we're way past the time to change important details as that. For convenience I include a pointer: http://en.opensuse.org/openSUSE:Roadmap
You're free to offer your help to the marketing team to promote btrfs as perfect choice for experienced users and we switch early for 13.2. That's the best I'm willing to offer.
At the moment you can't even install Factory with btrfs as default because of bugs like bnc#835695
Btrfs is really young ;(
Maybe you'd care to read the rest of the thread before commenting?
Well, given that at least 3 colleagues were not aware of that problem, I felt raising its visibility was needed. Actually, all btrfs tests have been failing for as long as I can remember. It used to look like this: http://openqa.opensuse.org/viewimg/openqa/testresults/openSUSE-Factory-NET-i... And now it looks like this, but it's actually the hard link problem (and YaST not being able to cope with the exception): http://openqa.opensuse.org/viewimg/openqa/testresults/openSUSE-Factory-NET-x... So bluntly, we have zero install experience with btrfs in openSUSE Factory. In early July you asked for complaints and you got some - and now the opportunity for 13.1 is gone, IMO.
We've been discussing enabling this by default and I have a patch set that will allow the installer to enable it dynamically on a mounted file system. Yeah, that's the kind of thing you do to get a feature in when your early tests fail, but not when you're considering including a late feature.
Greetings, Stephan
On Thu, Sep 05, Stephan Kulow wrote:
On 05.09.2013 15:54, Jeff Mahoney wrote:
On 9/5/13 5:42 AM, Stephan Kulow wrote:
At the moment you can't even install Factory with btrfs as default because of bugs like bnc#835695
Btrfs is really young ;(
Maybe you'd care to read the rest of the thread before commenting?
Well, given that at least 3 colleagues were not aware of that problem, I felt raising its visibility was needed.
I don't understand why you say this is a btrfs problem. It is clearly a libzypp/YaST problem: libzypp/YaST shouldn't crash if RPM has problems installing a package. This can happen again at any time on any filesystem with a different package. So if you complain about btrfs, where are the complaints that this much more important system management bug is still not fixed? Thorsten -- Thorsten Kukuk, Senior Architect SLES & Common Code Base SUSE LINUX Products GmbH, Maxfeldstr. 5, D-90409 Nuernberg GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)
Am 12.09.2013 11:25, schrieb Thorsten Kukuk:
On Thu, Sep 05, Stephan Kulow wrote:
On 05.09.2013 15:54, Jeff Mahoney wrote:
On 9/5/13 5:42 AM, Stephan Kulow wrote:
At the moment you can't even install Factory with btrfs as default because of bugs like bnc#835695
Btrfs is really young ;(
Maybe you'd care to read the rest of the thread before commenting?
Well, given that at least 3 colleagues were not aware of that problem, I felt raising its visibility was needed.
I don't understand why you say this is a btrfs problem. It is clearly a libzypp/YaST problem: libzypp/YaST shouldn't crash if RPM has problems installing a package. This can happen again at any time on any filesystem with a different package.
So if you complain about btrfs, where are the complaints that this much more important system management bug is still not fixed?
The bug *is* fixed - https://build.opensuse.org/request/show/198042 And yes, YaST should have given an error dialog about the problem - still, that the error is there at all is btrfs' fault. But that problem is fixed too - it's still a problem on updates, though. We need to make sure not to use too many hardlinks, so as not to break zypper dups from 12.3 on btrfs. Greetings, Stephan
On Thursday, 2013-09-05 at 11:42 +0200, Stephan Kulow wrote:
On 04.09.2013 09:59, Stephan Kulow wrote:
On 03.09.2013 16:32, Jeff Mahoney wrote:
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
Sorry, but if you read the roadmap more carefully, you see that we're way past the time to change important details as that. For convenience I include a pointer: http://en.opensuse.org/openSUSE:Roadmap
You're free to offer your help to the marketing team to promote btrfs as perfect choice for experienced users and we switch early for 13.2. That's the best I'm willing to offer.
At the moment you can't even install Factory with btrfs as default because of bugs like bnc#835695
Btrfs is really young ;(
I thought the decision was not to make btrfs the default this time. However, when installing 13.1, YaST asks to make btrfs the default choice. This is confusing. :-? - -- Cheers, Carlos E. R. (from 12.3 x86_64 "Dartmouth" at Telcontar)
On 09/27/2013 06:33 PM, Carlos E. R. wrote:
On Thursday, 2013-09-05 at 11:42 +0200, Stephan Kulow wrote:
On 04.09.2013 09:59, Stephan Kulow wrote:
On 03.09.2013 16:32, Jeff Mahoney wrote:
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
Sorry, but if you read the roadmap more carefully, you see that we're way past the time to change important details as that. For convenience I include a pointer: http://en.opensuse.org/openSUSE:Roadmap
You're free to offer your help to the marketing team to promote btrfs as perfect choice for experienced users and we switch early for 13.2. That's the best I'm willing to offer.
At the moment you can't even install Factory with btrfs as default because of bugs like bnc#835695
Btrfs is really young ;(
I thought the decision was not to make btrfs the default this time. However, when installing 13.1, YaST asks to make btrfs the default choice.
This is confusing. :-?
- -- Cheers, Carlos E. R. (from 12.3 x86_64 "Dartmouth" at Telcontar)
Coolo changed it to 13.2
-- Cheers! Roman ------------------------------------------- openSUSE -- Get it! Discover it! Share it! ------------------------------------------- http://linuxcounter.net/ #179293
On 09/27/2013 06:33 PM, Carlos E. R. wrote:
On Thursday, 2013-09-05 at 11:42 +0200, Stephan Kulow wrote:
On 04.09.2013 09:59, Stephan Kulow wrote:
On 03.09.2013 16:32, Jeff Mahoney wrote:
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
Sorry, but if you read the roadmap more carefully, you see that we're way past the time to change important details as that. For convenience I include a pointer: http://en.opensuse.org/openSUSE:Roadmap
You're free to offer your help to the marketing team to promote btrfs as perfect choice for experienced users and we switch early for 13.2. That's the best I'm willing to offer.
At the moment you can't even install Factory with btrfs as default because of bugs like bnc#835695
Btrfs is really young ;(
I thought the decision was not to make btrfs the default this time. However, when installing 13.1, YaST asks to make btrfs the default choice.
This is confusing. :-?
I think the pop-up was added to get more people to test btrfs. -- Cheers! Roman
Hi Jeff,
Last month I posted queries to this list (and several other locations, including the forums) asking about people's experiences with btrfs. For the most part it seemed like the experience had improved over time. Most of the concerns were either with interactions with zypper or old perceptions of instability that were based more on old impressions than new testing. With the exception of an ENOSPC issue that had been recently fixed, users actively using the file system seemed pretty satisfied with it.
I wonder a bit how that is different from the current default, ext4. As a convinced user of ext4, what would be the personal reason for _me_ to switch, or for the intended target group of openSUSE in general? What did the current default of openSUSE do wrong in order to reconsider the default choice? What are the alternatives to btrfs? How does it meet the majority of requirements better than the current default choice? It seems quite obvious to me that the default choice should depend on what is suitable for the majority of use cases and provides the needed stability. My personal experience with btrfs, and with the responsiveness to bug reports, was mediocre at best, but my last contact with it was ~ April this year; a lot of things might have improved since then.
So, I'd like to propose that we use btrfs as the default file system for the 13.1 release before we release the first beta.
It has been available in Factory (and older openSUSE) releases for some time already. Do we know how big (percentage-wise) the happy btrfs userbase is? How do we avoid the risk of switching the default to something that only X% of the user base is happy with (with X being in the 5-20% area)? Thanks, Dirk
Hello Dirk and all, On 2013-09-06 T 09:51 +0200 Dirk Müller wrote:
[...] what would be the personal reason for _me_ to switch or for the intended target group of openSUSE in general? [...] What are the alternatives to btrfs? How does it meet the majority of requirements better than the current default choice?
Besides scalability there are other attributes where btrfs exceeds other filesystems. See the various comparison tables out there, including the one in my blog from three years ago (yes, it's a bit dated): https://www.suse.com/communities/conversations/data-is-customers-gold/ Yet there is this _one_ point which made me switch to btrfs for "/" back in February 2011: peace of mind on administrative tasks (package updates and installations, configuration changes, ...) based on the snapshot / rollback capability. That's what I personally like btrfs for and where I see unique capabilities. And I changed my /home to btrfs last year for the same reason -- on my company and private systems. I even did some development in this area during last hack week: https://www.suse.com/communities/conversations/menu-du-jour-vivaneau-vert-su... But that might be off topic ...
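For readers who haven't tried it, the rollback workflow Matthias describes maps to a handful of snapper commands. A sketch, assuming snapper is configured for the filesystem in question (the snapshot numbers 42/43 are hypothetical; snapper assigns them):

```shell
sudo snapper create --type pre  --print-number --description "before change"  # prints e.g. 42
# ... perform the risky administrative change ...
sudo snapper create --type post --pre-number 42 --description "after change"  # becomes e.g. 43
sudo snapper status 42..43                   # which files changed between the two
sudo snapper undochange 42..43 /etc/fstab    # revert a single file to its pre state
```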
It seems quite obvious to me that the default choice should be depending on what is suitable for the majority of use cases and provides the needed stability.
I can't complain about btrfs' stability.
It has been some time already available in Factory (and older openSUSE) releases. Do we know how big (the percentagewise) the happy btrfs userbase is? How do we avoid risking switching the default to something that only X% of the user base is happy with (with X being in the 5-20% area) ?
Well, isn't this chicken-and-egg question the challenge for every new technology? Do we want to stop innovation because of that challenge? Happy - MgE
On Friday 2013-09-06 16:17, Matthias G. Eckermann wrote:
On 2013-09-06 T 09:51 +0200 Dirk Müller wrote:
[...] what would be the personal reason for _me_ to switch or for the intended target group of openSUSE in general? [...] What are the alternatives to btrfs? How does it meet the majority of requirements better than the current default choice?
Besides Scalability there are other attributes where btrfs exceeds other filesystems. See various comparison tables out there, including the one in my blog from three years ago (yes, it's a bit dated): https://www.suse.com/communities/conversations/data-is-customers-gold/
Since somewhat-dated data seems to be popular, here is another: http://www.youtube.com/watch?v=FegjLbCnoBw In this talk at LCA 2012, Dave Chinner showed that btrfs scaled rather poorly (for the features it shares with existing filesystems, IOW, just vanilla storing).
Hello Jan and all, On 2013-09-06 T 18:35 +0200 Jan Engelhardt wrote:
On Friday 2013-09-06 Matthias G. Eckermann wrote:
On 2013-09-06 T 09:51 +0200 Dirk Müller wrote:
[...] what would be the personal reason for _me_ to switch or for the intended target group of openSUSE in general? [...] What are the alternatives to btrfs? How does it meet the majority of requirements better than the current default choice?
Besides Scalability there are other attributes where btrfs exceeds other filesystems. See various comparison tables out there, including the one in my blog from three years ago (yes, it's a bit dated): https://www.suse.com/communities/conversations/data-is-customers-gold/
Since somewhat-dated data seems to be popular, here is another:
http://www.youtube.com/watch?v=FegjLbCnoBw
in this talk, David Chinner showed at LCA 2012 that btrfs was rather non-scaling (for the features it shares with existing filesystems, IOW, just vanilla storing).
I am afraid we have a wording issue here: when Dave says "scalability" in that presentation, he means "performance" (see his slides). When I say "scalability" above, comparing btrfs to the current openSUSE default, I am not talking about performance, but about "scalability" in the sense of filesystem size, dealing with huge amounts of (small) files, ... Hope this explains the different view. so long - MgE
On Fri, Sep 06, 2013 at 07:04:35PM +0200, Matthias G. Eckermann wrote:
I am afraid, we have a wording issue here:
When Dave says "Scalability" in that presentation, he means "Performance" (see his slides).
When I say "Scalability" above, and use that word comparing btrfs to the current openSUSE default, I am not talking Performance, but talking about "Scalability" in the sense of filesystem size, dealing with huge amounts of (small) files, ...
Hope this explains the different view.
Not really. "Scalability" in the sense of huge amounts of small files means exactly "performance" to me, as that's where XFS used to be dog slow, i.e. it didn't *scale*. M. -- Michael Schroeder mls@suse.de SUSE LINUX Products GmbH, GF Jeff Hawn, HRB 16746 AG Nuernberg main(_){while(_=~getchar())putchar(~_-1/(~(_|32)/13*2-11)*13);}
On Fri, Sep 6, 2013 at 2:07 PM, Michael Schroeder <mls@suse.de> wrote:
On Fri, Sep 06, 2013 at 07:04:35PM +0200, Matthias G. Eckermann wrote:
I am afraid, we have a wording issue here:
When Dave says "Scalability" in that presentation, he means "Performance" (see his slides).
When I say "Scalability" above, and use that word comparing btrfs to the current openSUSE default, I am not talking Performance, but talking about "Scalability" in the sense of filesystem size, dealing with huge amounts of (small) files, ...
Hope this explains the different view.
Not really. "Scalability" in the sense of huge amounts of small files sense means exactly "Performance" for me, as that's where XFS before was dog slow, i.e. it didn't *scale".
Scalability is a heavily abused word. There are several kinds of scalability: the ability to store huge amounts of small files; the ability to store small amounts of huge files, or huge filesystems overall (PB-size); and performance under those constraints, in terms of speed and access times, but also in terms of space efficiency. There's also the ability to recover from failure scenarios, and performance while doing so, which is important on a server. So... the area of scalability is huge, no pun intended. How does btrfs compare against ext2/3/4 or XFS? Hard to know for an evolving implementation. Today's benchmarks will be outdated tomorrow, and no single benchmark can cover it all. I'd suggest people use 13.1 to do those benchmarks, but I think I already did.
Hi Matthias,
Besides Scalability there are other attributes where btrfs exceeds other filesystems.
Regarding the scalability part, let's not compare something from 3 years ago; let's compare the 13.1 kernel, 3.11.0. Ext4 has had pretty nice scalability improvements in 3.11, see http://lkml.indiana.edu/hypermail/linux/kernel/1307.0/00286.html for details. http://www.phoronix.com/scan.php?page=article&item=linux_311_filesystems You might or might not like this benchmark, but the headline is pretty clear: "EXT4 wins". Also, did btrfs fix the backlink issue? That seems to be a major scalability burden actually. And just to compare the _scalability_ we're talking about, the corner cases are: btrfs supports filesystems up to 16384 petabytes; ext4 has a slight disadvantage here, only supporting filesystems up to 1024 petabytes. While that sounds like a serious scalability issue for SLE, it is less of a concern for the typical openSUSE case. Other scalability marks are not that interesting. But if we care about scalability and SSD support, F2FS might be interesting to look at as well. Greetings, Dirk
On Wed, Sep 11, 2013 at 12:16:57PM +0200, Dirk Müller wrote:
BTRFS supports filesystems up to 16384 petabytes. Ext4 has a slight disadvantage here, only supporting filesystems up to 1024 petabytes. While that sounds like a serious scalability issue for SLE, it is less of a concern for the typical openSUSE case.
SLE has only ext3, not ext4. Regards, Arvin -- Arvin Schnell, <aschnell@suse.de> Senior Software Engineer, Research & Development SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg) Maxfeldstraße 5 90409 Nürnberg Germany
On 9/11/13 6:16 AM, Dirk Müller wrote:
Hi Matthias,
Besides Scalability there are other attributes where btrfs exceeds other filesystems.
Regarding the scalability part, lets not compare something from 3 years ago, lets compare the 13.1 kernel, kernel 3.11.0. Ext4 has had pretty nice improvements in 3.11 regarding scalability, see http://lkml.indiana.edu/hypermail/linux/kernel/1307.0/00286.html for details.
http://www.phoronix.com/scan.php?page=article&item=linux_311_filesystems
You might or not like this benchmark, but the headline is pretty clear: "EXT4 wins".
Also, did btrfs fix the backlink issue? that seems to be a major scalability burden actually.
Again, yes:

    commit f186373fef005cee948a4a39e6a14c2e5f517298
    Author: Mark Fasheh <mfasheh@suse.de>
    Date:   Wed Aug 8 11:32:27 2012 -0700

        btrfs: extended inode refs

        This patch adds basic support for extended inode refs. This includes
        support for link and unlink of the refs, which basically gets us
        support for rename as well.

        Inode creation does not need changing - extended refs are only added
        after the ref array is full.

        Signed-off-by: Mark Fasheh <mfasheh@suse.de>

The current issue is the ability to enable them online. I have patches to do that posted to the btrfs mailing list. -Jeff
And just to compare the _scalability_ we're talking about, the corner cases are:
BTRFS supports filesystems up to 16384 Petabytes. Ext4 has a slight disadvantage here, only supporting filesystems up to 1024 Petabytes. While that sounds like a serious scalability issue for SLE, it is less of a concern for the typical openSUSE case.
Other scalability marks are not that interesting. But if we care about scalability and SSD support, F2FS might be interesting to look at as well.
Greetings, Dirk
-- Jeff Mahoney SUSE Labs
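As an aside, those two limits fall straight out of the on-disk address widths: btrfs uses 64-bit byte addresses (2^64 bytes), while ext4 addresses 2^48 blocks of 4 KiB each (2^60 bytes). Expressed in PiB (2^50 bytes), the arithmetic is:

```shell
# btrfs: 2^64 bytes of addressable space = 2^(64-50) PiB
echo $(( 1 << (64 - 50) ))   # 16384

# ext4 with 4 KiB blocks: 2^48 blocks * 2^12 bytes = 2^60 bytes = 2^(60-50) PiB
echo $(( 1 << (60 - 50) ))   # 1024
```

The shifts stay well inside a 64-bit signed integer, so plain shell arithmetic is enough here.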
On Wed, Sep 11, 2013 at 12:16:57PM +0200, Dirk Müller wrote:
Also, did btrfs fix the backlink issue? that seems to be a major scalability burden actually.
The feature is turned on by default for 13.1.
On Wed, Sep 11, 2013 at 7:16 AM, Dirk Müller <dirk@dmllr.de> wrote:
Besides scalability, there are other attributes where btrfs exceeds other filesystems.
Regarding the scalability part, let's not compare something from 3 years ago; let's compare the 13.1 kernel, kernel 3.11.0. Ext4 has had pretty nice improvements in 3.11 regarding scalability, see http://lkml.indiana.edu/hypermail/linux/kernel/1307.0/00286.html for details.
http://www.phoronix.com/scan.php?page=article&item=linux_311_filesystems
You might or might not like this benchmark, but the headline is pretty clear: "EXT4 wins".
And, they didn't check, but at least for database workloads ext3 beats ext4. So, if btrfs is slower than ext4 by that much, then for databases it's a no-no (ext4 is already a no-no, so perhaps I should say no-no-no-no-no).
On 9/11/13 1:52 PM, Claudio Freire wrote:
On Wed, Sep 11, 2013 at 7:16 AM, Dirk Müller <dirk@dmllr.de> wrote:
Besides scalability, there are other attributes where btrfs exceeds other filesystems.
Regarding the scalability part, let's not compare something from 3 years ago; let's compare the 13.1 kernel, kernel 3.11.0. Ext4 has had pretty nice improvements in 3.11 regarding scalability, see http://lkml.indiana.edu/hypermail/linux/kernel/1307.0/00286.html for details.
http://www.phoronix.com/scan.php?page=article&item=linux_311_filesystems
You might or might not like this benchmark, but the headline is pretty clear: "EXT4 wins".
And, they didn't check, but at least for database workloads ext3 beats ext4.
So, if btrfs is slower than ext4 by that much, then for databases it's a no-no (ext4 is already a no-no, so perhaps I should say no-no-no-no-no)
No, it's not currently intended for database workloads. But even then there's some tuning to be done. Big databases run their own checksums and don't care about the data checksumming inside the file system. They also probably want nodatacow enabled on the database files because they really just want space on disk and for the OS to get out of the way otherwise. -Jeff -- Jeff Mahoney SUSE Labs
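The nodatacow tuning mentioned here can be applied per mount; a sketch of an fstab entry for a dedicated database volume, with the device, mount point, and noatime choice purely illustrative (note that nodatacow also disables data checksumming and compression for the affected files):

```
# /etc/fstab sketch: dedicated database volume mounted with nodatacow
# (device and mount point are hypothetical)
/dev/sdXY  /var/lib/pgsql  btrfs  nodatacow,noatime  0 0
```

Alternatively, `chattr +C` sets the No_COW attribute per file or directory; since it only takes effect on empty files, it is usually set on the directory before the database creates its files there.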
On Wed, Sep 11, 2013 at 12:52 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
On Wed, Sep 11, 2013 at 7:16 AM, Dirk Müller <dirk@dmllr.de> wrote:
Besides scalability, there are other attributes where btrfs exceeds other filesystems.
Regarding the scalability part, let's not compare something from 3 years ago; let's compare the 13.1 kernel, kernel 3.11.0. Ext4 has had pretty nice improvements in 3.11 regarding scalability, see http://lkml.indiana.edu/hypermail/linux/kernel/1307.0/00286.html for details.
http://www.phoronix.com/scan.php?page=article&item=linux_311_filesystems
You might or might not like this benchmark, but the headline is pretty clear: "EXT4 wins".
And, they didn't check, but at least for database workloads ext3 beats ext4.
Why do you say that? I'm of the impression that ext4 is - highly - recommended over ext3. -- Jon
On Thu, Sep 12, 2013 at 9:10 AM, Jon Nelson <jnelson-suse@jamponi.net> wrote:
On Wed, Sep 11, 2013 at 12:52 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
On Wed, Sep 11, 2013 at 7:16 AM, Dirk Müller <dirk@dmllr.de> wrote:
Besides scalability, there are other attributes where btrfs exceeds other filesystems.
Regarding the scalability part, let's not compare something from 3 years ago; let's compare the 13.1 kernel, kernel 3.11.0. Ext4 has had pretty nice improvements in 3.11 regarding scalability, see http://lkml.indiana.edu/hypermail/linux/kernel/1307.0/00286.html for details.
http://www.phoronix.com/scan.php?page=article&item=linux_311_filesystems
You might or might not like this benchmark, but the headline is pretty clear: "EXT4 wins".
And, they didn't check, but at least for database workloads ext3 beats ext4.
Why do you say that? I'm of the impression that ext4 is - highly - recommended over ext3.
This[0] is old, and not really scientific, but I've heard lots of similar reports (and a bit more scientific ones) on the postgres mailing lists. I might have to run a newer benchmark, though. This[1] one's newer, and looks a lot better for ext4.

In general, database workloads are special, and the dumber the FS the better. Until not long before that[2], ext4 wasn't even crash-safe. Ext3 has serious fsync issues, but at least it is stable enough in its shortcomings that one can work around them, like separate partitions for WAL, and stuff like that. In fact, behavior under fsync has been one of the main reasons XFS is still the best DB file system option.

However, looking at [1], I may have to re-think those recommendations. Btrfs seems to be way behind, though. Good thing you prompted me for data.

[0] http://serenadetoacuckooo.blogspot.com.ar/2011/04/ext4-performance-and-barri...
[1] http://www.ilsistemista.net/index.php/linux-a-unix/21-ext3-ext4-xfs-and-btrf...
[2] http://postgresql.1045698.n5.nabble.com/ext4-finally-doing-the-right-thing-t...
On Thu, Sep 12, 2013 at 12:46 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
I might have to run a newer benchmark though. This[1] one's newer, and looks a lot better for ext4.
In general, database workloads are special, and the dumber the FS the better.
...
Good thing you prompted me for data.
[0] http://serenadetoacuckooo.blogspot.com.ar/2011/04/ext4-performance-and-barri... [1] http://www.ilsistemista.net/index.php/linux-a-unix/21-ext3-ext4-xfs-and-btrf... [2] http://postgresql.1045698.n5.nabble.com/ext4-finally-doing-the-right-thing-t...
Do check out the previous page with MySQL benchmarks, showing the opposite. It's not quite so clear-cut anymore (it used to be ext3 winning all the time, but now it's not that ext4 wins all the time either). Curious thing there is btrfs.
On Wed, Sep 11, 2013 at 7:16 AM, Dirk Müller <dirk@dmllr.de> wrote:
Besides scalability, there are other attributes where btrfs exceeds other filesystems.
Regarding the scalability part, let's not compare something from 3 years ago; let's compare the 13.1 kernel, kernel 3.11.0. Ext4 has had pretty nice improvements in 3.11 regarding scalability, see http://lkml.indiana.edu/hypermail/linux/kernel/1307.0/00286.html for details.
http://www.phoronix.com/scan.php?page=article&item=linux_311_filesystems
You might or might not like this benchmark, but the headline is pretty clear: "EXT4 wins".
[trimmed CC list a bit] On Wed 11-09-13 14:52:45, Claudio Freire wrote:

And, they didn't check, but at least for database workloads ext3 beats ext4.

Do you have any data to share? Preferably also with linux-ext4@vger.kernel.org. I'd be interested in tracking that down because I (and I believe other ext4 developers as well) am not aware of this.

Honza
--
Jan Kara <jack@suse.cz> SUSE Labs, CR
participants (37)
- Andrey Borzenkov
- Arvin Schnell
- Bernhard Voelker
- Bruce Ferrell
- C
- Carlos E. R.
- Claudio Freire
- Cristian Rodríguez
- David Sterba
- Dirk Müller
- Frederic Crozat
- Greg Freemyer
- Hans Witvliet
- Jan Engelhardt
- Jan Kara
- Jeff Mahoney
- Jim Henderson
- Jon Nelson
- Ken Schneider - openSUSE
- Lars Müller
- Lew Wolfgang
- Linda Walsh
- Malcolm
- Marcus Meissner
- Matthias G. Eckermann
- Michael Schroeder
- Michal Kubeček
- Olaf Hering
- Per Jessen
- Richard Biener
- Robert Schweikert
- Roman Bysh
- Stefan Behlert
- Stefan Seyfried
- Stephan Kulow
- Thorsten Kukuk
- Yamaban