Re: [opensuse] How to repair btrfs (was: Btrfs ...)
On 27 February 2017 at 18:43, Andrei Borzenkov <arvidjaar@gmail.com> wrote:
26.02.2017 14:38, Richard Brown wrote:
Good. So as you appear to be an expert in repairing corrupted btrfs - after reboot I get "parent transid failed". Without doing anything in between at all - I just booted and after 10 minutes did "reboot". Of course, even grub does not load now as it itself is located on btrfs and cannot read its modules.
Your suggestion? OK, I must not use btrfsck. What *must* I use now?
The below are the steps I would recommend for ANY btrfs issue. Smart people reading dmesg or syslog can probably figure out which of these steps they'd need to skip to in order to fix their particular problem.

Step 1 - Boot to a suitable alternative system, such as a different installation of openSUSE, a liveCD, or an openSUSE installation DVD. The installation DVD for the version of openSUSE you are running is usually the best choice, as it will certainly use the same kernel/btrfs version.
Step 2 - Go to a suitable console and make sure you do the below as root.
Step 3 - Try to mount your partition to /mnt, just to confirm it's really broken (eg. "mount /dev/sda1 /mnt").
Step 4 - If it mounts - are you sure it's broken? If yes - run "btrfs scrub start /mnt" to scrub the system, and "btrfs scrub status /mnt" to monitor it.
Step 5 - If it doesn't mount, try to scrub the device just in case it works (eg. "btrfs scrub start /dev/sda1" and "btrfs scrub status /dev/sda1" to monitor). Try mounting; if it mounts, you're fixed.
Step 6 - If scrubbing is not an option or does not resolve the issue, then try "mount -o usebackuproot" instead (eg. "mount -o usebackuproot /dev/sda1 /mnt").

==Interlude==
All of the above steps are considered safe and should make no destructive changes to disk, and have fixed every filesystem issue I've had on btrfs in the last 5 years.
Full disk issues need a different approach (basically, delete stuff ;)), documented here: https://www.suse.com/documentation/sles-12/stor_admin/data/sect_filesystems_...
If the above doesn't fix things for you, you can continue with the below steps, but the situation is serious enough to justify a bug report, please!
==

Step 7 - Run "btrfs check <device>" (eg. "btrfs check /dev/sda1"). This isn't going to help, but save the log somewhere, it will be useful for the bug report.
Step 8 - Seriously consider running "btrfs restore <device> <somewhere to copy data>" (eg. "btrfs restore /dev/sda1 /mnt/usbdrive"). This won't fix anything, but it will scan the filesystem and recover everything it can to the mounted device. This is especially useful if your btrfs issues are actually caused by failing hardware and are not btrfs' fault.
Step 9 - Run "btrfs rescue super-recover <device>" (eg. "btrfs rescue super-recover /dev/sda1"). Then try to mount the device normally. If it works, stop going.
Step 10 - Run "btrfs rescue zero-log <device>" (eg. "btrfs rescue zero-log /dev/sda1"). Then try to mount the device normally. If it works, stop going.
Step 11 - Run "btrfs rescue chunk-recover <device>" (eg. "btrfs rescue chunk-recover /dev/sda1"). This will take a LONG while. Then try to mount the device normally. If it works, stop going.
Step 12 - Don't just consider it this time, don't be an idiot, run "btrfs restore <device> <somewhere to copy data>" (eg. "btrfs restore /dev/sda1 /mnt/usbdrive").
Step 13 - Seriously, don't be an idiot, use btrfs restore.

==Danger zone==
The above tools had a small chance of making unwelcome changes, but now you're in seriously suicidal territory. Do not do the following unless you're prepared to accept the consequences of your choice.
==

Step 14 - Now, ONLY NOW, try btrfsck aka "btrfs check --repair <device>" (eg. "btrfs check --repair /dev/sda1").

There, I'm very confident the above will help you Andrei, and for everyone else, can we please bury the nonsense that btrfs is lacking when it comes to repair and recovery tools? You have scrub, which is safe for day to day use, the perfectly safe usebackuproot mount option, and the various "btrfs rescue" commands, which are only moderately worrying compared to the practical Russian roulette which is "btrfs check".
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
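The non-destructive part of the list (steps 3 to 6) can be sketched as a small dry-run helper. This is only an illustration under assumptions that are not in the original post: the function name btrfs_safe_steps is made up, and the defaults /dev/sda1 and /mnt are simply the example values used in the steps. It prints the commands in the order given and executes nothing.

```shell
# Print, in order, the "safe" commands from steps 3-6 for one device.
# Hypothetical helper: the commands are only echoed, never executed.
btrfs_safe_steps() {
    dev=${1:-/dev/sda1}   # device to inspect (example value from the post)
    mnt=${2:-/mnt}        # mount point used in the examples of the post

    echo "mount $dev $mnt"                    # step 3: does it mount at all?
    echo "btrfs scrub start $mnt"             # step 4: scrub the mounted filesystem
    echo "btrfs scrub status $mnt"            #         and monitor the scrub
    echo "btrfs scrub start $dev"             # step 5: as posted; may simply fail if unmounted
    echo "btrfs scrub status $dev"
    echo "mount -o usebackuproot $dev $mnt"   # step 6: fall back to a backup root
}
```

Calling btrfs_safe_steps /dev/sdb2 /mnt prints the six commands for that device. The idea is to run them by hand as root, one at a time, and stop as soon as the filesystem mounts cleanly.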
On 02/27/2017 11:53 AM, Richard Brown wrote:
Step 14 -
Sounds like we need a wiki page.
-- After all is said and done, more is said than done.
On 27 February 2017 at 21:15, John Andersen <jsamyth@gmail.com> wrote:
On 02/27/2017 11:53 AM, Richard Brown wrote:
Step 14 -
Sounds like we need a wiki page.
https://en.opensuse.org/SDB:BTRFS
On Mon, 27 Feb 2017, Richard Brown wrote:
Step 1 - ... Step 2 - ... Step 3 - ... ... Step 13 - ...
If fsck printed out this useful "howto" when run against btrfs it would be very helpful and instructive for people such as me who are used to running fsck on extx, and who naïvely expect it to work for btrfs. Roger
On 2017-02-27 21:21, Roger Price wrote:
On Mon, 27 Feb 2017, Richard Brown wrote:
Step 1 - ... Step 2 - ... Step 3 - ... ... Step 13 - ...
If fsck printed out this useful "howto" when run against btrfs it would be very helpful and instructive for people such as me who are used to running fsck on extx, and who naïvely expect it to work for btrfs.
Indeed. Way too complicated for anyone to remember. And while in rescue mode, one does not have access to a wiki unless one has a second computer. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" (Minas Tirith))
On Mon, 27 Feb 2017 20:53:56 +0100 Richard Brown wrote: 8< - - - - - trimmed for brevity - - - - - >8
There, I'm very confident the above will help you Andrei, and for everyone else, can we please bury the nonsense that btrfs is lacking when it comes to repair and recovery tools?
You have scrub which is safe for day to day use, the perfectly safe usebackuproot mount option, and the various "btrfs rescue" commands which are only moderately worrying compared to the practical Russian roulette which is "btrfs check"
Thank you very much for writing this up and posting it, Richard! :)
On 02/27/2017 03:29 PM, Carl Hartung wrote:
Thank you very much for writing this up and posting it, Richard!
It's something I never would have written up. At least if I had any responsibility for the product. I would have been halfway through that list and would have realized I was documenting a turd. I'd have gone straight to management and pounded the desk till they agreed to demote it out of the default, or pulled it from the product altogether. I've occasionally realized while documenting things that the product or process is absurd, and I would be ashamed to ship it. I still have pride in my work product.
-- After all is said and done, more is said than done.
On 28 February 2017 at 01:44, John Andersen <jsamyth@gmail.com> wrote:
On 02/27/2017 03:29 PM, Carl Hartung wrote:
Thank you very much for writing this up and posting it, Richard!
It's something I never would have written up. At least if I had any responsibility for the product. I would have been halfway through that list and would have realized I was documenting a turd.
I'd have gone straight to management and pounded the desk till they agreed to demote it out of the default, or pulled it from the product altogether.
I've occasionally realized while documenting things that the product or process is absurd, and I would be ashamed to ship it. I still have pride in my work product.
And what planet are you on?

We're talking about _openSUSE_ here.

Who is responsible for the product? The community.
Who is the management? There is no management.
As part of the Project, who and what do I hold pride in? The community and whatever they put together.

openSUSE is a community project, with no management. There is no one's desk to pound, there never has been. openSUSE is the sum of parts from openSUSE's hundreds of contributors. btrfs as a default was contributed by some of those contributors in 2014. No other contributor in the last 2 years has provided an alternative, compromise, or variation upon this.

Now, it is safe to say that SUSE, the corporate entity who sponsors openSUSE and happens to pay the paychecks of a lot of openSUSE's contributors, happened to be the employer of the contributors in question who provided btrfs by default to openSUSE. But the fact of the matter remains that SUSE Management have no direct control over what is available (or not) in openSUSE. It requires actual developers to make actual submissions to our well documented and public processes, and they are considered, accepted, and rejected on the merits of the submission and its contents, nothing else. All contributors are equal in this regard, and an experienced amateur working out of their basement in the evenings has every opportunity to shape openSUSE in the way they see fit, just as a SUSE employee does as part of their work on SUSE Linux Enterprise.

But for the sake of argument, let's imagine you're suggesting I should beat on SUSE Management's desk regarding btrfs as the default. What do you think the outcome would likely be? Since SUSE adopted btrfs as the default filesystem in SUSE Linux Enterprise 12 in 2014, they have seen their revenues increase at double-digit percentages each year, earning millions of dollars and dramatically increasing the size of their workforce as a result. Large multi-million dollar contracts have been won and successfully implemented purely because SUSE ships btrfs by default. As they say, "the proof is in the pudding", and in the case of SUSE's enterprise product there are now two years of successful adoption & growth that will counter any arguments that btrfs is a "turd".

And let's for a second imagine a mythical openSUSE that actually has some desk that I could pound. We have Tumbleweed userbase numbers showing over 300% growth since going btrfs-by-default in 2014:
https://speakerdeck.com/sysrich/fosdem-2017-how-i-learned-to-stop-worrying-a...
And I can't show you a graph for Leap, because we currently have such a problem with our infrastructure dealing with the number of users we have on Leap that we struggle to get meaningful statistics from the systems. So if we had some sort of management, I can imagine what their opinion would be of the suggestion that we've built our distribution on the wrong default filesystem - if it was as bad as you say it is, there is no way we'd be where we are today after the last 2 years.

And so, what is the point of your post? If you, or anyone else in the openSUSE community, really thinks openSUSE should do something different with regards to our default filesystem or anything else, the code is open, the tools are open, the platform is open - nothing is stopping you, get up off your ass and make submissions that address the things you see as issues.

And if it was just to make noise because you have some irrational dislike of what we're doing, or just like to read your own text, stop it. This is meant to be a support mailing list for users in need. Assist them, and keep any unproductive complaining off this list.
-- After all is said and done, more is said than done. --
Good words, as you seem to do nothing, might I strongly suggest you now try saying nothing?
On 28/02/2017 at 09:24, Richard Brown wrote:
Large multi-million dollar contracts have been won and successfully implemented purely because SUSE ships btrfs by default.
Really? If so, why was 0.0001% of that not put toward solving the problems we all know about? Do the people who sign these contracts really know what a file system is? I don't challenge the fact that BTRFS is the default, but the advantages compared with ext4 don't seem so large, unless somebody trusts only the feature list and does not look inside.
And I can't show you a graph for Leap, because we currently have such a problem with our infrastructure dealing with the number of users we have on Leap that we struggle to get meaningful statistics from the systems.
I'm personally fond of openSUSE and glad to see others are also :-). But being good, or even the best, doesn't mean being perfect :-)
code is open, the tools are open, the platform is open - nothing is stopping you, get up off your ass and make submissions that address the things you see as issues.
You may know very well that this is not true. It's not easy for anybody to touch the core of the distro (and that's not a bad thing).
them, and keep any unproductive complaining off this list.
I agree here. We don't need another flame war; we have already had too many in the past. That said, the default install on a fresh disk was broken on 42.1, as I can witness myself. I haven't had the time to try it on 42.2. Tumbleweed does not install from its disk each time I try it (I could install with some trick), and that is not easy to find on the wiki (maybe by design). The release notes should make very clear what happened to the problems of the previous version. https://doc.opensuse.org/release-notes/x86_64/openSUSE/Leap/42.2/ does not say anything about the partition size necessary for BTRFS, nor whether the known problem from 42.1 was solved. I know that it's difficult, because I once had to write part of the release notes and the writers didn't know how to get information...
jdd
jdd wrote:
The release notes should make very clear what happened to the problems of the previous version.
https://doc.opensuse.org/release-notes/x86_64/openSUSE/Leap/42.2/
does not say anything about the partition size necessary for BTRFS, nor whether the known problem from 42.1 was solved. I know that it's difficult, because I once had to write part of the release notes and the writers didn't know how to get information...
That has reminded me of something that happened when I was installing. I did a network install from a USB stick. And the default layout proposed installing at least one partition on the USB stick. I wasn't sure how to get around that. In the end I deleted the partitions on the USB stick and created my own layout. So maybe a default installation would not have suggested a 20G root partition. I think the partition may have been there from a previous installation anyway.
On 2017-02-28 09:24, Richard Brown wrote:
On 28 February 2017 at 01:44, John Andersen <> wrote:
On 02/27/2017 03:29 PM, Carl Hartung wrote:
Thank you very much for writing this up and posting it, Richard!
It's something I never would have written up. At least if I had any responsibility for the product. I would have been halfway through that list and would have realized I was documenting a turd.
I'd have gone straight to management and pounded the desk till they agreed to demote it out of the default, or pulled it from the product altogether.
I've occasionally realized while documenting things that the product or process is absurd, and I would be ashamed to ship it. I still have pride in my work product.
And what planet are you on?
We're talking about _openSUSE_ here
I'm sorry Richard, but I'm with John here. You have just convinced me to never use btrfs in a Leap computer. Not if I have to do all or part of that 14 step guide to repair a broken btrfs. And no, don't go with your community thing. It has nothing to do with all that.
And so, what is the point of your post? If you, or anyone else in the openSUSE community, really thinks openSUSE should do something different with regards to our default filesystem or anything else, the code is open, the tools are open, the platform is open - nothing is stopping you, get up off your ass and make submissions that address the things you see as issues.
Good grief! You really believe that! :-/ -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" (Minas Tirith))
On 28 February 2017 at 11:29, Carlos E. R. <robin.listas@telefonica.net> wrote:
We're talking about _openSUSE_ here
I'm sorry Richard, but I'm with John here. You have just convinced me to never use btrfs in a Leap computer. Not if I have to do all or part of that 14 step guide to repair a broken btrfs.
And no, don't go with your community thing. It has nothing to do with all that.
Sure it does - how else do you think things change in openSUSE? Magic and unicorns?
And so, what is the point of your post? If you, or anyone else in the openSUSE community, really thinks openSUSE should do something different with regards to our default filesystem or anything else, the code is open, the tools are open, the platform is open - nothing is stopping you, get up off your ass and make submissions that address the things you see as issues.
Good grief! You really believe that! :-/
I don't just believe it, I've lived it for years. The first changes I made to openSUSE were changes to defaults which I was arrogant enough to think were good enough for everyone else. Most of them are still around, because either no one disagreed with me enough to change them back, or upstreams ended up agreeing with me so we could drop my patches. It's how open source, and openSUSE, works in reality; you can't argue with it. And if you can find examples where we didn't quite meet that aspiration, my response will immediately be "and why didn't you escalate that to the board?", because the Board exists to ensure that reality is as true for as many of our contributors as possible.
* Carl Hartung <opensuse@cehartung.com> [01-01-70 12:34]:
On Mon, 27 Feb 2017 20:53:56 +0100 Richard Brown wrote: 8< - - - - - trimmed for brevity - - - - - >8
There, I'm very confident the above will help you Andrei, and for everyone else, can we please bury the nonsense that btrfs is lacking when it comes to repair and recovery tools?
You have scrub which is safe for day to day use, the perfectly safe usebackuproot mount option, and the various "btrfs rescue" commands which are only moderately worrying compared to the practical Russian roulette which is "btrfs check"
Thank you very much for writing this up and posting it, Richard! :)
just for the off-chance, just did:

# btrfs fi sh /
Label: none  uuid: 803c8d41-1a8b-49c7-b912-431df6e9a908
    Total devices 1 FS bytes used 37.19GiB
    devid 1 size 61.00GiB used 60.94GiB path /dev/sdc1

Tw @ 20170226

did:

# btrfs fi usage /;/usr/share/btrfsmaintenance/btrfs-balance.sh;btrfs fi usage /

was quite slow, but subsequently:

# btrfs fi sh /
Label: none  uuid: 803c8d41-1a8b-49c7-b912-431df6e9a908
    Total devices 1 FS bytes used 37.19GiB
    devid 1 size 61.00GiB used 50.97GiB path /dev/sdc1

again happy, but concerned that this happened somewhat suddenly, as I had 7GiB free about a week ago and, since the first install of this system some years ago (perhaps 12.1), only experienced the dreaded full filesystem once.
--
(paka)Patrick Shanahan Plainfield, Indiana, USA @ptilopteri http://en.opensuse.org openSUSE Community Member facebook/ptilopteri Photos: http://wahoo.no-ip.org/gallery2 Registered Linux User #207535 Photos: http://wahoo.no-ip.org/piwigo @ http://linuxcounter.net
On Tuesday, 28 February 2017 14:25:10 CET Patrick Shanahan wrote:
# btrfs fi sh / Label: none uuid: 803c8d41-1a8b-49c7-b912-431df6e9a908 Total devices 1 FS bytes used 37.19GiB devid 1 size 61.00GiB used 50.97GiB path /dev/sdc1
It's very odd that you only have 1 GB unallocated after a balance. More complete info is given by: sudo btrfs fi usage -h /
* nicholas <ndcunliffe@gmail.com> [02-28-17 08:57]:
On Tuesday, 28 February 2017 14:25:10 CET Patrick Shanahan wrote:
# btrfs fi sh / Label: none uuid: 803c8d41-1a8b-49c7-b912-431df6e9a908 Total devices 1 FS bytes used 37.19GiB devid 1 size 61.00GiB used 50.97GiB path /dev/sdc1
It's very odd that you only have 1 GB unallocated after a balance. More complete info is given by: sudo btrfs fi usage -h /
No, 1gb *before* balance, 10gb after.
--
(paka)Patrick Shanahan Plainfield, Indiana, USA @ptilopteri http://en.opensuse.org openSUSE Community Member facebook/ptilopteri Photos: http://wahoo.no-ip.org/gallery2 Registered Linux User #207535 Photos: http://wahoo.no-ip.org/piwigo @ http://linuxcounter.net
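On the devid line of "btrfs fi sh", size is the device size and used is the space already allocated to chunks, so their difference is the unallocated space a balance can hand back. A minimal sketch of that arithmetic with the figures posted above; the function name unallocated_gib is made up for illustration.

```shell
# Unallocated space = device size - space allocated to chunks (both in GiB).
# Hypothetical helper; awk does the floating-point subtraction.
unallocated_gib() {
    awk -v size="$1" -v used="$2" 'BEGIN { printf "%.2f\n", size - used }'
}

unallocated_gib 61.00 60.94   # figures from before the balance
unallocated_gib 61.00 50.97   # figures from after the balance
```

Taken at face value, the posted numbers give about 0.06 GiB unallocated before the balance and about 10 GiB after, while the 37.19GiB of actual data ("FS bytes used") is the same in both outputs.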
Patrick Shanahan [28.02.2017 14:25]:
again happy but concerned that this happened somewhat suddenly as I had 7GiB free about a week ago and since the first install of this system some years ago (perhaps 12.1), only experienced the dreaded full filesystem once.
Mh, yes, don't forget snapper :-\ Unfortunately, I do not know a way to make snapper tell how much space the snapshots gobble up, but about once a month I do a snapper list, and when there are too many snapshots, I delete some manually. Just yesterday I found out that one snapshot held ~40G in .csv files that had been moved to another filesystem in the meantime, but were still on disk because they existed in a snapshot... After deleting 2 snapshots, the disk usage went from 86% to 38%.
Werner
28.02.2017 17:45, Werner Flamme wrote:
Unfortunately, I do not know a way to make snapper tell how much space the snapshots gobble up,
You can see how much exclusive data is held by a subvolume (which is what a snapshot is) with

btrfs qgroup show

This should work at least on new 42.2 installations, which enable quota on btrfs by default. Unfortunately, qgroup does not show you subvolume names; you need to match the group ID (those starting with 0/...) against the subvolume ID from "btrfs subvolume list". A snapper wrapper around it that matches sizes to snapshots would indeed be nice.

"Exclusive data" is the best approximation of "snapshot size" you can get. It shows how much free space you gain when deleting the subvolume^Wsnapshot.
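The matching described above can be sketched with awk. This is an illustration only: the function name qgroup_with_names is made up, and its two arguments are assumed to be files holding saved output of "btrfs subvolume list <path>" and "btrfs qgroup show <path>". It further assumes lines of the form "ID 257 gen ... path <name>" in the first file, "0/257 <rfer> <excl>" in the second, and subvolume paths without spaces.

```shell
# Join level-0 qgroups with subvolume paths.
# $1: file holding "btrfs subvolume list" output
# $2: file holding "btrfs qgroup show" output
# Prints: exclusive size, referenced size, subvolume path (tab separated).
qgroup_with_names() {
    awk '
        # First file: remember the path (last field) for each subvolume ID.
        NR == FNR { if ($1 == "ID") name[$2] = $NF; next }
        # Second file: qgroup 0/<id> belongs to the subvolume with that ID.
        $1 ~ /^0\// {
            split($1, q, "/")
            id = q[2]
            label = (id in name) ? name[id] : "(not listed)"
            printf "%s\t%s\t%s\n", $3, $2, label
        }
    ' "$1" "$2"
}
```

A possible use is to save both outputs as root (for example to subvols.txt and qgroups.txt) and then run qgroup_with_names subvols.txt qgroups.txt. Qgroups whose ID does not appear in the subvolume list are shown as "(not listed)".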
On Tuesday, 28 February 2017 18:39:31 CET Andrei Borzenkov wrote:
28.02.2017 17:45, Werner Flamme wrote:
Unfortunately, I do not know a way to make snapper tell how much space the snapshots gobble up,
You can see how much exclusive data is held by subvolume (which snapshot is) with
btrfs qgroup show
This should work at least on new 42.2 installations that enable quota on btrfs by default. Unfortunately, qgroup does not show you subvolume names, you need to match group ID (those starting with 0/...) against subvolume ID from "btrfs subvolume list". Snapper wrapper around it matching size to snapshots would indeed be nice.
"Exclusive data" is as best approximation to "snapshot size" as you can get. This shows you how much free space you gain when deleting subvolume^Wsnapshot.
For named volumes and human-readable sizes there is a script, btrfs-size.sh, at https://github.com/agronick/btrfs-size/blob/master/btrfs-size.sh
participants (11)
- Andrei Borzenkov
- Carl Hartung
- Carlos E. R.
- jdd
- John Andersen
- nicholas
- Patrick Shanahan
- Richard Brown
- Richmond
- Roger Price
- Werner Flamme