[Bug 1111414] New: snapper's background comparison makes the system unusable
http://bugzilla.suse.com/show_bug.cgi?id=1111414

            Bug ID: 1111414
           Summary: snapper's background comparison makes the system unusable
    Classification: openSUSE
           Product: openSUSE Tumbleweed
           Version: Current
          Hardware: Other
                OS: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: YaST2
          Assignee: yast2-maintainers@suse.de
          Reporter: fvogt@suse.com
        QA Contact: jsrain@suse.com
          Found By: ---
           Blocker: ---

Multiple times the system was slowed down by snapper, which used up most of the I/O capacity of the hard drive and made the system unusable. strace showed that it was busy diffing kernel modules and sources.

This could be improved massively by using the output of btrfs send to compare snapshots.

--
You are receiving this mail because:
You are on the CC list for the bug.
Fabian Vogt <fvogt@suse.com> changed:

What | Removed | Added
----------------------------------------------------------------------------
CC   |         | iforster@suse.com, rbrown@suse.com
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c1

Steffen Winterfeldt <snwint@suse.com> changed:

What  | Removed | Added
----------------------------------------------------------------------------
CC    |         | aschnell@suse.com
Flags |         | needinfo?(aschnell@suse.com)

--- Comment #1 from Steffen Winterfeldt <snwint@suse.com> ---
Arvin, could it?
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c2

Arvin Schnell <aschnell@suse.com> changed:

What  | Removed                      | Added
----------------------------------------------------------------------------
Flags | needinfo?(aschnell@suse.com) |

--- Comment #2 from Arvin Schnell <aschnell@suse.com> ---
Snapper does already use btrfs send.
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c3

--- Comment #3 from Fabian Vogt <fvogt@suse.com> ---
(In reply to Arvin Schnell from comment #2)
> Snapper does already use btrfs send.

Then I wonder why snapper read the files from the snapshot and compared them manually. Is it just to determine whether a pre-post pair is empty and should be cleaned up?
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c4

Stefan Hundhammer <shundhammer@suse.com> changed:

What       | Removed                   | Added
----------------------------------------------------------------------------
Component  | YaST2                     | Kernel
Assignee   | yast2-maintainers@suse.de | kernel-maintainers@forge.provo.novell.com
QA Contact | jsrain@suse.com           | qa-bugs@suse.de

--- Comment #4 from Stefan Hundhammer <shundhammer@suse.com> ---
I am pretty sure that it's the Btrfs subsystem in the kernel that causes this load; nothing that we can fix on the Snapper side (which is NOT bug component YaST to begin with).
Takashi Iwai <tiwai@suse.com> changed:

What     | Removed                                   | Added
----------------------------------------------------------------------------
CC       |                                           | tiwai@suse.com
Assignee | kernel-maintainers@forge.provo.novell.com | jeffm@suse.com
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c5

--- Comment #5 from Jeff Mahoney <jeffm@suse.com> ---
It's premature to call this a btrfs issue. The reporter claims that it was doing a manual comparison. Comment #2 just dismisses this. This needs to be reconciled and the workload needs to be described before we'll look at this as a file system issue.
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c6

--- Comment #6 from Arvin Schnell <aschnell@suse.com> ---
If btrfs send fails, snapper falls back to manual comparison. The code in snapper was added more than five years ago; maybe it no longer works due to btrfs API changes (there is already conditional code depending on the version of libbtrfs). /var/log/snapper.log shows what is going on.

Anyway, even manual comparison should not make the system unusable.
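The fallback described in comment #6 can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not snapper's actual code (snapper is written in C++). The names `compare_snapshots`, `manual_compare`, `FastCompareError`, and the `fast_compare` callback are hypothetical stand-ins for the btrfs-send-based path and the file-by-file path, and the sketch only handles regular files.

```python
import filecmp
import logging
import os

log = logging.getLogger("compare-sketch")


class FastCompareError(Exception):
    """Hypothetical error: the fast path could not produce a result."""


def manual_compare(old_root, new_root):
    """Slow path: walk both trees and compare file contents byte by byte."""
    changed = set()
    seen = set()
    # Walk both directions so files present on only one side are reported too.
    for root, other in ((old_root, new_root), (new_root, old_root)):
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                if rel in seen:
                    continue
                seen.add(rel)
                here = os.path.join(root, rel)
                there = os.path.join(other, rel)
                # shallow=False forces a full read of both files.
                if not os.path.isfile(there) or not filecmp.cmp(here, there, shallow=False):
                    changed.add(rel)
    return sorted(changed)


def compare_snapshots(old_root, new_root, fast_compare):
    """Try the fast path first; log and fall back to the slow path on failure."""
    try:
        return sorted(fast_compare(old_root, new_root))
    except FastCompareError as exc:
        log.warning("fast comparison failed (%s); using manual comparison", exc)
        return manual_compare(old_root, new_root)
```

With this structure, a fast path that fails every time turns each comparison into a full read of both trees, which would be consistent with the I/O load in the original report. The warning in the log is then the only visible sign that the fallback is in use.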
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c7

--- Comment #7 from Arvin Schnell <aschnell@suse.com> ---
Funny, just now someone reports that the btrfs send used in snapper does not work:
https://github.com/openSUSE/snapper/pull/438
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c8

Jeff Mahoney <jeffm@suse.com> changed:

What     | Removed        | Added
----------------------------------------------------------------------------
CC       |                | dsterba@suse.com, jeffm@suse.com
Assignee | jeffm@suse.com | aschnell@suse.com

--- Comment #8 from Jeff Mahoney <jeffm@suse.com> ---
That looks like a pretty good candidate. If this issue still exists with that patch applied, I'll dig a little deeper.

CC'ing Dave for the heads up of how this ultimately was a user-visible change that broke userspace.
Arvin Schnell <aschnell@suse.com> changed:

What     | Removed           | Added
----------------------------------------------------------------------------
Priority | P5 - None         | P2 - High
URL      |                   | https://trello.com/c/LzTx1c2d/2609-tw-1111414
Assignee | aschnell@suse.com | yast-internal@suse.de
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c9

--- Comment #9 from Arvin Schnell <aschnell@suse.com> ---
I added a card to the YaST trello task board so that the issue is prioritised with the other tasks.
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c10

Arvin Schnell <aschnell@suse.com> changed:

What   | Removed | Added
----------------------------------------------------------------------------
Status | NEW     | CONFIRMED
Flags  |         | needinfo?(dsterba@suse.com)

--- Comment #10 from Arvin Schnell <aschnell@suse.com> ---
I had the chance to look at the issue now.

First thing to notice is that SLE15 is also affected while SLE12 SP4 is not affected.

AFAIS the change leading to the problem is in libbtrfs (4.11), but not even the patchlevel of libbtrfs was increased. On the other hand the patch for snapper should also work with the older libbtrfs.

David, can you please confirm this?
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c11

--- Comment #11 from Fabian Vogt <fvogt@suse.com> ---
(In reply to Arvin Schnell from comment #10)
> I had the chance to look at the issue now.
>
> First thing to notice is that SLE15 is also affected while SLE12 SP4 is not affected.
>
> AFAIS the change leading to the problem is in libbtrfs (4.11), but not even the patchlevel of libbtrfs was increased. On the other hand the patch for snapper should also work with the older libbtrfs.

AFAICT that's also what the comment on the PR (https://github.com/openSUSE/snapper/pull/438#issuecomment-429358392) says. So is there still a reason to not merge it?

> David, can you please confirm this?
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c12

Arvin Schnell <aschnell@suse.com> changed:

What       | Removed | Added
----------------------------------------------------------------------------
Depends on |         | 1049574

--- Comment #12 from Arvin Schnell <aschnell@suse.com> ---
Well, for one thing, a confirmation would be good. Also, as I already wrote on GitHub, enabling it now could cause regressions. After all, this was never tested in the SLE15 code base (and MUs are requested for SLE15), so real testing is needed. That this is not just an abstract concern is shown by bug #1049574, which from my point of view must be fixed first.
Jeff Mahoney <jeffm@suse.com> changed:

What | Removed | Added
----------------------------------------------------------------------------
CC   |         | nborisov@suse.com
Arvin Schnell <aschnell@suse.com> changed:

What   | Removed   | Added
----------------------------------------------------------------------------
Status | CONFIRMED | IN_PROGRESS
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c13

--- Comment #13 from Arvin Schnell <aschnell@suse.com> ---
PR: https://github.com/openSUSE/snapper/pull/472
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c14

Arvin Schnell <aschnell@suse.com> changed:

What       | Removed     | Added
----------------------------------------------------------------------------
Status     | IN_PROGRESS | RESOLVED
Resolution | ---         | FIXED

--- Comment #14 from Arvin Schnell <aschnell@suse.com> ---
SR for openSUSE-Factory: https://build.opensuse.org/request/show/667835

Closing as fixed.
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c15

--- Comment #15 from Arvin Schnell <aschnell@suse.com> ---
SR for SLE15-SP1: https://build.suse.de/request/show/182327
MU for SLE15: https://build.suse.de/request/show/182435
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c16

--- Comment #16 from Swamp Workflow Management <swamp@suse.de> ---
SUSE-RU-2019:0255-1: An update that has two recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1049574,1111414
CVE References:
Sources used:
SUSE Linux Enterprise Module for Basesystem 15 (src): snapper-0.5.6-5.7.1
http://bugzilla.suse.com/show_bug.cgi?id=1111414#c17

--- Comment #17 from Swamp Workflow Management <swamp@suse.de> ---
openSUSE-RU-2019:0187-1: An update that has two recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1049574,1111414
CVE References:
Sources used:
openSUSE Leap 15.0 (src): snapper-0.5.6-lp150.3.9.1