[Bug 1099769] New: btrfs balance generates 100% CPU usage for long periods (15min's +)
http://bugzilla.opensuse.org/show_bug.cgi?id=1099769

Bug ID: 1099769
Summary: btrfs balance generates 100% CPU usage for long periods (15min's +)
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: x86-64
OS: SUSE Other
Status: NEW
Severity: Major
Priority: P5 - None
Component: Kernel
Assignee: kernel-maintainers@forge.provo.novell.com
Reporter: chris@kuta.bid
QA Contact: qa-bugs@suse.de
Found By: Community User
Blocker: ---

System becomes unresponsive for long periods (15 minutes or more). The 'top' command shows a btrfs process using 100% CPU. A previously reported bug is still not fixed IMHO.

* Please advise which logs would be helpful and how to generate them on openSUSE Tumbleweed.

Recently installed and updated openSUSE Tumbleweed. KDE desktop. Kernel 4.17.2-1-default.

Regards

--
You are receiving this mail because:
You are on the CC list for the bug.
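As a rough illustration of one way to document the CPU usage described above (not something requested by the maintainers in this thread), the sketch below reads the cumulative CPU time of btrfs-related tasks from /proc. The function names `parse_stat_line` and `btrfs_cpu_ticks` are hypothetical, and the approach assumes a Linux system with /proc mounted.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, cpu_ticks) from the content of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' rather than on whitespace.
    left, _, right = line.rpartition(")")
    pid_text, _, comm = left.partition("(")
    fields = right.split()
    # fields[0] is the state (field 3 of the stat line), so utime and stime
    # (fields 14 and 15) sit at indexes 11 and 12.
    utime = int(fields[11])
    stime = int(fields[12])
    return int(pid_text), comm, utime + stime


def btrfs_cpu_ticks(proc_root="/proc"):
    """Map (pid, comm) to cumulative CPU ticks for tasks named btrfs*."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, ticks = parse_stat_line(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited or its stat file was unreadable
        # Kernel task names are truncated to 15 characters, which is why
        # the thread shows up as "btrfs-transacti".
        if comm.startswith("btrfs"):
            result[(pid, comm)] = ticks
    return result
```

Calling `btrfs_cpu_ticks()` twice, a few seconds apart, and subtracting the values gives the ticks consumed in that interval; `os.sysconf("SC_CLK_TCK")` gives the number of ticks per second.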
Chris . <chris@kuta.bid> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P5 - None                   |P2 - High
                 CC|                            |chris@kuta.bid
http://bugzilla.opensuse.org/show_bug.cgi?id=1099769#c1

--- Comment #1 from Chris . <chris@kuta.bid> ---
* Affected systems are cold booted daily.
http://bugzilla.opensuse.org/show_bug.cgi?id=1099769#c2

Adam Szyszko <szyszko.86@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |CONFIRMED
                 CC|                            |szyszko.86@gmail.com

--- Comment #2 from Adam Szyszko <szyszko.86@gmail.com> ---
I can confirm this bug. It happened to me this morning as well. btrfs and btrfs-transacti eat 100% CPU for a few minutes. My laptop is unusable during this process.
Pavel Artemyev <a@pavel.bz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |a@pavel.bz
http://bugzilla.opensuse.org/show_bug.cgi?id=1099769#c5

Bruno Friedmann <bruno@ioda-net.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bruno@ioda-net.ch

--- Comment #5 from Bruno Friedmann <bruno@ioda-net.ch> ---
With Tumbleweed snapshot 2018116, kernel 4.19.1.

Hardware is a Dell Precision 7510 laptop with Xeon CPU E3-1535M v5, UEFI boot, 64GB RAM (2400 DDR4).
Primary storage is a Toshiba NVMe 1024GB (gpt):
p1 : 164MB vfat16 (efi)
p2 : 943GB luks encrypted / btrfs
p3 : 2GB luks encrypted swap
Grub is used to decrypt the root btrfs partition.

This morning I discovered a btrfs-transacti running at 100% CPU (started by the btrfs balance timer during the night).

After a reboot, the system is now unusable, with an error due to a timeout while mounting the btrfs root filesystem. Booting a rescue system (same TW snapshot) and trying to mount the fs manually results in the same btrfs-transacti process eating 100% of the CPU. Tonight the mount still had not succeeded after waiting more than 45 minutes.

When the btrfs partition was last mounted (even with usebackuproot), a btrfs balance was running on it. Trying to run btrfs balance cancel /mnt leads to several backtraces in the kernel log (see the attached dmesg capture, if one day I can get it out of the broken fs).

It seems that there is not enough space on the partition, even though df was reporting 163GB free out of the 943GB total of the partition. These figures are from memory, as it is not possible to mount or grab information from the defective volume.

BTW: what is written in the wiki https://en.opensuse.org/SDB:BTRFS
btrfs scrub start /dev/mapper/cr_nvme0n1p2
doesn't work; it fails with an error stating that the filesystem is not mounted. What is the problem, as it should normally be possible to use a device (here the device unlocked with cryptsetup luksOpen /dev/nvmen0p2 cr_nvme0n1p2)?

This is the first time this has happened in the year since the system was installed. The only things that have changed are a lot of new data (up to 85% used) and numerous file and directory cleanups yesterday afternoon.

What is the best possible recovery scenario, and the best forensic approach to debug this nasty behaviour? Any tricks are welcome.

In the meantime I will at least retry to mount it ro and use a btrfs dump.
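On the gap between what df reported and the apparent lack of space: btrfs allocates space in chunks per type (data, metadata, system), so plain df does not show how full each type is. As a hedged sketch, and assuming the volume can be mounted again, the output of `btrfs filesystem df <mountpoint>` could be summarised as below. The function names `parse_fi_df` and `fill_ratio` are hypothetical, and the line format the regular expression expects is an assumption about the tool's usual output rather than something quoted in this report.

```python
import re

# Binary unit suffixes as printed by btrfs-progs (assumed).
UNITS = {
    "B": 1,
    "KiB": 1024,
    "MiB": 1024 ** 2,
    "GiB": 1024 ** 3,
    "TiB": 1024 ** 4,
    "PiB": 1024 ** 5,
}

# Assumed line shape: "Metadata, DUP: total=1.00GiB, used=500.00MiB"
LINE = re.compile(
    r"^(\w+), (\w+): total=([\d.]+)([A-Za-z]+), used=([\d.]+)([A-Za-z]+)$"
)


def parse_fi_df(text):
    """Return {type: {"profile", "total", "used"}} with sizes in bytes."""
    usage = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip anything that does not look like a usage line
        kind, profile, total, total_unit, used, used_unit = match.groups()
        if total_unit not in UNITS or used_unit not in UNITS:
            continue  # unknown suffix, do not guess
        usage[kind] = {
            "profile": profile,
            "total": float(total) * UNITS[total_unit],
            "used": float(used) * UNITS[used_unit],
        }
    return usage


def fill_ratio(entry):
    """Fraction of the allocated chunks of one type that is in use."""
    return entry["used"] / entry["total"] if entry["total"] else 0.0
```

A metadata fill ratio close to 1.0, while data still has room, would be one thing to look for; whether that is what happened on this volume cannot be told from the report.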
Alan Prescott <alanjprescott@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alanjprescott@gmail.com