[Bug 1191929] New: NVMe SSD slow since openSUSE-LEAP-15.3 (especially with cryptsetup / dm-crypt device)
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929

            Bug ID: 1191929
           Summary: NVMe SSD slow since openSUSE-LEAP-15.3 (especially with cryptsetup / dm-crypt device)
    Classification: openSUSE
           Product: openSUSE Distribution
           Version: Leap 15.3
          Hardware: x86-64
                OS: openSUSE Leap 15.3
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Kernel
          Assignee: kernel-bugs@opensuse.org
          Reporter: duge@pre-sense.de
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

I have a system with an NVMe SSD whose performance with dm-crypt dropped from more than 200 MB/s to below 2 MB/s between openSUSE 15.2 and openSUSE 15.3.

Test procedure:

    dd oflag=direct bs="$((64 * 1024 * 1024))" count="$((16 * 1024 / 64))" if=/dev/zero of=/dev/nvme0n1
    dd oflag=sync bs="$((64 * 1024 * 1024))" count="$((16 * 1024 / 64))" if=/dev/zero of=/dev/nvme0n1
    cryptsetup --cipher "aes-cbc-essiv:sha256" create crypt_test /dev/nvme0n1
    dd oflag=direct bs="$((64 * 1024 * 1024))" count="$((16 * 1024 / 64))" if=/dev/zero of=/dev/mapper/crypt_test
    dd oflag=sync bs="$((64 * 1024 * 1024))" count="$((16 * 1024 / 64))" if=/dev/zero of=/dev/mapper/crypt_test

With oflag=direct, openSUSE 15.2 and 15.3 both write at over 800 MB/s to /dev/nvme0n1 and at over 300 MB/s to /dev/mapper/crypt_test.

With oflag=sync:
openSUSE 15.2 writes at over 600 MB/s to /dev/nvme0n1 and at over 200 MB/s to /dev/mapper/crypt_test.
openSUSE 15.3 writes at 100 MB/s to /dev/nvme0n1 and at less than 2 MB/s to /dev/mapper/crypt_test.

Finally, "mkfs.ext4 /dev/mapper/crypt_test" takes more than 20 minutes, and "iotop" shows less than 1 MB/s of I/O!

Tumbleweed-x86_64-Snapshot20211019 (Linux-5.14.11) performs like openSUSE 15.2 or even better.
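For orientation, the arithmetic behind the dd parameters in the test procedure can be sketched as below. The helper names dd_total_bytes and seconds_at_rate are made up for this illustration and are not part of the report. The sketch also assumes that the MB/s figures are decimal (1 MB = 1000 * 1000 bytes), which is how GNU dd prints its rates.

```shell
#!/bin/bash
# Hypothetical helpers for illustration only; not part of the original report.

# Total bytes written by one dd run: block size times block count.
dd_total_bytes() {
    local bs="$1" count="$2"
    echo "$(( bs * count ))"
}

# Whole seconds (rounded down) needed to write a number of bytes
# at a rate given in MB/s, with 1 MB = 1000 * 1000 bytes.
seconds_at_rate() {
    local bytes="$1" mb_per_s="$2"
    echo "$(( bytes / (mb_per_s * 1000 * 1000) ))"
}

bs="$((64 * 1024 * 1024))"                 # 64 MiB per block, as in the report
count="$((16 * 1024 / 64))"                # 256 blocks
total="$(dd_total_bytes "$bs" "$count")"   # 16 GiB in total

echo "total bytes: $total"
echo "at 200 MB/s: $(seconds_at_rate "$total" 200) s"
echo "at   2 MB/s: $(seconds_at_rate "$total" 2) s"
```

Each dd run therefore writes 16 GiB. At roughly 200 MB/s that takes about a minute and a half, while at 2 MB/s a single run would need well over two hours.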
localhost:~ # grep PRETTY /etc/os-release
PRETTY_NAME="openSUSE Leap 15.3"

localhost:~ # lsblk /dev/nvme0n1
NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0    0 238,5G  0 disk

localhost:~ # lspci -v -d 1344:5410
04:00.0 Non-Volatile memory controller: Micron Technology Inc Device 5410 (rev 01) (prog-if 02 [NVM Express])
        Subsystem: Micron Technology Inc Device 0100
        Flags: bus master, fast devsel, latency 0, IRQ 16, NUMA node 0
        Memory at a2100000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [80] Express Endpoint, MSI 00
        Capabilities: [d0] MSI-X: Enable+ Count=32 Masked-
        Capabilities: [e0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [f8] Power Management version 3
        Capabilities: [100] Vendor Specific Information: ID=1556 Rev=1 Len=008 <?>
        Capabilities: [108] Latency Tolerance Reporting
        Capabilities: [110] L1 PM Substates
        Capabilities: [200] Advanced Error Reporting
        Capabilities: [300] #19
        Kernel driver in use: nvme
        Kernel modules: nvme

localhost:~ # smartctl -a /dev/nvme0n1
[...]
=== START OF INFORMATION SECTION ===
Model Number:                       Micron_2200_MTFDHBA256TCK
[...]
Firmware Version:                   P1MU003
PCI Vendor/Subsystem ID:            0x1344
IEEE OUI Identifier:                0x00a075
Controller ID:                      0
Number of Namespaces:               1
Namespace 1 Size/Capacity:          256.060.514.304 [256 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            00a075 012cd592d5
[...]

--
You are receiving this mail because:
You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c1
--- Comment #1 from Moritz Duge
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c2
Till Dörges
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c3
Alex Mantel
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c4
Daniel Wagner
https://en.opensuse.org/openSUSE:Kernel_git
Any thoughts on this?
I didn't see any regressions; as you said, it's a specific HW combination which shows this problem. It is highly appreciated that you are trying to bisect the problem. Thanks!
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c5
--- Comment #5 from Alex Mantel
(In reply to Alex Mantel from comment #3)
https://en.opensuse.org/openSUSE:Kernel_git
Any thoughts on this?
I didn't see any regressions; as you said, it's a specific HW combination which shows this problem. It is highly appreciated that you are trying to bisect the problem. Thanks!
Hello Daniel, maybe I have a different understanding of the term "regression". The specific NVMe SSD ran at over 800 MB/s on openSUSE Leap 15.2 and now runs at about 100 MB/s on Leap 15.3. I'd say it's clearly a regression, since the SSD turned slow with Leap 15.3's default kernel, as stated by @duge in comment #1. Isn't that a regression? Thanks, Alex
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c6
--- Comment #6 from Daniel Wagner
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c7
--- Comment #7 from Alex Mantel
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c8
--- Comment #8 from Moritz Duge
Hello Dominik, thanks for clarifying! I successfully bisected the commit [...]
For convenience: https://github.com/SUSE/kernel/commit/081d038020fa8cefe6d52e6b48464638c3d62f...
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c9
--- Comment #9 from Alex Mantel
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c10
Daniel Wagner
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c11
--- Comment #11 from Alex Mantel
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c12
--- Comment #12 from Daniel Wagner
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c13
--- Comment #13 from Moritz Duge
Hi Alex, Hannes should be back next week. Hopefully he is able to point to the missing upstream fix. Currently, I don't have any free cycles to investigate. Sorry.
It looks like upstream (kernel.org) simply reverted that commit a few months later. So SUSE should probably apply that revert commit as well!
Original upstream commit (2020-07-01): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
Upstream revert (2021-01-27): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
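One way to check a kernel tree for such a revert is to search the commit messages for the line that "git revert" writes by default ("This reverts commit <sha>."). The sketch below assumes a local clone of the tree in question. The function name find_reverts and its arguments are hypothetical, and a revert whose message was rewritten by hand would not be found this way.

```shell
#!/bin/bash
# Hypothetical helper: list commits in a repository whose message says
# they revert the given commit. Relies on the default message written
# by `git revert` ("This reverts commit <full sha>.").
find_reverts() {
    local repo="$1" sha="$2"
    git -C "$repo" log --oneline --grep="This reverts commit $sha"
}

# Usage (path and commit id are placeholders, not values from this bug):
#   find_reverts /path/to/linux <sha-of-original-commit>
```

An empty result only shows that no commit with the default revert message exists on the checked-out branch. It does not prove that the change is still in effect.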
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c14
Daniel Wagner
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c15
--- Comment #15 from Moritz Duge
[...] Sorry. I'll drop the offending patch from our kernels.
Thanks a lot for your help!
Thank you too! Can you give an estimate of when the update will be available for openSUSE Leap 15.3? Alex and I have a lot of work maintaining custom kernels for our systems, so a quick update would really help us!
http://bugzilla.opensuse.org/show_bug.cgi?id=1191929#c16
--- Comment #16 from Daniel Wagner