Bug ID | 1165535 |
---|---|
Summary | System lockup under memory pressure using encrypted swap on SSD (failed command: WRITE FPDMA QUEUED) |
Classification | openSUSE |
Product | openSUSE Distribution |
Version | Leap 15.1 |
Hardware | Other |
OS | Other |
Status | NEW |
Severity | Major |
Priority | P5 - None |
Component | Kernel |
Assignee | kernel-maintainers@forge.provo.novell.com |
Reporter | Ulrich.Windl@rz.uni-regensburg.de |
QA Contact | qa-bugs@suse.de |
Found By | --- |
Blocker | --- |
On a system that uses an SSD for /boot and encrypted swap only (the rest is on a traditional hard disk), I had the following problem: I was working normally with the system, when I decided to launch "yast2 online_update". Unfortunately the system became terribly slow (practically not responding for minutes), so I could not do anything than press the Reset-button after having waited about 30 minutes. After reboot I see "odd" ATA errors related to the SSD. However the SSD is rather new and in good condition. So I guess the issue has to do with memory pressure. Here are the logs when things started to become slow: Mar 03 10:09:57 pc dbus-daemon[2706]: [session uid=1000 pid=2706] Successfully activated service 'org.gnome.evince.Daemon' Mar 03 10:09:57 pc systemd[2686]: Starting Evince document viewer... Mar 03 10:09:57 pc systemd[2686]: Started Evince document viewer. Mar 03 10:10:13 pc kernel: ata2: log page 10h reported inactive tag 5 Mar 03 10:10:14 pc kernel: ata2.00: exception Emask 0x1 SAct 0x7fffffdf SErr 0x0 action 0x0 Mar 03 10:10:14 pc kernel: ata2.00: irq_stat 0x40000009 Mar 03 10:10:14 pc kernel: ata2.00: failed command: WRITE FPDMA QUEUED Mar 03 10:10:14 pc kernel: ata2.00: cmd 61/08:00:e0:b0:7d/00:00:00:00:00/40 tag 0 ncq dma 4096 out res 40/00:00:e0:b0:7d/00:00:00:00:00/40 Emask 0x1 (device error) Mar 03 10:10:14 pc kernel: ata2.00: status: { DRDY } Mar 03 10:10:14 pc kernel: ata2.00: failed command: WRITE FPDMA QUEUED Mar 03 10:10:14 pc kernel: ata2.00: cmd 61/08:08:e8:b0:7d/00:00:00:00:00/40 tag 1 ncq dma 4096 out res 40/00:00:e0:b0:7d/00:00:00:00:00/40 Emask 0x1 (device error) Mar 03 10:10:14 pc kernel: ata2.00: status: { DRDY } Mar 03 10:10:14 pc kernel: ata2.00: failed command: WRITE FPDMA QUEUED Mar 03 10:10:14 pc kernel: ata2.00: cmd 61/08:10:f0:b0:7d/00:00:00:00:00/40 tag 2 ncq dma 4096 out res 40/00:00:e0:b0:7d/00:00:00:00:00/40 Emask 0x1 (device error) Mar 03 10:10:14 pc kernel: ata2.00: status: { DRDY } [...] Mar 03 10:10:14 pc kernel: ata2.00: failed command: WRITE FPDMA QUEUED Mar 03 10:10:14 pc kernel: ata2.00: cmd 61/08:f0:d8:b0:7d/00:00:00:00:00/40 tag 30 ncq dma 4096 out res 40/00:00:e0:b0:7d/00:00:00:00:00/40 Emask 0x1 (device error) Mar 03 10:10:14 pc kernel: ata2.00: status: { DRDY } Mar 03 10:10:14 pc kernel: ata2.00: supports DRM functions and may not be fully accessible Mar 03 10:10:14 pc kernel: ata2.00: supports DRM functions and may not be fully accessible Mar 03 10:10:14 pc kernel: ata2.00: configured for UDMA/133 Mar 03 10:10:14 pc kernel: ata2: EH complete Mar 03 10:10:14 pc kernel: ata2.00: Enabling discard_zeroes_data Mar 03 10:10:14 pc rsyslogd[1332]: message repeated 13 times: [-- MARK --] Mar 03 10:10:14 pc rsyslogd[1332]: action 'action 1' suspended (module 'builtin:ompipe'), retry 0. There should be messages before this one giving the reason for suspension. [v8.33.1 try http://www.rsyslog.com/e/2007 ] Mar 03 10:10:14 pc rsyslogd[1332]: action 'action 1' suspended (module 'builtin:ompipe'), next retry is Tue Mar 3 10:10:44 2020, retry nbr 0. There should be messages before this one giving the reason for suspension. [v8.33.1 try http://www.rsyslog.com/e/2007 ] [...] Mar 03 10:15:48 pc /usr/lib/gdm/gdm-x-session[2700]: (II) event2 - Logitech USB-PS/2 Optical Mouse: SYN_DROPPED event - some input events have been lost. Mar 03 10:16:17 pc /usr/lib/gdm/gdm-x-session[2700]: (II) event2 - Logitech USB-PS/2 Optical Mouse: SYN_DROPPED event - some input events have been lost. [...] Mar 03 10:25:13 pc systemd-journald[30532]: Journal started Mar 03 10:25:18 pc systemd-journald[30532]: System journal (/var/log/journal/89c660865c00403a9bacef32b6828556) is 856.0M, max 819.2M, 0B free. [...] Mar 03 10:25:19 pc kernel: ata2.00: exception Emask 0x50 SAct 0x7f000007 SErr 0x800 action 0x6 frozen Mar 03 10:25:19 pc kernel: ata2.00: irq_stat 0x08000000, interface fatal error Mar 03 10:25:19 pc kernel: ata2: SError: { HostInt } Mar 03 10:25:19 pc kernel: ata2.00: failed command: WRITE FPDMA QUEUED Mar 03 10:25:19 pc kernel: ata2.00: cmd 61/c8:00:38:06:78/01:00:00:00:00/40 tag 0 ncq dma 233472 out res 40/00:c0:00:00:72/00:00:00:00:00/40 Emask 0x50 (ATA bus error) Mar 03 10:25:19 pc kernel: ata2.00: status: { DRDY } [...] Mar 03 10:25:19 pc kernel: ata2.00: failed command: WRITE FPDMA QUEUED Mar 03 10:25:19 pc kernel: ata2.00: cmd 61/40:f0:f8:00:78/05:00:00:00:00/40 tag 30 ncq dma 688128 out res 40/00:c0:00:00:72/00:00:00:00:00/40 Emask 0x50 (ATA bus error) Mar 03 10:25:19 pc kernel: ata2.00: status: { DRDY } Mar 03 10:25:19 pc kernel: ata2: hard resetting link Mar 03 10:25:19 pc kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300) Mar 03 10:25:19 pc kernel: ata2.00: supports DRM functions and may not be fully accessible Mar 03 10:25:19 pc kernel: ata2.00: supports DRM functions and may not be fully accessible Mar 03 10:25:19 pc kernel: ata2.00: configured for UDMA/133 Mar 03 10:25:19 pc kernel: ata2: EH complete Mar 03 10:25:19 pc kernel: ata2.00: Enabling discard_zeroes_data [...] Mar 03 10:25:19 pc systemd-coredump[30527]: Process 28448 (systemd-journal) of user 0 dumped core. Mar 03 10:25:19 pc systemd-coredump[30527]: Coredump diverted to /var/lib/systemd/coredump/core.systemd-journal.0.92bd8e3e91344ef998c2c08ade695cd2.28448.1583227313000000.lz4 Mar 03 10:25:19 pc systemd-coredump[30527]: Stack trace of thread 28448: Mar 03 10:25:19 pc systemd-coredump[30527]: #0 0x00007f52360114f5 journal_file_move_to_object (libsystemd-shared-234.so) Mar 03 10:25:19 pc systemd-coredump[30527]: #1 0x00007f5236012f00 n/a (libsystemd-shared-234.so) Mar 03 10:25:19 pc systemd-coredump[30527]: #2 0x00007f5236013548 n/a (libsystemd-shared-234.so) Mar 03 10:25:19 pc systemd-coredump[30527]: #3 0x00007f523601470d journal_file_append_entry (libsystemd-shared-234.so) [...] Mar 03 10:25:54 pc systemd-journald[30532]: Journal stopped Mar 03 10:26:38 pc systemd-journald[30532]: Received SIGTERM from PID 1 (systemd). Mar 03 10:26:38 pc systemd-journald[30545]: Journal started Mar 03 10:26:38 pc systemd-journald[30545]: System journal (/var/log/journal/89c660865c00403a9bacef32b6828556) is 856.0M, max 819.2M, 0B free. [...] >From this point on I did not see any more ATA errors, but the system was still unusable, so I hit Reset: Mar 03 10:55:58 pc kernel: SFW2-INext-DROP-DEFLT IN=eth0 OUT= MAC=00:22:11:dd:00:11:a0:36:9f:b2:3f:25:08:00 SRC=1.9.154.170 1.9.5.2 LEN=44 TOS=0x00 PREC=0x00 TTL=5 6 ID=52696 PROTO=TCP SPT=47944 DPT=655 WINDOW=1024 RES=0x00 SYN URGP=0 OPT (020405B4) -- Reboot -- Mar 03 11:05:24 linux-abcd kernel: Linux version 4.12.14-lp151.28.36-default (geeko@buildhost) (gcc version 7.4.1 20190905 [gcc-7-branch revision 275407] (SUSE Linux) ) #1 SMP Fri Dec 6 13:50:27 UTC 2019 (8f4a495) Final note: The IP address and related identifyable information in the log has been anonymized for privacy.