http://bugzilla.opensuse.org/show_bug.cgi?id=1183990

            Bug ID: 1183990
           Summary: I think I found the cause of a kernel lock when
                    attempting hibernation
    Classification: openSUSE
           Product: openSUSE Distribution
           Version: Leap 15.2
          Hardware: Other
                OS: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Kernel
          Assignee: kernel-bugs@opensuse.org
          Reporter: carlos.e.r@opensuse.org
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

Hi,

For months I have been experiencing kernel "trouble" in _some_ of my attempts to hibernate.

Sometimes the hibernation would stall and not proceed. Issuing "systemctl hibernate" replied that there was one in progress, but there was no progress. I would attempt to halt the machine, but this would also stall at some point. If I pressed ctrl-alt-del several times, fast, I would see the message that it had detected the keys seven times and would halt immediately, but it did not. I had no way out but to hit the power switch, and suffer the long fsck the next morning.

Nothing in the logs whatsoever.

Well, one day I noticed in "atop" that one of my disks went to 100% busy when this happened. So I left another instance of gkrellm running, displaying the i/o state of all my partitions on /dev/sdc, and experimenting with "sync" I noticed it was sdc9 which was active, at something like 400 Kbps. I noticed that "sync" would sometimes take a minute to complete.

/dev/sdc9 is a reiserfs, and has several bind mounts:

    /dev/sdc9 on /data/Lareiserfs type reiserfs (rw,relatime,lazytime,user_xattr,acl)

bind mounts:

    /dev/sdc9 on /data/homedvl type reiserfs (rw,relatime,lazytime,user_xattr,acl)
    /dev/sdc9 on /usr/share/flightgear type reiserfs (rw,relatime,lazytime,user_xattr,acl)
    /dev/sdc9 on /var/spool/news type reiserfs (rw,relatime,lazytime,user_xattr,acl)
    /dev/sdc9 on /home/cer/terrasync type reiserfs (rw,relatime,lazytime,user_xattr,acl)
    /dev/sdc9 on /usr/src type reiserfs (rw,relatime,lazytime,user_xattr,acl)

Now, the directory that is active is "/var/spool/news". I use the leafnode nntp proxy server. It contains 1.2 million files in about 3 GB of space.

I found that if I run this sequence:

    time sync
    time sync && systemctl hibernate

the machine hibernates successfully - 13 days so far, a record.

Most days it takes a minute to sync, but one day I noticed it took several minutes. Why? Well, it happened that "/usr/sbin/texpire" was working at the same time (a cronjob triggers it). This task runs for about half an hour daily expunging old posts, meaning it examines a million files.

Maybe with this (tentative) report you can improve the kernel response so that it doesn't stall when trying to hibernate, presumably when taking too long to sync. I would think that the kernel should stop tasks before doing the sync :-?

At least, by running sync manually I can detect the situation and kill the busy task before suffering the crash.

On the other hand, maybe there is an issue in reiserfs with "lazytime" (which is the default), delaying the writes of "something" till forced to. My wild guess is that it delays the timestamp that registers that a file was touched. Each time I read a post, or Thunderbird scans a post, the timestamp (sorry, I don't remember which exact timestamp it is) is updated, but the update is not actually written to disk; it is delayed "forever".

I use reiserfs for this mount because in theory it should work better than others with millions of small files.
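
The manual "sync, then hibernate" sequence could be wrapped in a small script so that a slow sync is noticed before hibernation is attempted. The following is only a sketch under assumptions: the function names, the 120 second limit and the HIBERNATE_CMD override are invented for illustration and are not part of the setup described above. The "Dirty:" value is read from /proc/meminfo.

```shell
#!/bin/bash
# Sketch: flush dirty data first, then hibernate only if the flush was quick.
# Assumptions (not from the report): function names, the 120 s default limit,
# and the HIBERNATE_CMD variable used to override the hibernate command.

# Print the amount of dirty page-cache data in kB.
# Reads /proc/meminfo unless another file is given as $1.
dirty_kb() {
    awk '/^Dirty:/ {print $2}' "${1:-/proc/meminfo}"
}

# Run sync and print how many whole seconds it took.
timed_sync() {
    local start end
    start=$(date +%s)
    sync
    end=$(date +%s)
    echo $(( end - start ))
}

# Sync twice, as in the manual sequence. If the first sync took longer
# than the limit (in seconds), do not hibernate and return non-zero.
safe_hibernate() {
    local limit=${1:-120} took
    took=$(timed_sync)
    echo "first sync: ${took}s, dirty now: $(dirty_kb) kB"
    if [ "$took" -gt "$limit" ]; then
        echo "sync took longer than ${limit}s; not hibernating" >&2
        return 1
    fi
    sync && ${HIBERNATE_CMD:-systemctl hibernate}
}
```

After sourcing the file, "safe_hibernate 120" would be run in place of the two manual commands. When it refuses to hibernate, that is the moment to look for a busy writer such as texpire before trying again.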