[Bug 879792] New: Hibernate freezes the PC when saving image data page
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c0

Summary: Hibernate freezes the PC when saving image data page
Classification: openSUSE
Product: openSUSE 13.1
Version: Final
Platform: i686
OS/Version: openSUSE 13.1
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: aosalazari@gmail.com
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Created an attachment (id=591928) --> (http://bugzilla.novell.com/attachment.cgi?id=591928)
The relevant section of the messages file and pm-suspend.log

User-Agent: Mozilla/5.0 (X11; Linux i686; rv:29.0) Gecko/20100101 Firefox/29.0

Since upgrading the kernel to 3.11.10-11, hibernate causes the PC to freeze, forcing a power switch reset. The behaviour is consistent.

Working kernel: 3.11.10-7
Failing kernel: 3.11.10-11

The problem is consistent and easily reproducible.

Reproducible: Always

Steps to Reproduce:
1. Just have the KDE desktop on and no other applications
2. Select Leave-Hibernate
3. PC freezes

Actual Results:
s2disk: System snapshot ready. Preparing to write
s2disk: Image size: 423188 kilobytes
s2disk: Free swap 2103284 kilobytes
s2disk: Saving 105872 image data pages (press backspace to abort)...(0%)
<FREEZE>

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c1
Takashi Iwai
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c2
Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c3
Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c4
Takashi Iwai
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c5
--- Comment #5 from Takashi Iwai
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c6
--- Comment #6 from Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c7
--- Comment #7 from Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c8
--- Comment #8 from Takashi Iwai
results of the alternative methods for hibernating:
1. echo disk > /sys/power/state: no messages are displayed on the console; upon restarting, the grub menu is shown and the last state is restored
2. powersave -U: messages are displayed on the console; no grub menu or login prompt is shown, and the last state is restored
3. pm-hibernate: behaves the same as powersave -U
So, all these three manual methods did resume after S4 successfully?
Additionally, I enabled kdump and that causes hibernate to work as expected. Disabling kdump reverts to the buggy state.
Do you mean S4 breaks with the methods above as well once kdump is disabled? Or does it only affect S4 from the KDE menu?
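Before comparing the manual methods, it can help to confirm that the kernel advertises S4 at all. The file /sys/power/state, written to in method 1, lists the supported sleep states when read. The following is only a minimal sketch: the function name supports_sleep_state and its optional file argument are assumptions made for illustration, not part of any existing tool.

```shell
# Report whether a sleep state (default "disk", i.e. S4) is listed in a
# state file (default /sys/power/state, the file used in method 1).
# supports_sleep_state is a hypothetical helper name.
# Returns 0 if listed, 1 if not listed, 2 if the file cannot be read.
supports_sleep_state() {
    state_file="${1:-/sys/power/state}"
    wanted="${2:-disk}"
    [ -r "$state_file" ] || return 2
    # The file holds a space-separated list of states on one line.
    for s in $(cat "$state_file"); do
        if [ "$s" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}
```

Calling supports_sleep_state with no arguments checks the running system for "disk"; a non-zero result would suggest that the problem lies before any of the three methods is even tried.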
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c9
--- Comment #9 from Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c10
Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c11
--- Comment #11 from Takashi Iwai
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c12
--- Comment #12 from Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c13
--- Comment #13 from Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c14
--- Comment #14 from Takashi Iwai
I have re-run the tests with the addition of using the
command. the results are the same
Do you mean that 3.11.10-7 worked and 3.11.10-11 was broken?
and the results of <uname> are:
[quote]
linux-ecrt-aos:/home/adolfo # uname -a
Linux linux-ecrt-aos 3.11.10-11-desktop #1 SMP PREEMPT Mon May 12 13:37:06 UTC 2014 (3d22b5f) i686 i686 i386 GNU/Linux
linux-ecrt-aos:/home/adolfo #
linux-ecrt-aos:/home/adolfo # uname -a
Linux linux-ecrt-aos 3.11.10-7-desktop #1 SMP PREEMPT Mon Feb 3 09:41:24 UTC 2014 (750023e) i686 i686 i386 GNU/Linux
linux-ecrt-aos:/home/adolfo #
[/quote]
Are there any other tests I can do to confirm that this is a kernel regression?
Nothing can ensure it 100%. But if one kernel always works and the other always fails over a few test runs, it's almost certain.
BTW how can I submit multiple file attachments?
Use tar or zip. Or just attach multiple times.
Now, how to proceed. Since it's apparently a regression between two 3.11.x.y kernels, and the bug is easily reproducible, the best approach would be to find the regression with git bisect.
If you need to know how to build a kernel and perform git bisect, I can give you brief instructions. The expanded git tree is available at git@gitorious.org:opensuse/kernel.git and the bisection range is b73e414b..1cb3e62f.
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c15
--- Comment #15 from Adolfo Salazar
(In reply to comment #12)
I have re-run the tests with the addition of using the
command. the results are the same
Do you mean that 3.11.10-7 worked and 3.11.10-11 was broken?
Yes that is correct
and the results of <uname> are:
[quote]
linux-ecrt-aos:/home/adolfo # uname -a
Linux linux-ecrt-aos 3.11.10-11-desktop #1 SMP PREEMPT Mon May 12 13:37:06 UTC 2014 (3d22b5f) i686 i686 i386 GNU/Linux
linux-ecrt-aos:/home/adolfo #
linux-ecrt-aos:/home/adolfo # uname -a
Linux linux-ecrt-aos 3.11.10-7-desktop #1 SMP PREEMPT Mon Feb 3 09:41:24 UTC 2014 (750023e) i686 i686 i386 GNU/Linux
linux-ecrt-aos:/home/adolfo #
[/quote]
Are there any other tests I can do to confirm that this is a kernel regression?
Nothing can ensure it 100%. But if one kernel always works and the other always fails over a few test runs, it's almost certain.
BTW how can I submit multiple file attachments?
Use tar or zip. Or just attach multiple times.
Doh. Should have thought of that...
Now, how to proceed. Since it's apparently a regression between two 3.11.x.y kernels, and the bug is easily reproducible, the best approach would be to find the regression with git bisect.
If you need to know how to build a kernel and perform git bisect, I can give you brief instructions. The expanded git tree is available at git@gitorious.org:opensuse/kernel.git and the bisection range is b73e414b..1cb3e62f.
I have never used git bisect before so any pointers will be useful.
I finally got hold of a DB9 null modem cable so I can capture the kernel messages through the serial port. I followed the instructions listed on the Bugreport kernel page ( http://en.opensuse.org/openSUSE:Bugreport_kernel ) but that did not work, although minicom/putty do.
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c16
--- Comment #16 from Takashi Iwai
Now, how to proceed. Since it's apparently a regression between two 3.11.x.y kernels, and the bug is easily reproducible, the best approach would be to find the regression with git bisect.
If you need to know how to build a kernel and perform git bisect, I can give you brief instructions. The expanded git tree is available at git@gitorious.org:opensuse/kernel.git and the bisection range is b73e414b..1cb3e62f.
I have never used git bisect before so any pointers will be useful.
The first thing to do is to build your kernel from the git tree.

1. Clone the git tree. Suppose you'll expand the tree under $HOME/git
     % mkdir $HOME/git
     % cd $HOME/git
     % git clone git@gitorious.org:opensuse/kernel.git

2. Switch to the openSUSE-13.1 branch
     % cd kernel
     % git checkout openSUSE-13.1

3. Go back to the good working commit
     % git reset --hard b73e414b

4. Set up the configuration; the command below will give you a minimal config that is similar to the running kernel.
     % make localmodconfig

5. Edit the .config file; it'd be better to change CONFIG_LOCALVERSION to a unique one like "-test". Also, it's better to back up the .config file under a different name at this stage.

6. Run make; you can build in parallel with -j 4 or so (4 is the number of CPUs)

7. If everything is OK, you'll have a kernel in arch/x86/boot/bzImage. Run "make install" as root. "make modules_install" will install only the modules. Run "mkinitrd" if necessary, too.

8. Reboot with your kernel. If it doesn't appear in the GRUB menu, check the grub configuration. There is a helper script /sbin/update-bootloader. Pass --image and --initrd with the new vmlinuz and initrd paths.

9. Test S4. With this kernel, S4 must work.

10. After confirming that S4 succeeds, try the newer kernel.
     % git reset --hard 1cb3e62f
    Then build again. You can just run "make -j4".

11. Install and test this kernel (you can overwrite the old one). Confirm that S4 is broken with this.

OK, now we can start bisection.

12. Run the following:
     % git bisect start
     % git bisect bad 1cb3e62f
     % git bisect good b73e414b
    This will check out the commit to test automatically. Build this, install and test the kernel.

13. If S4 is still broken with the current kernel, give "git bisect bad". OTOH, if S4 works with the current kernel, give "git bisect good". Then git will give you the next commit to test. Repeat this until you reach the regression commit.
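Since every bisection round means a full build, install, reboot and S4 test, it is worth knowing roughly how many rounds to expect. Each answer of "good" or "bad" about halves the remaining range, so the count grows only logarithmically with the number of commits. The sketch below assumes nothing beyond that halving; bisect_rounds is a hypothetical helper name, not a git command, and git prints its own estimate at each step anyway.

```shell
# Estimate how many build-and-test rounds a bisection over N candidate
# commits needs. Each round roughly halves the range, so the result is
# about ceil(log2(N)). bisect_rounds is a hypothetical helper name.
bisect_rounds() {
    remaining="$1"
    rounds=0
    while [ "$remaining" -gt 1 ]; do
        # Keep the larger half to get a worst-case estimate.
        remaining=$(( (remaining + 1) / 2 ))
        rounds=$(( rounds + 1 ))
    done
    echo "$rounds"
}
```

Inside the cloned tree, the commit count for the range given above can be obtained with "git rev-list --count b73e414b..1cb3e62f" and passed to the function. As an illustration with a made-up count, "bisect_rounds 100" prints 7.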
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c17
--- Comment #17 from Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c18
--- Comment #18 from Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c19
--- Comment #19 from Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c20
Takashi Iwai
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c21
--- Comment #21 from Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c22
Takashi Iwai
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c23
--- Comment #23 from Michal Hocko
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c24
--- Comment #24 from Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c25
--- Comment #25 from Michal Hocko
I had difficulty getting the snapshots, as these files were reported as being zero bytes long. Here is the output as requested:
How did you try to get the snapshot? Something like the following should work even if snapshot fails.
mkdir logs
cd logs
cp /proc/meminfo meminfo.`date +%s`
cp /proc/vmstat vmstat.`date +%s`
sync
sudo s2disk # or whatever command you are using to start hibernation.
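The snapshot commands above can also be wrapped in a small function so that both files share a single timestamp and the step is easy to repeat before each hibernation attempt. This is a sketch under assumptions: the name snapshot_vm_stats and the optional second argument (a source directory, defaulting to /proc) are introduced here purely for illustration.

```shell
# Copy memory statistics into a log directory under one shared timestamp.
# snapshot_vm_stats is a hypothetical helper; the optional second argument
# (source directory, default /proc) only exists to make it easy to try out.
snapshot_vm_stats() {
    log_dir="${1:-logs}"
    src_dir="${2:-/proc}"
    stamp=$(date +%s)
    mkdir -p "$log_dir" || return 1
    for name in meminfo vmstat; do
        # Read until end of file instead of relying on the reported size.
        cat "$src_dir/$name" > "$log_dir/$name.$stamp" || return 1
    done
    # Flush the copies to disk before hibernation is started.
    sync
    echo "$stamp"
}
```

Running "snapshot_vm_stats logs" followed by the usual hibernation command reproduces the sequence above. Files under /proc report a size of zero, so the sketch reads them with cat until end of file; whether that is related to the zero-byte files mentioned earlier is not established in this thread.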
Linux linux-ecrt-aos 3.11.10-11-desktop #1 SMP PREEMPT Mon May 12 13:37:06 UTC 2014 (3d22b5f) i686 i686 i386 GNU/Linux
OK, this is interesting information. I missed that in the bugzilla header. This is a 32-bit system, so I expect something went wrong with the dirty balancing for highmem systems.
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c26
--- Comment #26 from Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c27
--- Comment #27 from Michal Hocko
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c28
--- Comment #28 from Michal Hocko
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c29
--- Comment #29 from Takashi Iwai
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c30
--- Comment #30 from Michal Hocko
So, removing the redundant min_free_kbytes by the upstream patch you suggested would fix the problem, but it's still fairly tight in comparison with the former kernels. Before the commit [a1c3bfb2: mm/page-writeback.c: do not count anon pages as dirtyable memory], it took anon pages into account. And there are about 80000 anon pages available at the time of image preallocation.
That is true, but adding anon pages to the picture just papers over the inherent problem here. It doesn't make any sense to throttle the hibernation process. This is handled with the in-kernel hibernation because the kernel thread is most probably marked as not to be throttled (I haven't checked, but that would make a lot of sense). Anyway, let's wait for Adolfo to confirm your observations. Adolfo, could you test with the attached patch, please?
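The effect discussed here can be illustrated with a deliberately simplified model: a writer gets throttled once dirty pages exceed a percentage of the "dirtyable" memory, so whether anon pages are counted changes the limit considerably. Everything in the sketch is an assumption made for illustration, including the function name dirty_limit_pages and the single reserve figure; only the roughly 80000 anon pages come from the comment above. The real kernel calculation has further terms (highmem handling among them) that are ignored here.

```shell
# Simplified model of a dirty-page limit, NOT the kernel's actual code.
# Arguments (all in pages, except ratio in percent):
#   free, file-backed, anon, reserve, ratio
# Pass anon=0 to model a kernel that does not count anon pages as dirtyable.
# The ratio argument plays the role of the vm.dirty_ratio sysctl.
dirty_limit_pages() {
    free=$1
    file=$2
    anon=$3
    reserve=$4
    ratio=$5
    dirtyable=$(( free + file + anon - reserve ))
    # A reserve larger than the pool leaves nothing dirtyable.
    if [ "$dirtyable" -lt 0 ]; then
        dirtyable=0
    fi
    echo $(( dirtyable * ratio / 100 ))
}
```

With made-up figures of 10000 free pages, 5000 file pages, a reserve of 12000 pages and a ratio of 20, "dirty_limit_pages 10000 5000 0 12000 20" prints 600, whereas counting 80000 anon pages ("dirty_limit_pages 10000 5000 80000 12000 20") prints 16600. That gap is the sense in which the limit becomes tight once anon pages are excluded, although it does not address the point that the hibernation process should not be throttled in the first place.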
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c31
--- Comment #31 from Adolfo Salazar
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c32
--- Comment #32 from Michal Hocko
I have tested the system 3 times with the patched kernel and hibernate works as expected.
Great!
Michal, I would ask one favour from you: it seems that you have overestimated my capabilities a little bit. I haven't coded for about 14 years, and that was with specialist embedded systems, so my knowledge of Linux and the relevant tools is quite basic. So please can you give me detailed instructions as to what you need me to do, so that at least I feel a little bit secure that I am doing things the way you want me to do them. Having said that, I had forgotten how much fun this is. Let me know what you need me to do next.
Well, your part is done. Keep running your patched kernel until a new maintenance update is released. It will contain the fix + some more which will be part of that update as well.
PS I did learn how to incorporate the patch using git so I am very confident that the process is fine.
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c33
Michal Hocko
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c34
Takashi Iwai
https://bugzilla.novell.com/show_bug.cgi?id=879792
https://bugzilla.novell.com/show_bug.cgi?id=879792#c35
--- Comment #35 from Swamp Workflow Management
http://bugzilla.novell.com/show_bug.cgi?id=879792
Swamp Workflow Management