[opensuse-kernel] call to ioctl() does not return in s2disk
Hi,

I have been investigating why s2disk crashes (halts) randomly when attempting to hibernate my desktop machine. This had been working well for years, but for the last 2 to 3 months it has been failing.

The symptom is that s2disk halts after printing this message:

    Snapshotting system

There is no disk activity, nothing. I have to power cycle.

What I have done is hack the suspend_system() function in suspend.c, adding messages that get printed to the screen so that I can see where it gets stuck. After the modification, the system refused to crash for a week, until a day ago: finally, I got a crash after the "step 1.2" message:

    sprintf(message, "Snapshotting system - step 1.1");
    printf("%s: %s\n", my_name, message);
    if (set_image_size(snapshot_fd, image_size)) {
        error = errno;
        break;
    }
    sprintf(message, "atomic_snapshot - step 1.2");
    printf("%s: %s\n", my_name, message);        <===========
    if (atomic_snapshot(snapshot_fd, &in_suspend)) {
        error = errno;
        break;
    }

So it appears that atomic_snapshot() fails without reporting anything, and the kernel crashes silently. So I added printf statements there:

    static int atomic_snapshot(int dev, int *in_suspend)
    {
        int error;

        printf("Atomic_snapshot: 1");
        error = ioctl(dev, SNAPSHOT_CREATE_IMAGE, in_suspend);
        printf(", 2");
        if (error && errno == ENOTTY) {
            printf(", 3 (err)");
            report_unsupported_ioctl("SNAPSHOT_CREATE_IMAGE");
            error = ioctl(dev, SNAPSHOT_ATOMIC_SNAPSHOT, in_suspend);
        }
        printf(", 4\n");
        return error;
    }

I did two successful hibernations, and it failed on the third, printing only:

    Atomic_snapshot: 1

Thus it is the ioctl() function call that is not returning!

Now I need some dev to look at it further. I'm no kernel developer; I'm stuck here.

You can have a look at the report:

<https://bugzilla.novell.com/show_bug.cgi?id=765084>

--
Cheers / Saludos,

Carlos E. R.
--
To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-kernel+owner@opensuse.org
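The marker technique described above depends on each marker actually reaching the screen or the disk before the machine hangs. Depending on how stdout is buffered, a marker without a trailing newline is not guaranteed to be visible at the moment of the hang. The following is a minimal sketch, not part of s2disk: mark_progress() is a hypothetical helper that writes each marker to stderr and to a log file, then flushes both. It assumes the log file lives on a filesystem that is still writable at the point where the marker is emitted.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from suspend.c): emit a progress marker on the
 * console and append it to an already opened log file, forcing both out
 * before returning. Returns 0 on success, -1 on a write or fsync error.
 */
static int mark_progress(int log_fd, const char *tag)
{
    assert(tag != NULL);
    size_t len = strlen(tag);

    /* Console copy: flush explicitly instead of relying on buffering mode. */
    fprintf(stderr, "%s\n", tag);
    fflush(stderr);

    /* Disk copy: write the marker and a newline, then ask the kernel to
     * commit the data so it can be read back after a power cycle. */
    if (write(log_fd, tag, len) != (ssize_t)len)
        return -1;
    if (write(log_fd, "\n", 1) != 1)
        return -1;
    return fsync(log_fd);
}
```

A call such as mark_progress(fd, "atomic_snapshot - step 1.2") placed immediately before the ioctl() would then leave the last marker in the log file even if nothing else is printed.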
On 6/29/12 10:26 AM, Carlos E. R. wrote:
Thus it is the ioctl() function call that is not returning!
Yep, this is where it would have to fail. That's the kernel entry point to create the snapshot. There is a ton of heavy lifting that goes along with it behind the scenes in kernel-space.

Unfortunately, the now-common way of adding events (tracepoints) doesn't really help here since there'd be no way to dump them.

Historically, most suspend problems are driver issues. If you have some time, try unloading individual modules before attempting the suspend. Once you're able to suspend reliably again, it should be clear which module is at fault. I'd start with the usb audio driver.

-Jeff
--
Jeff Mahoney
SUSE Labs
On Monday, 2012-07-09 at 10:52 -0400, Jeff Mahoney wrote:
On 6/29/12 10:26 AM, Carlos E. R. wrote:
Thus it is the ioctl() function call that is not returning!
Yep, this is where it would have to fail. That's the kernel entry point to create the snapshot. There is a ton of heavy lifting that goes along with it behind the scenes in kernel-space.
Unfortunately, the now-common way of adding events (tracepoints) doesn't really help here since there'd be no way to dump them.
Well, since I posted that I made some progress. As a matter of fact, I'm now running a modified kernel with lots of printk statements "savagely" sprinkled around :-)

Last crash was about here:

    /usr/src/linux/kernel/power/hibernate.c:

    int hibernation_snapshot(int platform_mode)
    {
    ...
        printk(KERN_INFO "I was here: 2.4\n");
        error = dpm_suspend_start(PMSG_FREEZE);      <======
        if (error)
            goto Recover_platform;

        printk(KERN_INFO "I was here: 2.5\n");

It printed the 2.4 mark, then several messages about the disk system (no errors), and the last message was like this (this one is from another run):

    .... Telcontar kernel - - - [ 792.002714] r8169 0000:06:00.0: eth0: link up

I made a video of the crash messages, but they are almost unreadable; I deduced the messages by comparison with a successful hibernate cycle.

So I then added more messages to dpm_suspend_start() and I'm waiting for it to crash and take a photo instead: it could happen tonight or in a week.

The process I'm following is reported in the bugzilla: <https://bugzilla.novell.com/show_bug.cgi?id=765084>, and in this mail thread:

<http://lists.opensuse.org/opensuse/2012-06/msg00912.html>
<http://lists.opensuse.org/opensuse/2012-07/msg00000.html>

Last events are here: <http://lists.opensuse.org/opensuse/2012-07/msg00132.html> and the next 3 messages.
Historically, most suspend problems are driver issues. If you have some time, try unloading individual modules before attempting the suspend. Once you're able to suspend reliably again, it should be clear which module is at fault. I'd start with the usb audio driver.
Ah.

Time I have... I have to do something, it is crashing randomly when I hibernate. It is a nuisance, hibernation is an important part of my work methodology.

Then I would do "rmmod snd_usb ; rmmod snd_usb_audio", hibernate, and on thawing I would have to modprobe them. Or perhaps "rcalsa stop"?

Or seeing the last message before the crash, should I try another driver instead, perhaps the network?

Or, if you have suggestions about more printk messages to add somewhere, say so :-)

--
Cheers,
Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
On Tuesday, 2012-07-10 at 01:52 +0200, Carlos E. R. wrote:
On Monday, 2012-07-09 at 10:52 -0400, Jeff Mahoney wrote:
On 6/29/12 10:26 AM, Carlos E. R. wrote:
...
So I then added more messages to dpm_suspend_start() and I'm waiting for it to crash and take a photo instead: it could happen tonight or in a week.
The process I'm following is reported in the bugzilla: <https://bugzilla.novell.com/show_bug.cgi?id=765084>, and in this mail thread:
<http://lists.opensuse.org/opensuse/2012-06/msg00912.html> <http://lists.opensuse.org/opensuse/2012-07/msg00000.html>
Last events are here: <http://lists.opensuse.org/opensuse/2012-07/msg00132.html> and the next 3 messages.
Historically, most suspend problems are driver issues. If you have some time, try unloading individual modules before attempting the suspend. Once you're able to suspend reliably again, it should be clear which module is at fault. I'd start with the usb audio driver.
Ah.
Time I have... I have to do something, it is crashing randomly when I hibernate. It is a nuisance, hibernation is an important part of my work methodology.
Then I would do "rmmod snd_usb ; rmmod snd_usb_audio", hibernate, and on thawing I would have to modprobe them. Or perhaps "rcalsa stop"?
Or seeing the last message before the crash, should I try another driver instead, perhaps the network?
Or, if you have suggestions about more printk messages to add somewhere, say so :-)
I forgot that when it crashes, it would be at the point of the next messages that did not print, not at the last message printed. Then the important messages would be these, related to sound and usb - as you said:

    <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 792.291119] usb 2-5.4: reset high speed USB device using ehci_hcd and address 4
    <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 792.953016] snd-usb-audio 2-5.4:1.2: no reset_resume for driver snd-usb-audio?
    <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 792.953214] snd-usb-audio 2-5.4:1.3: no reset_resume for driver snd-usb-audio?
    <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 792.955419] PM: restore of devices complete after 2646.655 msecs

Thus my plan will be to wait for the next crash, take the photo, and then add a script to stop sound and remove those modules, plus undo on restore.

--
Cheers,
Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
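The reasoning above (the hang happens at the first message that did not print, not at the last one that did) can be applied mechanically by comparing the last line seen before the hang against the log of a successful cycle. The following is a rough sketch only: message_text() and first_unprinted() are hypothetical helpers, and the sketch assumes that each line carries a "[seconds.micros]" kernel timestamp before the message and that the first matching line in the good log is the relevant one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the message text that follows the "[ 1234.567890]" kernel
 * timestamp, or the whole line if no bracketed timestamp is found.
 */
static const char *message_text(const char *line)
{
    assert(line != NULL);
    const char *open = strchr(line, '[');
    const char *close = open ? strchr(open, ']') : NULL;

    if (close == NULL)
        return line;
    close++;
    while (*close == ' ')
        close++;
    return close;
}

/*
 * Given the log of a successful cycle and the last line seen before a
 * hang, return the index of the next line in the good log, i.e. the first
 * message that failed to print. Returns -1 if the last-seen message is
 * not found or is the final line of the good log.
 */
static long first_unprinted(const char *const good[], size_t n,
                            const char *last_seen)
{
    const char *needle = message_text(last_seen);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(message_text(good[i]), needle) == 0)
            return (i + 1 < n) ? (long)(i + 1) : -1;
    }
    return -1;
}
```

Timestamps differ between runs, which is why only the text after the bracket is compared. Messages that repeat within one cycle would need a smarter match than this first-occurrence search.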
At Tue, 10 Jul 2012 01:52:16 +0200 (CEST), Carlos E. R. wrote:
Then I would do "rmmod snd_usb ; rmmod snd_usb_audio", hibernate, and on thawing I would have to modprobe them. Or perhaps "rcalsa stop"?
A safer way is to add such a module to the module blacklist so that it won't be loaded at all, then check S4 to see whether the problem persists or not.

Takashi
On 2012-07-16 10:47, Takashi Iwai wrote:
At Tue, 10 Jul 2012 01:52:16 +0200 (CEST), Carlos E. R. wrote:
A safer way is to add such a module to the module blacklist so that it won't be loaded at all, then check S4 to see whether the problem persists or not.
Sorry, I do not know what is S4 or how to test it.

But if I remove the sound modules that way, I will be without sound for days!

It can take days for the hibernation to crash. Sometimes it crashes in a day, sometimes in a week.

Anyway, stopping alsa (which does remove the usb.sound modules) does not work, it crashes. Currently I'm stopping alsa, network, and named (because named ceases to work if network is stopped and restarted), to no avail.

I do not know what to do.
    <0.6> 2012-07-15 22:31:24 Telcontar kernel - - - [37787.944149] ata5.00: ACPI cmd ef/03:45:00:00:00:a0 (SET FEATURES) filtered out
    <0.6> 2012-07-15 22:31:24 Telcontar kernel - - - [37787.948534] ata5.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) filtered out
    <0.7> 2012-07-15 22:31:24 Telcontar kernel - - - [37787.952999] ata5.00: ACPI cmd c6/00:10:00:00:00:a0 (SET MULTIPLE MODE) succeeded
    <0.6> 2012-07-15 22:31:24 Telcontar kernel - - - [37787.957316] ata5.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
    <0.6> 2012-07-15 22:31:24 Telcontar kernel - - - [37787.967360] ata6.00: configured for UDMA/133
    <0.6> 2012-07-15 22:31:24 Telcontar kernel - - - [37787.971958] ata5.00: configured for UDMA/133
    <0.6> 2012-07-15 22:31:24 Telcontar kernel - - - [37787.979373] ata6.01: configured for UDMA/133
    <0.6> 2012-07-15 22:31:24 Telcontar kernel - - - [37787.983793] ata5.01: configured for UDMA/133
That one above ^ is the last message printed before it crashes. The messages below come from a successful hibernation.
    <0.6> 2012-07-15 22:31:24 Telcontar kernel - - - [37788.308056] usb 8-2: reset low speed USB device using uhci_hcd and address 2
    <0.6> 2012-07-15 22:31:24 Telcontar kernel - - - [37788.692058] usb 2-5: reset high speed USB device using ehci_hcd and address 2
    <0.6> 2012-07-15 22:31:24 Telcontar kernel - - - [37789.086143] usb 2-5.4: reset high speed USB device using ehci_hcd and address 4
So the next thing to try would be to unload the entire usb system. How do I do that?

--
Cheers / Saludos,
Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
At Mon, 16 Jul 2012 15:36:13 +0200, Carlos E. R. wrote:
On 2012-07-16 10:47, Takashi Iwai wrote:
At Tue, 10 Jul 2012 01:52:16 +0200 (CEST), Carlos E. R. wrote:
A safer way is to add such a module to the module blacklist so that it won't be loaded at all, then check S4 to see whether the problem persists or not.
Sorry, I do not know what is S4 or how to test it.
S4 = hibernation. Add a line "blacklist snd-usb-audio" to /etc/modprobe.d/99-local.conf or such.
But if I remove the sound modules that way, I will be without sound for days!
Yes.
It can take days for the hibernation to crash. Sometimes it crashes in a day, sometimes in a week.
Sorry for that, but not loading a module is the only safe way. Unloading a module later isn't the same as not loading it at all.
Anyway, stopping alsa (which does remove the usb.sound modules) does not work, it crashes. Currently I'm stopping alsa, network, and named (because named ceases to work if network is stopped and restarted), to no avail.
I do not know what to do.
It'd be better to find a way to reproduce the problem more quickly. With a script, you can test two hibernation cycles per minute, for example.

Takashi
On 2012-07-16 15:42, Takashi Iwai wrote:
At Mon, 16 Jul 2012 15:36:13 +0200,
Carlos E. R. wrote:
Sorry, I do not know what is S4 or how to test it.
S4 = hibernation.
Ah.
Add a line "blacklist snd-usb-audio" to /etc/modprobe.d/99-local.conf or such.
But if I remove the sound modules that way, I will be without sound for days!
Yes.
Argh.
It can take days for the hibernation to crash. Sometimes it crashes in a day, sometimes in a week.
Sorry for that, but not loading a module is the only safe way. Unloading a module later isn't the same as not loading it at all.
Mmm.
Anyway, stopping alsa (which does remove the usb.sound modules) does not work, it crashes. Currently I'm stopping alsa, network, and named (because named ceases to work if network is stopped and restarted), to no avail.
I do not know what to do.
It'd be better to find a way to reproduce the problem more quickly. With a script, you can test two hibernation cycles per minute, for example.
On occasion, I have tried 10 hibernation cycles in minutes without a crash. I start working, and when I hibernate hours later, it crashes.

Ok, will do.

--
Cheers / Saludos,
Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
On Monday, 2012-07-16 at 15:48 +0200, Carlos E. R. wrote:
On 2012-07-16 15:42, Takashi Iwai wrote:
Ok, will do.
I blacklisted that module. I still have sound, because I do not have any sound hardware via usb. I wrote a script to loop hibernation cycles, and set suspend.conf to reboot instead of poweroff.

I did 30 cycles in half an hour successfully. Surprising.

I will add something for the script to do, like fetchmail and fetchnews, perhaps a video conversion with ffmpeg, and leave the script running overnight...

If it works, it is a success, thank you, I can live without that module :-)

--
Cheers,
Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
At Mon, 16 Jul 2012 18:04:20 +0200 (CEST), Carlos E. R. wrote:
On Monday, 2012-07-16 at 15:48 +0200, Carlos E. R. wrote:
On 2012-07-16 15:42, Takashi Iwai wrote:
Ok, will do.
I blacklisted that module. I still have sound, because I do not have any sound hardware via usb. I wrote a script to loop hibernation cycles, and set suspend.conf to reboot instead of poweroff.
I did 30 cycles in half an hour successfully. Surprising.
I'd recommend removing the blacklist entry and running the script again to see whether you hit the problem within 30 cycles. If the problem comes back, then we can conclude that the module is very likely the culprit.

Takashi
On Monday, 2012-07-16 at 18:08 +0200, Takashi Iwai wrote:
At Mon, 16 Jul 2012 18:04:20 +0200 (CEST),
I did 30 cycles in half an hour successfully. Surprising.
I'd recommend you to try to remove the blacklist and test the script again to see whether you hit the problem in 30 cycles. If it causes the problem again, then we can conclude that it's very likely the culprit.
Did that - problem remains the same. :-(

First I did 15 cycles with the same script, no crash. Thus I modified the script to do some activity, like this:

+++·······················
#!/bin/bash

for((i=0;i<100;i++))
do
    echo "=== Test number $i =====================" | tee -a probarhibernacion
    fetchmail -v &
    fetchnews -v
    wait
    chvt 7
    sleep 5
    chvt 1
    sleep 2
    date
    pm-hibernate
    date
done
·······················++-

I went out for a walk, watched the waves break on the sea shore for an hour or so. Came back, the computer was on cycle 47, no crash.

The crash must be related to some of the activity I do on the computer normally (I also tried weeks ago varying my activity: no go).

These are the usb modules loaded now:

+++·······················
Telcontar:~ # lsmod | grep usb
snd_usb_audio         120010  1
snd_usbmidi_lib        23849  1 snd_usb_audio
snd_rawmidi            26923  1 snd_usbmidi_lib
snd_hwdep               7772  2 snd_usb_audio,snd_hda_codec
snd_pcm               104468  4 snd_pcm_oss,snd_usb_audio,snd_hda_intel,snd_hda_codec
snd                    84374 19 snd_pcm_oss,snd_mixer_oss,snd_seq,snd_hda_codec_ca0110,snd_usb_audio,snd_hda_intel,snd_usbmidi_lib,snd_hda_codec,snd_rawmidi,snd_hwdep,snd_pcm,snd_timer,snd_seq_device
Telcontar:~ #
·······················++-

Notice that blacklisting "snd-usb-audio" disables the whole lot.

I will blacklist that module again, reboot, do "normal life" usage of the computer, and see if it crashes, days or weeks - I hope it doesn't, I'm tired of this.

--
Cheers,
Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
On Monday, 2012-07-16 at 21:51 +0200, Carlos E. R. wrote: ...
Notice that blacklisting "snd-usb-audio" disables the whole lot. I will blacklist that module again, reboot, do "normal life" usage of the computer, and see if it crashes, days or weeks - I hope it doesn't, I'm tired of this.
So I did that and started using the computer. A few minutes ago I decided to call it a day, hibernated, and the damn thing crashed again in the exact same place as always. At the first hibernation attempt. :-/

So what can I do now?

--
Cheers,
Carlos E. R.
(from 11.4 x86_64 "Celadon" at Telcontar)
participants (3): Carlos E. R., Jeff Mahoney, Takashi Iwai