[opensuse] Need help: I found out where s2disk crashes.

older
[opensuse] wine app shows no text...

Carlos E. R.

28 Jun 2012 28 Jun '12

09:09

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I'm using 11.4. Hibernation, which has been relieably for years, is now crashing at the "Snapshotting system" message since about two or three months, after two or more hibernation cycles. I have hacked the suspend_system() function in suspend.c, adding messages that get printed to the screen so that I can see where it gets stuck. After doing the modification, system refused to crash for a week, till yesterday: finally, I got a crash after the "step 1.2" message: sprintf(message, "Snapshotting system - step 1.1"); printf("%s: %s\n", my_name, message); if (set_image_size(snapshot_fd, image_size)) { error = errno; break; } sprintf(message, "atomic_snapshot - step 1.2"); printf("%s: %s\n", my_name, message); <=========== if (atomic_snapshot(snapshot_fd, &in_suspend)) { error = errno; break; } So, it appears that atomic_snapshot() fails, and fails to report, and kernel crashes silently. What is the next step, what can I do? I'm no kernel dev, I need help! https://bugzilla.novell.com/show_bug.cgi?id=765084#c5 - -- Cheers, Carlos E. R. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) iEYEARECAAYFAk/sH24ACgkQtTMYHG2NR9Vw9wCfVSCa0I1N5QT1TFW9p6y5Il4+ jwIAn3jENKU7Gs+Bob4ZFmyBrXvJRqt1 =gLuF -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Show replies by date

Robert Rawlinson

28 Jun 28 Jun

11:25

On 06/28/2012 05:09 AM, Carlos E. R. wrote:

...

I'm using 11.4. Hibernation, which has been relieably for years, is now crashing at the "Snapshotting system" message since about two or three months, after two or more hibernation cycles.

I have hacked the suspend_system() function in suspend.c, adding messages that get printed to the screen so that I can see where it gets stuck. After doing the modification, system refused to crash for a week, till yesterday: finally, I got a crash after the "step 1.2" message:

sprintf(message, "Snapshotting system - step 1.1"); printf("%s: %s\n", my_name, message); if (set_image_size(snapshot_fd, image_size)) { error = errno; break; } sprintf(message, "atomic_snapshot - step 1.2"); printf("%s: %s\n", my_name, message); <=========== if (atomic_snapshot(snapshot_fd, &in_suspend)) { error = errno; break; }

So, it appears that atomic_snapshot() fails, and fails to report, and kernel crashes silently.

What is the next step, what can I do? I'm no kernel dev, I need help!

https://bugzilla.novell.com/show_bug.cgi?id=765084#c5

-- Cheers, Carlos E. R.

I have been having the same problem. Bob R -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

11:39

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-06-28 13:25, Robert Rawlinson wrote:

...

I have been having the same problem.

Please, add a me too at the bugzilla. Perhaps you can try my hacks if somebody packages them. - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/sQnYACgkQIvFNjefEBxoxcwCfQnTxrFakdYdwjj4JDgNKC5Nx o94AoKgPzYXTl+sYBjGJbzXjHw4bLa3M =8hD6 -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

29 Jun 29 Jun

11:23

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Thursday, 2012-06-28 at 11:09 +0200, Carlos E. R. wrote:

...

So, it appears that atomic_snapshot() fails, and fails to report, and kernel crashes silently.

I added printf statementst there: static int atomic_snapshot(int dev, int *in_suspend) { int error; printf("Atomic_snapshot: 1"); error = ioctl(dev, SNAPSHOT_CREATE_IMAGE, in_suspend); printf(", 2"); if (error && errno == ENOTTY) { printf(", 3 (err)"); report_unsupported_ioctl("SNAPSHOT_CREATE_IMAGE"); error = ioctl(dev, SNAPSHOT_ATOMIC_SNAPSHOT, in_suspend); } printf(", 4\n"); return error; } I did two successful hibernations, and it failed on the third, printing only: Atomic_snapshot: 1 Thus it is the ioctl() function call that is not returning. But now I need the devs to take over and solve that problem, and none has answered yet. - -- Cheers, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) iEYEARECAAYFAk/tkEAACgkQtTMYHG2NR9VdNQCgkGKGN422//h4H3DsLuaofwXk xBEAnj9+UW1OfWAQ7z+XNvtixy89YLgC =aNhs -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Patrick Shanahan

11:28

* Carlos E. R. <robin.listas@telefonica.net> [06-29-12 07:25]:

...

On Thursday, 2012-06-28 at 11:09 +0200, Carlos E. R. wrote:

...
So, it appears that atomic_snapshot() fails, and fails to report, and kernel crashes silently.

I added printf statementst there:

static int atomic_snapshot(int dev, int *in_suspend) { int error; printf("Atomic_snapshot: 1");

error = ioctl(dev, SNAPSHOT_CREATE_IMAGE, in_suspend); printf(", 2"); if (error && errno == ENOTTY) { printf(", 3 (err)"); report_unsupported_ioctl("SNAPSHOT_CREATE_IMAGE"); error = ioctl(dev, SNAPSHOT_ATOMIC_SNAPSHOT, in_suspend); } printf(", 4\n"); return error; }

I did two successful hibernations, and it failed on the third, printing only:

Atomic_snapshot: 1

Thus it is the ioctl() function call that is not returning.

But now I need the devs to take over and solve that problem, and none has answered yet.

possibly the discussion would be viewed and proper in *factory* as it undoubtedly affects newer versions also ??? -- (paka)Patrick Shanahan Plainfield, Indiana, USA HOG # US1244711 http://wahoo.no-ip.org Photo Album: http://wahoo.no-ip.org/gallery2 http://en.opensuse.org openSUSE Community Member Registered Linux User #207535 @ http://linuxcounter.net -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

11:34

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-06-29 13:28, Patrick Shanahan wrote:

...

* Carlos E. R. <> [06-29-12 07:25]:

...

possibly the discussion would be viewed and proper in *factory* as it undoubtedly affects newer versions also ???

That's my next recourse, but I'm hesitant to do it lest they say that it is not a factory issue. Or they ask me to reproduce in factory, which is absurd, as 11.4 is a maintained version and I have done all the debugging I can. The ioctl is a system function call! Maybe I have to enter another bugzilla named "ioctl() hangs!" - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/tktwACgkQIvFNjefEBxqv+gCffMHWbM7lC8UWe5vP882Ohpl2 xsIAn0A32kiPu8k+RTQLofCEQLlJCuMi =cuDT -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Dave Howorth

13:03

Carlos E. R. wrote:

...

Maybe I have to enter another bugzilla named "ioctl() hangs!"

That shouldn't be necessary :( I have two suggestions to get attention and one to make further progress, and a question and a comment: (S1) make a posting on the opensuse-kernel list about the issue, (S2) contact the upstream developer if possible (S3) If I understand correctly, your system (kernel) crashes? In which case: (a) copy your kernel output either to your serial port (to a terminal or file on another system) or over the network to another machine's kernel log - you may see some messages which don't make it to the machine's own screen (b) increase the kernel's logging level to obtain extra diagnostics (Q4) Somewhere at the beginning, you mention nvidia. Does that mean you're running a tainted kernel? If so, can you reproduce the problem with an untainted one? (C5) If it is a kernel bug, you really need to be testing against a recent kernel. I don't think you've said what version you're using. But only bugs in recent kernels are likely to get any attention. Then fixes may be backported to other supported kernels. Cheers, Dave -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

14:08

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-06-29 15:03, Dave Howorth wrote:

...

Carlos E. R. wrote:

...
Maybe I have to enter another bugzilla named "ioctl() hangs!"

That shouldn't be necessary :( I have two suggestions to get attention and one to make further progress, and a question and a comment:

(S1) make a posting on the opensuse-kernel list about the issue,

Good idea.

...

(S2) contact the upstream developer if possible

I was trying to track him, but openSUSE has about a dozen patches applied.

...

(S3) If I understand correctly, your system (kernel) crashes? In which case:

What I have discovered is that a call to ioctl() does not return. I don't know if the function is part of libc or of the kernel.

...

(a) copy your kernel output either to your serial port (to a terminal or file on another system) or over the network to another machine's kernel log - you may see some messages which don't make it to the machine's own screen

No possible, I already asked in the bugzilla if this could be done. s2disk doesn't output to serial port. During this phase there is only a text console working, messages are printed there, and when the problem happens nothing prints any message there. There is nothing nowhere.

...

(b) increase the kernel's logging level to obtain extra diagnostics

Not possible: writing to disk is disabled when hibernation starts its process.

...

(Q4) Somewhere at the beginning, you mention nvidia. Does that mean you're running a tainted kernel? If so, can you reproduce the problem with an untainted one?

The last time I tried with the open source driver, the machine would not even hibernate except on runlevel 3. That's the reason I use the proprietary nvidia driver.

...

(C5) If it is a kernel bug, you really need to be testing against a recent kernel. I don't think you've said what version you're using. But only bugs in recent kernels are likely to get any attention. Then fixes may be backported to other supported kernels.

Recent kernels also have problems hibernating as other people have told me; but I'm the only one that has resorted to hacking at s2disk, and I'm using 11.4. Today I have updated the kernel, now is Linux Telcontar 2.6.37.6-0.20-desktop #1 SMP PREEMPT 2011-12-19 23:39:38 +0100 x86_64 x86_64 x86_64 GNU/Linux I have the knowledge to hack at my own s2disk, but I failed to create rpms for it, thus nobody else can try my hacks unless a dev creates them. - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/ttu4ACgkQIvFNjefEBxrcvwCffeIs+/4zxXCQCSQ3D7zqbhmc XC0Ani+XidTUxreAzSgMwSin0D3Uwvuq =/RGE -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Dave Howorth

14:53

Carlos E. R. wrote:

...

...
(S3) If I understand correctly, your system (kernel) crashes? In which case:

What I have discovered is that a call to ioctl() does not return. I don't know if the function is part of libc or of the kernel.

ioctls are kernel entry points. If you think it might be crashing before it gets to the kernel, you could single-step the application to see exactly what fails.

...

...
(a) copy your kernel output either to your serial port (to a terminal or file on another system) or over the network to another machine's kernel log - you may see some messages which don't make it to the machine's own screen

No possible, I already asked in the bugzilla if this could be done. s2disk doesn't output to serial port.

No, not the application's output. The kernel's log.

...

During this phase there is only a text console working, messages are printed there, and when the problem happens nothing prints any message there. There is nothing nowhere.

I don't know how hibernation or s2disk work, but if the kernel is still logging anything then you want to access it. It doesn't do any harm to at least try does it? Your alternative is to single-step the kernel, I guess.

...

...
(b) increase the kernel's logging level to obtain extra diagnostics

Not possible: writing to disk is disabled when hibernation starts its process.

Logging to serial port or network does not require writing to disk. There appears to be a specific parameter: suspend loglevel u can specify the kernel console loglevel which the s2disk/s2both and resume utilities will use to report progress. On a stock kernel messages with level higher then 7 are usually not shown.

...

...
(Q4) Somewhere at the beginning, you mention nvidia. Does that mean you're running a tainted kernel? If so, can you reproduce the problem with an untainted one?

The last time I tried with the open source driver, the machine would not even hibernate except on runlevel 3. That's the reason I use the proprietary nvidia driver.

OK, but as long as you use the nvidia driver, you will have to isolate exactly what is failing in the kernel and provide a method to reproduce the problem on hardware that can run an untainted kernel. Then the devs will look at it. So it is probably easier to bugfix the open source driver - kernel devs will help you with that - then see if the suspend problem still occurs, and if so, debug it with the devs help. If you're lucky there may be a single cause for both problems.

...

...
(C5) If it is a kernel bug, you really need to be testing against a recent kernel. I don't think you've said what version you're using. But only bugs in recent kernels are likely to get any attention. Then fixes may be backported to other supported kernels.

Recent kernels also have problems hibernating as other people have told me; but I'm the only one that has resorted to hacking at s2disk, and I'm using 11.4.

The recent kernel problems were resolved by backing out a particular faulty update. Thus they are a separate problem (unless your kernel also has the faulty update?) But again, you should be testing with a recent kernel. Perhaps the problem has already been fixed? -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

17:59

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-06-29 16:53, Dave Howorth wrote:

...

Carlos E. R. wrote:

...
...
(S3) If I understand correctly, your system (kernel) crashes? In which case:

What I have discovered is that a call to ioctl() does not return. I don't know if the function is part of libc or of the kernel.

ioctls are kernel entry points. If you think it might be crashing before it gets to the kernel, you could single-step the application to see exactly what fails.

What fails is the ioctl() function, that is proven already. See my code... printf("Atomic_snapshot: 1"); error = ioctl(dev, SNAPSHOT_CREATE_IMAGE, in_suspend); printf(", 2"); The "1" prints, but not the "2", in hangs inside.

...

...
No possible, I already asked in the bugzilla if this could be done. s2disk doesn't output to serial port.

No, not the application's output. The kernel's log.

As I said, I asked if debugging while s2disk runs would be possible, and I suggested using the serial port - see the bugzilla - and they said no.

...

...
During this phase there is only a text console working, messages are printed there, and when the problem happens nothing prints any message there. There is nothing nowhere.

I don't know how hibernation or s2disk work, but if the kernel is still logging anything then you want to access it. It doesn't do any harm to at least try does it?

I wanted to do it and they said "no". So I invented the method of printing messages. I added printf() calls directly in the code. They did not even tell me that trick.

...

Your alternative is to single-step the kernel, I guess.

s2disk suspends all processes, nothing is running. And I don't know how to use a debugger in Linux, even less with the kernel.

...

...
...
(b) increase the kernel's logging level to obtain extra diagnostics

Not possible: writing to disk is disabled when hibernation starts its process.

Logging to serial port or network does not require writing to disk.

They said "no".

...

There appears to be a specific parameter:

suspend loglevel u can specify the kernel console loglevel which the s2disk/s2both and resume utilities will use to report progress. On a stock kernel messages with level higher then 7 are usually not shown.

Where do I use that? I have this in "/etc/sysconfig/syslog": ## Path: System/Logging ## Description: System logging ## Type: list(0,1,2,3,4,5,6,7) ## Default: 1 ## Config: "" ## ServiceRestart: syslog # # Default loglevel for klogd # KERNEL_LOGLEVEL=7 So it is 7 already. Right now I edited "/etc/suspend.conf": suspend loglevel = 255 I don't know what levels are valid, 255, as good a number as any. As far as I know, the kernel logs to tty 10, which is the same one the s2disk process uses. Nothing is written there, except s2disk own messages, and my own added messages. If you know a way to tell the kernel to be far more verbose and log what happens inside that ioctl call, tell me, I'll do it. But what I think is needed is somebody to hack at the kernel code and add printf() calls like I did inside the ioctl function. Or tell me how to do that myself and I'll make that kernel here.

...

...
...
(Q4) Somewhere at the beginning, you mention nvidia. Does that mean you're running a tainted kernel? If so, can you reproduce the problem with an untainted one?

The last time I tried with the open source driver, the machine would not even hibernate except on runlevel 3. That's the reason I use the proprietary nvidia driver.

OK, but as long as you use the nvidia driver, you will have to isolate exactly what is failing in the kernel and provide a method to reproduce the problem on hardware that can run an untainted kernel. Then the devs will look at it.

Nobody looks at it, nobody answers at the bugzilla, nobody from the kernel list where I wrote to answered my post. At the time I wrote a bugzilla about the open driver not suspending I want to try factory with the open driver, but I can not do because factory is totally broken and no expectations of repair in sight. :-/ Many machines do not even work with the open driver, and gnome people want us to have 3D and hw accel... whom do I listen to?

...

...
Recent kernels also have problems hibernating as other people have told me; but I'm the only one that has resorted to hacking at s2disk, and I'm using 11.4.

The recent kernel problems were resolved by backing out a particular faulty update. Thus they are a separate problem (unless your kernel also has the faulty update?)

No, these people were having problems prior to that update.

...

But again, you should be testing with a recent kernel. Perhaps the problem has already been fixed?

I doubt it. One person has tried 11.4, 12.1, tumbleweed. Same problem in all. - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/t7OwACgkQIvFNjefEBxpFtwCdGP1GEry69LXLngkfNbcyQh0m iCEAoJ7elGn7SZfU+TfitQ/S/FBU6Q4W =SqG/ -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Per Jessen

19:30

Carlos E. R. wrote:

...

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 2012-06-29 16:53, Dave Howorth wrote:

...
Carlos E. R. wrote:

...
...
(S3) If I understand correctly, your system (kernel) crashes? In which case:

What I have discovered is that a call to ioctl() does not return. I don't know if the function is part of libc or of the kernel.

ioctls are kernel entry points. If you think it might be crashing before it gets to the kernel, you could single-step the application to see exactly what fails.

What fails is the ioctl() function, that is proven already. See my code...

printf("Atomic_snapshot: 1"); error = ioctl(dev, SNAPSHOT_CREATE_IMAGE, in_suspend); printf(", 2");

The "1" prints, but not the "2", in hangs inside.

...
...
No possible, I already asked in the bugzilla if this could be done. s2disk doesn't output to serial port.

No, not the application's output. The kernel's log.

As I said, I asked if debugging while s2disk runs would be possible, and I suggested using the serial port - see the bugzilla - and they said no.

The only comment in the bugreport (765084) is Dirk Mueller saying "I have no idea how to debug this." :-) Is there another bugreport? -- Per Jessen, Zürich (26.8°C) -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

20:03

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-06-29 21:30, Per Jessen wrote:

...

Carlos E. R. wrote:

...

The only comment in the bugreport (765084) is Dirk Mueller saying "I have no idea how to debug this." :-) Is there another bugreport?

This one: Bug 472861 - 11.1, and 11.3, fail to hibernate if kernel serial port (rs232) logging is active The last patch in the update repo was precisely to correct this problem with s2disk. - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/uCigACgkQIvFNjefEBxosqwCeKMp2cHGN92JYp0DafEIZw/JU AWUAn2S3/GVdrytxecgu6u3ET+smdyp1 =VCyr -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

David Haller

20:13

Hello, On Fri, 29 Jun 2012, Carlos E. R. wrote:

...

What fails is the ioctl() function, that is proven already. See my code...

printf("Atomic_snapshot: 1"); error = ioctl(dev, SNAPSHOT_CREATE_IMAGE, in_suspend); printf(", 2");

The "1" prints, but not the "2", in hangs inside. [..] As far as I know, the kernel logs to tty 10, which is the same one the s2disk process uses. Nothing is written there, except s2disk own messages, and my own added messages. If you know a way to tell the kernel to be far more verbose and log what happens inside that ioctl call, tell me, I'll do it. But what I think is needed is somebody to hack at the kernel code and add printf() calls like I did inside the ioctl function. Or tell me how to do that myself and I'll make that kernel here.

/usr/src/linux/kernel/power/user.c search for SNAPSHOT_CREATE_IMAGE. Continued in /usr/src/linux/kernel/power/hibernate.c search for hibernation_snapshot. As for printing: use printk, see Documentation/printk-formats.txt HTH, -dnh -- William, tell me somethin'. Have you come because you need my help to save a certain distressin' damsel? Or... rather a damsel in distress? Either one... -- Jack Sparrow, Pirates of the Caribbean: At World's End -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

23:21

On 2012-06-29 22:13, David Haller wrote:

...

...
Or tell me how to do that myself and I'll make that kernel here.

/usr/src/linux/kernel/power/user.c search for SNAPSHOT_CREATE_IMAGE. Continued in /usr/src/linux/kernel/power/hibernate.c search for hibernation_snapshot.

Now that's very interesting.

...

As for printing: use printk, see Documentation/printk-formats.txt

Huh, that's not an example of good documentation :-) It is very terse. I just need to print strings to know by which sections the code passes and where it gets stuck: printk(KERN_INFO "I was here - 1.0\n"); I see an interesting 'ifdef' in hibernate.c CONFIG_PM_DEBUG. Where do I define that? (And I also see 'gotos' :-o ) -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar)

Carlos E. R.

30 Jun 30 Jun

00:04

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-06-30 01:21, Carlos E. R. wrote:

...

I see an interesting 'ifdef' in hibernate.c CONFIG_PM_DEBUG. Where do I define that?

Ah, of course, found it. I'm now compiling a kernel based on the desktop variant with some printk messages in power/user.c. - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/uQpUACgkQIvFNjefEBxr2gwCg06K4ApgXg3Yoe2Z68zgINR2E CgYAoI/KfrMCT8hbuQcUIVMO7ba3XajR =Yd9z -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

00:55

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Saturday, 2012-06-30 at 02:04 +0200, Carlos E. R. wrote:

...

I'm now compiling a kernel based on the desktop variant with some printk messages in power/user.c.

I installed that kernel. It uses 1.7 gigabytes in /lib/modules, and the sources expanded to 7 GiB. I don't know why is that. Debug info? (see below). I rebooted that kernel, and tried one hibernation, sucessfull. I saw a lot of messages going by that I had never seen - those could be captured with a serial port, I suppose. On thawing I see some of those, I'm not sure. I do see some of my new printk statements (I was here: x.y) So now I have to wait till it crashes again (they are random). It could take up to a week... These are the messagges in the log - they are saved to file recovered after thawing, except those printed by s2disk which are lost. I don't know the exact point where thawing starts in the log: 5.4> 2012-06-30 02:28:18 Telcontar pm-utils - - - Hibernating (1)... <5.4> 2012-06-30 02:28:20 Telcontar pm-utils - - - Hibernating (95)... <0.7> 2012-06-30 02:28:27 Telcontar kernel - - - [ 108.836042] PM: Marking nosave pages: 000000000009f000 - 0000000000100000 <0.7> 2012-06-30 02:28:27 Telcontar kernel - - - [ 108.836046] PM: Marking nosave pages: 00000000bff90000 - 0000000100000000 <0.7> 2012-06-30 02:28:27 Telcontar kernel - - - [ 108.836673] PM: Basic memory bitmaps created <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 109.455302] Syncing filesystems ... done. <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.021298] Freezing user space processes ... (elapsed 0.01 seconds) done. <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.032098] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done. <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.043465] I was here: 1.0 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.043517] I was here: 1.1 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.043569] I was here: 1.2 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.043621] PM: Preallocating image memory... done (allocated 291478 pages) <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.241767] PM: Allocated 1165912 kbytes in 0.19 seconds (6136.37 MB/s) <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.241888] Suspending console(s) (use no_console_suspend to debug) <0.5> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.242533] sd 5:0:1:0: [sdd] Synchronizing SCSI cache <0.5> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.242567] sd 5:0:0:0: [sdc] Synchronizing SCSI cache <0.5> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.242573] sd 4:0:1:0: [sdb] Synchronizing SCSI cache <0.5> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.242654] sd 4:0:0:0: [sda] Synchronizing SCSI cache <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.242781] i8042 kbd 00:08: wake-up capability enabled by ACPI <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.242976] serial 00:06: disabled <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.243322] ahci 0000:05:00.0: PCI INT A disabled <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.243330] ahci 0000:04:00.0: PCI INT A disabled <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.243342] ata_piix 0000:00:1f.5: PCI INT B disabled <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.243536] pata_jmicron 0000:04:00.1: PCI INT B disabled <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.243539] ata_piix 0000:00:1f.2: PCI INT B disabled <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.243553] pata_jmicron 0000:05:00.1: PCI INT B disabled <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.244041] ACPI handle has no context! <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.347055] HDA Intel 0000:03:00.0: PCI INT A disabled <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.347097] ACPI handle has no context! <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.358197] PM: freeze of devices complete after 115.888 msecs <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.358721] PM: late freeze of devices complete after 0.523 msecs <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.358723] Disabling non-boot CPUs ... <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.371764] CPU 1 is now offline <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.388218] CPU 2 is now offline <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.396075] Broke affinity for irq 18 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.397128] CPU 3 is now offline <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.397129] SMP alternatives: switching to UP code <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.401909] PM: Creating hibernation image: <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.402011] PM: Need to copy 290653 pages <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.402011] PM: Normal pages needed: 290653 + 1024, available pages: 1806127 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.402011] microcode: CPU0 updated to revision 0xa0b, date = 2010-09-28 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.402011] Enabling non-boot CPUs ... <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.403029] SMP alternatives: switching to SMP code <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.407496] Booting Node 0 Processor 1 APIC 0x1 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.510229] NMI watchdog enabled, takes one hw-pmu counter. <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.511260] microcode: CPU1 updated to revision 0xa0b, date = 2010-09-28 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.517024] CPU1 is up <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.517203] Booting Node 0 Processor 2 APIC 0x2 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.620330] NMI watchdog enabled, takes one hw-pmu counter. <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.621521] microcode: CPU2 updated to revision 0xa0b, date = 2010-09-28 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.626044] CPU2 is up <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.626200] Booting Node 0 Processor 3 APIC 0x3 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.729428] NMI watchdog enabled, takes one hw-pmu counter. <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.730586] microcode: CPU3 updated to revision 0xa0b, date = 2010-09-28 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.735021] CPU3 is up <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.737442] ata_piix 0000:00:1f.2: restoring config space at offset 0x1 (was 0x2b00003, writing 0x2b00007) <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.737478] ata_piix 0000:00:1f.5: restoring config space at offset 0x1 (was 0x2b00003, writing 0x2b00007) <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.737608] HDA Intel 0000:03:00.0: restoring config space at offset 0x1 (was 0x2900006, writing 0x2900002) <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.737657] ahci 0000:04:00.0: restoring config space at offset 0x1 (was 0x100003, writing 0x100007) <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.737711] pata_jmicron 0000:04:00.1: restoring config space at offset 0x1 (was 0x100001, writing 0x100005) <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.737758] ahci 0000:05:00.0: restoring config space at offset 0x1 (was 0x100003, writing 0x100007) <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.737811] pata_jmicron 0000:05:00.1: restoring config space at offset 0x1 (was 0x100001, writing 0x100005) <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.737855] r8169 0000:06:00.0: restoring config space at offset 0x1 (was 0x100007, writing 0x100407) <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.737922] r8169 0000:07:00.0: restoring config space at offset 0x1 (was 0x100007, writing 0x100407) <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.738118] PM: early restore of devices complete after 1.129 msecs <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801759] pci 0000:00:06.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801763] pci 0000:00:06.0: setting latency timer to 64 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801779] uhci_hcd 0000:00:1a.0: setting latency timer to 64 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801782] uhci_hcd 0000:00:1a.1: setting latency timer to 64 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801784] uhci_hcd 0000:00:1a.2: setting latency timer to 64 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801785] ehci_hcd 0000:00:1a.7: setting latency timer to 64 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801831] usb usb3: root hub lost power or was reset <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801835] usb usb4: root hub lost power or was reset <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801839] usb usb5: root hub lost power or was reset <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801864] usb usb1: root hub lost power or was reset <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801885] pci 0000:00:1c.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801888] pci 0000:00:1c.0: setting latency timer to 64 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801904] pci 0000:00:1c.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801907] uhci_hcd 0000:00:1d.0: setting latency timer to 64 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801909] pci 0000:00:1c.2: setting latency timer to 64 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801918] uhci_hcd 0000:00:1d.1: setting latency timer to 64 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801949] usb usb6: root hub lost power or was reset <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801963] uhci_hcd 0000:00:1d.2: setting latency timer to 64 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801969] usb usb7: root hub lost power or was reset <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.801986] ehci_hcd 0000:00:1d.7: setting latency timer to 64 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.802005] usb usb8: root hub lost power or was reset <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.802037] pci 0000:02:00.0: setting latency timer to 64 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.802055] HDA Intel 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.802058] pci 0000:00:1c.3: PCI INT D -> GSI 19 (level, low) -> IRQ 19 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.802059] usb usb2: root hub lost power or was reset <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.802067] pci 0000:00:1c.3: setting latency timer to 64 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.802115] pci 0000:00:1c.4: PCI INT A -> GSI 17 (level, low) -> IRQ 17 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.802120] pci 0000:00:1c.4: setting latency timer to 64 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.802128] pci 0000:00:1c.5: PCI INT B -> GSI 16 (level, low) -> IRQ 16 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.802135] pci 0000:00:1c.5: setting latency timer to 64 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.802146] pci 0000:00:1e.0: setting latency timer to 64 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805537] ata_piix 0000:00:1f.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805541] ata_piix 0000:00:1f.2: setting latency timer to 64 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805583] ata_piix 0000:00:1f.5: PCI INT B -> GSI 19 (level, low) -> IRQ 19 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805587] ata_piix 0000:00:1f.5: setting latency timer to 64 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805600] ahci 0000:04:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805606] ahci 0000:04:00.0: setting latency timer to 64 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805674] pata_jmicron 0000:04:00.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805686] pata_jmicron 0000:04:00.1: setting latency timer to 64 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805705] pata_jmicron 0000:05:00.1: PCI INT B -> GSI 16 (level, low) -> IRQ 16 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805711] pata_jmicron 0000:05:00.1: setting latency timer to 64 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805733] ahci 0000:05:00.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805738] ahci 0000:05:00.0: setting latency timer to 64 <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805766] ehci_hcd 0000:00:1a.7: cache line size of 32 is not supported <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805944] ehci_hcd 0000:00:1d.7: cache line size of 32 is not supported <0.5> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805957] sd 4:0:0:0: [sda] Starting disk <0.5> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.805991] sd 4:0:1:0: [sdb] Starting disk <0.5> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.806069] sd 5:0:0:0: [sdc] Starting disk <0.5> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.806101] sd 5:0:1:0: [sdd] Starting disk <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.809044] r8169 0000:07:00.0: eth1: link down <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.809161] r8169 0000:06:00.0: eth0: link down <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.811801] serial 00:06: activated <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.813692] i8042 kbd 00:08: wake-up capability disabled by ACPI <0.5> 2012-06-30 02:29:31 Telcontar kernel - - - [ 110.856107] firewire_core: skipped bus generations, destroying all nodes <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.110060] ata2: SATA link down (SStatus 0 SControl 300) <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.110065] ata4: SATA link down (SStatus 0 SControl 300) <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.110108] ata3: SATA link down (SStatus 0 SControl 300) <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.110110] ata1: SATA link down (SStatus 0 SControl 300) <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.124687] ata7: SATA link down (SStatus 0 SControl 300) <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.275070] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300) <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.278174] ata8.00: ACPI cmd ef/03:45:00:00:00:a0 (SET FEATURES) filtered out <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.278177] ata8.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) filtered out <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.278179] ata8.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.284165] ata8.00: configured for UDMA/100 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.291033] usb 3-1: reset low speed USB device using uhci_hcd and address 2 <0.5> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.357052] firewire_core: rediscovered device fw0 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.572070] ata6.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300) <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.572081] ata6.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300) <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.574085] ata5.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300) <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.574096] ata5.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300) <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.575129] ata6.01: ACPI cmd ef/03:45:00:00:00:b0 (SET FEATURES) filtered out <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.575131] ata6.01: ACPI cmd ef/03:0c:00:00:00:b0 (SET FEATURES) filtered out <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.575317] ata6.01: ACPI cmd c6/00:10:00:00:00:b0 (SET MULTIPLE MODE) succeeded <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.575320] ata6.01: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.577167] ata5.01: ACPI cmd ef/03:45:00:00:00:b0 (SET FEATURES) filtered out <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.577169] ata5.01: ACPI cmd ef/03:0c:00:00:00:b0 (SET FEATURES) filtered out <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.577337] ata5.01: ACPI cmd c6/00:10:00:00:00:b0 (SET MULTIPLE MODE) succeeded <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.577339] ata5.01: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.581129] ata6.00: ACPI cmd ef/03:45:00:00:00:a0 (SET FEATURES) filtered out <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.581131] ata6.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) filtered out <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.581303] ata6.00: ACPI cmd c6/00:10:00:00:00:a0 (SET MULTIPLE MODE) succeeded <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.581305] ata6.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.583133] ata5.00: ACPI cmd ef/03:45:00:00:00:a0 (SET FEATURES) filtered out <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.583135] ata5.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) filtered out <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.583305] ata5.00: ACPI cmd c6/00:10:00:00:00:a0 (SET MULTIPLE MODE) succeeded <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.583307] ata5.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.590308] ata6.00: configured for UDMA/133 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.592316] ata5.00: configured for UDMA/133 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.596321] ata6.01: configured for UDMA/133 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.598309] ata5.01: configured for UDMA/133 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 111.859534] usb 2-5: reset high speed USB device using ehci_hcd and address 2 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 112.226543] usb 8-2: reset low speed USB device using uhci_hcd and address 2 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 112.461534] r8169 0000:06:00.0: eth0: link up <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 112.578476] usb 2-5.4: reset high speed USB device using ehci_hcd and address 4 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 113.239961] snd-usb-audio 2-5.4:1.2: no reset_resume for driver snd-usb-audio? <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 113.239963] snd-usb-audio 2-5.4:1.3: no reset_resume for driver snd-usb-audio? <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 113.241832] PM: restore of devices complete after 2440.211 msecs <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 113.426043] ALSA sound/usb/clock.c:233: current rate 7630447 is different from the runtime rate 8000 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 113.604781] ALSA sound/usb/clock.c:233: current rate 7238995 is different from the runtime rate 11025 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 113.783391] ALSA sound/usb/clock.c:233: current rate 6647919 is different from the runtime rate 16000 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 113.962127] ALSA sound/usb/clock.c:233: current rate 7286387 is different from the runtime rate 22050 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 114.140863] ALSA sound/usb/clock.c:233: current rate 6447461 is different from the runtime rate 24000 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 114.319599] ALSA sound/usb/clock.c:233: current rate 6370404 is different from the runtime rate 32000 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 114.498336] ALSA sound/usb/clock.c:233: current rate 7282791 is different from the runtime rate 44100 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 114.677198] ALSA sound/usb/clock.c:233: current rate 7368494 is different from the runtime rate 48000 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 115.372435] I was here: 1.3 <0.6> 2012-06-30 02:29:31 Telcontar kernel - - - [ 115.375907] I was here: 1.4 <0.4> 2012-06-30 02:29:31 Telcontar kernel - - - [ 115.411016] Restarting tasks ... done. <0.7> 2012-06-30 02:29:31 Telcontar kernel - - - [ 115.458959] PM: Basic memory bitmaps freed <5.4> 2012-06-30 02:29:32 Telcontar pm-utils - - - Thawing (95)... <0.6> 2012-06-30 02:29:32 Telcontar kernel - - - [ 116.274092] input: Power Button as /devices/LNXSYSTM:00/device:00/PNP0C0C:00/input/input6 <0.6> 2012-06-30 02:29:32 Telcontar kernel - - - [ 116.274189] ACPI: Power Button [PWRB] <0.6> 2012-06-30 02:29:32 Telcontar kernel - - - [ 116.274233] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input7 <0.6> 2012-06-30 02:29:32 Telcontar kernel - - - [ 116.274279] ACPI: Power Button [PWRF] <5.4> 2012-06-30 02:29:32 Telcontar pm-utils - - - Thawing (1)... - -- Cheers, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) iEYEARECAAYFAk/uTp0ACgkQtTMYHG2NR9XLhwCfbeCqtsKT/C2ouxcpiW1p9DDP YsgAnien0557bMe3TffMuePif72kk1c4 =L47L -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

01:35

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-06-30 02:55, Carlos E. R. wrote:

...

I rebooted that kernel, and tried one hibernation, sucessfull. I saw a lot of messages going by that I had never seen - those could be captured with a serial port, I suppose.

I tried again - the messages can be stopped with ctrl-s, and read at will. pg-up/dn works, continue with ctrl-q. I can thus make a photo if needed, and anyway, those are the working time messages, I need those when it fails. Or perhaps I need both, I'll make a photo then. Using the serial port is possible but a nuisance. I need to connect my old computer, free space in the table for its keyboard, remember to power up the old machine first, and when fully running boot the main machine. And this for several days, till it crashes and I get the data, supposedly. - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/uV9UACgkQIvFNjefEBxrELwCg0urzQl7q6LdhP39SqAByl/y/ eEwAoNsMb+/U/f/5apGwrRWv4YfSvokj =voxl -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

David Haller

02:52

Hello, On Sat, 30 Jun 2012, Carlos E. R. wrote:

...

On Saturday, 2012-06-30 at 02:04 +0200, Carlos E. R. wrote:

...
I'm now compiling a kernel based on the desktop variant with some printk messages in power/user.c.

I installed that kernel.

It uses 1.7 gigabytes in /lib/modules, and the sources expanded to 7 GiB. I don't know why is that. Debug info? (see below).

Probably. A normal -default kernel (3.1.10) uses 139M in /lib/modules. Check it with e.g.: file $(/sbin/modinfo -F filename cdrom) (works as user too). If the output contains the words 'not stripped', then debug stuff is still in there. Weird. My cdrom.ko and btrfs.ko are also not stripped. Have you checked with du -hs /lib/modules/$(uname -r) as your's seems one order of magnitude too large? A built kernel-tree does take up a _LOT_ of space, but not that much ;)

...

I rebooted that kernel, and tried one hibernation, sucessfull. I saw a lot of messages going by that I had never seen - those could be captured with a serial port, I suppose.

On thawing I see some of those, I'm not sure. I do see some of my new printk statements (I was here: x.y)

Great :)

...

These are the messagges in the log - they are saved to file recovered after thawing, except those printed by s2disk which are lost. I don't know the exact point where thawing starts in the log:

I'm guessing ...[1] [..] [ 110.397128] CPU 3 is now offline [2] ... or here. [ 110.397129] SMP alternatives: switching to UP code [ 110.401909] PM: Creating hibernation image: [ 110.402011] PM: Need to copy 290653 pages [ 110.402011] PM: Normal pages needed: 290653 + 1024, available pages: 1806127 [1] HERE ...[2] [ 110.402011] microcode: CPU0 updated to revision 0xa0b, date = 2010-09-28 [ 110.402011] Enabling non-boot CPUs ... [ 110.403029] SMP alternatives: switching to SMP code [ 110.407496] Booting Node 0 Processor 1 APIC 0x1 [ 110.510229] NMI watchdog enabled, takes one hw-pmu counter. [ 110.511260] microcode: CPU1 updated to revision 0xa0b, date = 2010-09-28 [ 110.517024] CPU1 is up [ 110.517203] Booting Node 0 Processor 2 APIC 0x2 [ 110.620330] NMI watchdog enabled, takes one hw-pmu counter. [ 110.621521] microcode: CPU2 updated to revision 0xa0b, date = 2010-09-28 [ 110.626044] CPU2 is up [ 110.626200] Booting Node 0 Processor 3 APIC 0x3 [ 110.729428] NMI watchdog enabled, takes one hw-pmu counter. [ 110.730586] microcode: CPU3 updated to revision 0xa0b, date = 2010-09-28 [ 110.735021] CPU3 is up [ 110.737442] ata_piix 0000:00:1f.2: restoring config space at offset 0x1 (was 0x2b00003, writing 0x2b00007) I'd have to look where that "switching to UP code" is ;) Waiting what you get when it gets stuck. BTW: have you had a look at the 'smartctl -A' output for the device(s) where your swap is on? -dnh -- "Where's the kaboom? There was supposed to be an Earth-shattering kaboom!" -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

10:58

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-06-30 04:52, David Haller wrote:

...

...
It uses 1.7 gigabytes in /lib/modules, and the sources expanded to 7 GiB. I don't know why is that. Debug info? (see below).

Probably. A normal -default kernel (3.1.10) uses 139M in /lib/modules. Check it with e.g.:

file $(/sbin/modinfo -F filename cdrom)

(works as user too). If the output contains the words 'not stripped', then debug stuff is still in there. Weird.

That is so, non stripped. cer@Telcontar:~> file $(/sbin/modinfo -F filename cdrom) /lib/modules/2.6.37.6-0.20-desktop-cer/kernel/drivers/cdrom/cdrom.ko: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped

...

My cdrom.ko and btrfs.ko are also not stripped. Have you checked with du -hs /lib/modules/$(uname -r) as your's seems one order of magnitude too large?

Heh! :-) Look: cer@Telcontar:~> file $(/sbin/modinfo -F filename cdrom) /lib/modules/2.6.37.6-0.20-desktop-cer/kernel/drivers/cdrom/cdrom.ko: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped cer@Telcontar:~> du -hs /lib/modules/$(uname -r) 1,8G /lib/modules/2.6.37.6-0.20-desktop-cer cer@Telcontar:~> du -hs /usr/src/linux 0 /usr/src/linux cer@Telcontar:~> du -hs /usr/src/linux/ 7,1G /usr/src/linux/ cer@Telcontar:~> du -hs /usr/src/linux/ 22 minutes to build, on a fast quad core2 machine (Q9550 @ 2.83GHz)

...

A built kernel-tree does take up a _LOT_ of space, but not that much ;)

It does for me and others. I have not investigated. I do: #!/bin/bash echo "-- cloneconfig --" && make cloneconfig && echo "-- scripts --" && make scripts && echo "-- prepare --" && make prepare && echo "-- Done good! --" then make menuconfig, change the kernel name to -desktop-cer to use a different modules tree, then: echo "-- hacer --" && make --jobs=4 && echo "-- modulos --" && make modules_install && echo "-- install --" && make install && echo "-- Done good! --"

...

I'd have to look where that "switching to UP code" is ;) Waiting what you get when it gets stuck.

It got stuck today, but on restore - that is not usual, but I made a photo. I can not post it now, I have to go shopping two things before they close.

...

BTW: have you had a look at the 'smartctl -A' output for the device(s) where your swap is on?

Of course, one of first things I did. Did the long test too, no problems. I also reformatted the swap partition. - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/u29gACgkQIvFNjefEBxqCFwCbBadxVbCmMZD4rU/r9x11MW6I OnsAniTh+HuCi3Z894fqOyHCAcvAF/Ej =3WIV -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

17:14

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-06-30 12:58, Carlos E. R. wrote:

...

It got stuck today, but on restore - that is not usual, but I made a photo. I can not post it now, I have to go shopping two things before they close.

Here goes: http://susepaste.org/16257523 I think it is messages from the restore, mixed with messages from the hibernation that are recovered later. The last messages before lock are: I was here: 2.8 I was here: 2.9 I was here: 1.3 I was here: 1.4 The messages are from - /usr/src/linux/kernel/power/hibernate.c: printk(KERN_INFO "I was here: 2.8\n"); resume_console(); Close: printk(KERN_INFO "I was here: 2.9\n"); platform_end(platform_mode); return error; and /usr/src/linux/kernel/power/user.c case SNAPSHOT_CREATE_IMAGE: printk(KERN_INFO "I was here: 1.0\n"); if (data->mode != O_RDONLY || !data->frozen || data->ready) { error = -EPERM; break; } printk(KERN_INFO "I was here: 1.1\n"); pm_restore_gfp_mask(); printk(KERN_INFO "I was here: 1.2\n"); error = hibernation_snapshot(data->platform_support); printk(KERN_INFO "I was here: 1.3\n"); if (!error) error = put_user(in_suspend, (int __user *)arg); if (!error) data->ready = 1; printk(KERN_INFO "I was here: 1.4\n"); break; printk(KERN_INFO "I was here: 1.2\n"); error = hibernation_snapshot(data->platform_support); printk(KERN_INFO "I was here: 1.3\n"); if (!error) error = put_user(in_suspend, (int __user *)arg); if (!error) data->ready = 1; printk(KERN_INFO "I was here: 1.4\n"); break; I probably need now to add messages to the restore phases... :-? - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/vNBAACgkQIvFNjefEBxrDIgCgwdq723IuXxAUQ09MLf8DZAkp nzgAoKzUsMqf7EPBxEbRhUXWcNVraD2y =/0i2 -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Malcolm

17:32

New subject: [opensuse] Re: Need help: I found out where s2disk crashes.

On Sat, 30 Jun 2012 19:14:56 +0200 "Carlos E. R." <robin.listas@telefonica.net> wrote:

...

http://susepaste.org/16257523

Hi You can use fbgrab from fbcat to capture a tty.... eg fbgrab -C 10 tty10.png -- Cheers Malcolm °¿° (Linux Counter #276890) SUSE Linux Enterprise Desktop 11 (x86_64) Kernel 3.0.34-0.7-default up 18:31, 2 users, load average: 0.27, 0.40, 0.40 CPU Intel i5 CPU M520@2.40GHz | Intel Arrandale GPU -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

17:46

New subject: [opensuse] Re: Need help: I found out where s2disk crashes.

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-06-30 19:32, Malcolm wrote:

...

On Sat, 30 Jun 2012 19:14:56 +0200 "Carlos E. R." <> wrote:

...

Hi You can use fbgrab from fbcat to capture a tty.... eg

fbgrab -C 10 tty10.png

On a crashed machine? :-o - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/vO2YACgkQIvFNjefEBxpoPgCgm9/CZZe+tX+E3Q5XAnN8/jM2 M1sAoKAT4qRvp/Dq0563qtiNMud4VLwO =ZXER -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Malcolm

18:40

New subject: [opensuse] Re: Need help: I found out where s2disk crashes.

On Sat, 30 Jun 2012 19:46:14 +0200 "Carlos E. R." <robin.listas@telefonica.net> wrote:

...

On 2012-06-30 19:32, Malcolm wrote:

...
On Sat, 30 Jun 2012 19:14:56 +0200 "Carlos E. R." <> wrote:

...
Hi You can use fbgrab from fbcat to capture a tty.... eg

fbgrab -C 10 tty10.png

On a crashed machine? :-o

- -- Cheers / Saludos,

Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar)

Hi No, add it into your code so it executes....? -- Cheers Malcolm °¿° (Linux Counter #276890) SUSE Linux Enterprise Desktop 11 (x86_64) Kernel 3.0.34-0.7-default up 19:41, 2 users, load average: 0.72, 0.57, 0.57 CPU Intel i5 CPU M520@2.40GHz | Intel Arrandale GPU -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

21:11

New subject: [opensuse] Re: Need help: I found out where s2disk crashes.

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Saturday, 2012-06-30 at 13:40 -0500, Malcolm wrote:

...

...
...
fbgrab -C 10 tty10.png

On a crashed machine? :-o

Hi No, add it into your code so it executes....?

That's way beyond my capabilities. It has to be inserted inside the kernel, and writing to disk is forbidden beyond certain point in the hibernation process - and that's the part of the log that is interesting. - -- Cheers, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) iEYEARECAAYFAk/va3sACgkQtTMYHG2NR9X2HACcC5fres8LXKF6xly0SzlSh4WC bosAnRrNAu0QHLhB7v2DUr+FXF2rdFD1 =/EKN -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

2 Jul 2 Jul

11:01

New subject: [opensuse] Need help: I found out where s2disk crashes - no_console_suspend?

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Saturday, 2012-06-30 at 04:52 +0200, David Haller wrote:

...

I'd have to look where that "switching to UP code" is ;) Waiting what you get when it gets stuck.

I finally got a crash... and absolutely no data, black screen :-(( At some point this is printed: I was here: 2.2 Suspending console(s) (use no_console_suspend to debug) And there is a video mode change or something so that the screen is cleared, and messages start to flow again. It got stuck precisely after the screen was cleared and nothing could be read from the screen. Now I need to know where I enable that "no_console_suspend" And this is hapening with the nouveau driver. No tainting. - -- Cheers, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) iEYEARECAAYFAk/xf4IACgkQtTMYHG2NR9U5tgCbBy+bq5QD6XoSitMuF/9zNQ9l Sb0An3AiDV4/YpIq11jkLdQAZfu59lup =PgVb -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

11:28

New subject: [opensuse] Need help: I found out where s2disk crashes - no_console_suspend?

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-07-02 13:01, Carlos E. R. wrote:

...

Now I need to know where I enable that "no_console_suspend"

Found it here: https://wiki.ubuntu.com/Kernel/KernelDebuggingTricks - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/xhfgACgkQIvFNjefEBxodeACfTeINPRKbqY174eyAKNpkkXhl UdgAnRrMhSaJI+Tm+AfhE8ezDEmpydov =MgBi -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

3 Jul 3 Jul

11:11

New subject: [opensuse] Need help: I found out where s2disk crashes - no_console_suspend?

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-07-02 13:01, Carlos E. R. wrote:

...

I finally got a crash... and absolutely no data, black screen :-((

And another, on restore, with photos taken and posted to the bugzilla. Still no answer there. - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/y00kACgkQIvFNjefEBxrrOACeNcLEbpfJw65QA0HjLSRyxAcm U5AAoM+Q8OdplgCkFVlvP8KgRWaZUlK5 =PFaC -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

6 Jul 6 Jul

09:28

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Saturday, 2012-06-30 at 04:52 +0200, David Haller wrote:

...

On Sat, 30 Jun 2012, Carlos E. R. wrote:

...

...
On thawing I see some of those, I'm not sure. I do see some of my new printk statements (I was here: x.y)

Great :)

...

I'd have to look where that "switching to UP code" is ;) Waiting what you get when it gets stuck.

Well, that part is done, I got a crash on video. I could not upload it, though: it is 30 megs and bugzilla has a limit of 10 :-/ I uploaded the last photo instead. https://bugzilla.novell.com/show_bug.cgi?id=765084#c22 https://bugzilla.novell.com/attachment.cgi?id=497611 It displays my "I was here: 2.4", then a few more kernel messages about ata mostly. The "I was here: 2.5" that I see in a normal hibernation is missing from the output. The last line is something about the ethernet card having the link up (the video is as best as I can make, with a tripod for stability, but not enough resolution). Now what, wait for a year for someone to respond to the bugzilla when it is obsolete and they tell me to upgrade :-/ - -- Cheers, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) iEYEARECAAYFAk/2r7AACgkQtTMYHG2NR9VXHgCfWpd30l5ioZZHdCZVfUpyfL8L W04AnisIPLo046ChoKtWQB+n4CIEegbm =C1l7 -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Dave Howorth

10:57

Carlos E. R. wrote:

...

Well, that part is done, I got a crash on video. I could not upload it, though: it is 30 megs and bugzilla has a limit of 10 :-/

I uploaded the last photo instead.

https://bugzilla.novell.com/show_bug.cgi?id=765084#c22 https://bugzilla.novell.com/attachment.cgi?id=497611

FWIW, that photo is pretty much completely illegible. I don't know if it was that way when you uploaded it, or bugzilla has made it so, or my browser has done the evil deed. Is a photo the only way to access the console output? Can you upload the photo and/or video anywhere else?

...

It displays my "I was here: 2.4", then a few more kernel messages about ata mostly. The "I was here: 2.5" that I see in a normal hibernation is missing from the output.

printk(KERN_INFO "I was here: 2.4\n"); error = dpm_suspend_start(PMSG_FREEZE); if (error) goto Recover_platform; printk(KERN_INFO "I was here: 2.5\n"); Presumably you don't see 2.10? So it calls dpm_suspend_start(PMSG_FREEZE) and doesn't return? So what is inside dpm_suspend_start() and where does that fail? -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Per Jessen

12:21

Dave Howorth wrote:

...

Carlos E. R. wrote:

...
Well, that part is done, I got a crash on video. I could not upload it, though: it is 30 megs and bugzilla has a limit of 10 :-/

I uploaded the last photo instead.

https://bugzilla.novell.com/show_bug.cgi?id=765084#c22 https://bugzilla.novell.com/attachment.cgi?id=497611

FWIW, that photo is pretty much completely illegible. I don't know if it was that way when you uploaded it, or bugzilla has made it so, or my browser has done the evil deed.

The image is only 640x480 - if enlarge it in GIMP, it becomes a little more readable, but a higher resolution would perhaps help.

...

Is a photo the only way to access the console output?

It seems like it ought to be possible to capture the output with a serial console and minicom? -- Per Jessen, Zürich (20.1°C) -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

13:16

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-07-06 14:21, Per Jessen wrote:

...

Dave Howorth wrote:

...

The image is only 640x480 - if enlarge it in GIMP, it becomes a little more readable, but a higher resolution would perhaps help.

It is the most that the video camera can make. In the small display of the camera it seemed alright, but it wasn't till I uploaded it to the computer (after booting it) that I could see the real quality. I can wait till the next crash and make a photo instead. I made a video because there is a moment that the computer prints something and it blinks and redraws a different screen, so it is difficult to catch. I'll try find dpm_suspend_start(). [...] Found it, in /usr/src/linux-2.6.37.6-0.20/drivers/base/power/main.c: /** * dpm_suspend_start - Prepare devices for PM transition and suspend them. * @state: PM transition of the system being carried out. * * Prepare all non-sysdev devices for system PM transition and execute "suspend" * callbacks for them. */ int dpm_suspend_start(pm_message_t state) { int error; might_sleep(); error = dpm_prepare(state); if (!error) error = dpm_suspend(state); return error; } EXPORT_SYMBOL_GPL(dpm_suspend_start);

...

...
Is a photo the only way to access the console output?

It seems like it ought to be possible to capture the output with a serial console and minicom?

It is quite complicated, having it active for a week. If there is no other method I'll try. - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/25R0ACgkQIvFNjefEBxpUCwCfS6uuCkkhwbTx/3xPng+NH4+z WOQAoMxHHtYSTBBGQYYMrc7WQEdIjNed =TMZm -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Dave Howorth

13:57

Carlos E. R. wrote:

...

I'll try find dpm_suspend_start(). [...] Found it, in /usr/src/linux-2.6.37.6-0.20/drivers/base/power/main.c:

/** * dpm_suspend_start - Prepare devices for PM transition and suspend them. * @state: PM transition of the system being carried out. * * Prepare all non-sysdev devices for system PM transition and execute "suspend" * callbacks for them. */ int dpm_suspend_start(pm_message_t state) { int error;

might_sleep(); error = dpm_prepare(state); if (!error) error = dpm_suspend(state); return error; } EXPORT_SYMBOL_GPL(dpm_suspend_start);

Ugh! Yet more function calls. Yet more printk I suppose :( -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

19:01

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Friday, 2012-07-06 at 14:57 +0100, Dave Howorth wrote:

...

Ugh! Yet more function calls.

Yet more printk I suppose :(

Yes... I compiled them already, a few there and a few more in dpm_prepare(). /** * dpm_prepare - Prepare all non-sysdev devices for a system PM transition. * @state: PM transition of the system being carried out. * * Execute the ->prepare() callback(s) for all devices. */ static int dpm_prepare(pm_message_t state) { struct list_head list; int error = 0; printk(KERN_INFO "I was here: 5.0\n"); INIT_LIST_HEAD(&list); printk(KERN_INFO "I was here: 5.1\n"); mutex_lock(&dpm_list_mtx); printk(KERN_INFO "I was here: 5.2\n"); transition_started = true; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.next); get_device(dev); dev->power.status = DPM_PREPARING; mutex_unlock(&dpm_list_mtx); pm_runtime_get_noresume(dev); if (pm_runtime_barrier(dev) && device_may_wakeup(dev)) { /* Wake-up requested during system sleep transition. */ pm_runtime_put_sync(dev); error = -EBUSY; } else { error = device_prepare(dev, state); } mutex_lock(&dpm_list_mtx); if (error) { dev->power.status = DPM_ON; if (error == -EAGAIN) { put_device(dev); error = 0; continue; } printk(KERN_ERR "PM: Failed to prepare device %s " "for power transition: error %d\n", kobject_name(&dev->kobj), error); put_device(dev); break; } dev->power.status = DPM_SUSPENDING; if (!list_empty(&dev->power.entry)) list_move_tail(&dev->power.entry, &list); put_device(dev); } printk(KERN_INFO "I was here: 5.10\n"); list_splice(&list, &dpm_list); printk(KERN_INFO "I was here: 5.11\n"); mutex_unlock(&dpm_list_mtx); return error; } /** * dpm_suspend_start - Prepare devices for PM transition and suspend them. * @state: PM transition of the system being carried out. * * Prepare all non-sysdev devices for system PM transition and execute "suspend" * callbacks for them. */ int dpm_suspend_start(pm_message_t state) { int error; printk(KERN_INFO "I was here: 4.0\n"); might_sleep(); printk(KERN_INFO "I was here: 4.1\n"); error = dpm_prepare(state); printk(KERN_INFO "I was here: 4.2\n"); if (!error) error = dpm_suspend(state); printk(KERN_INFO "I was here: 4.3\n"); return error; } EXPORT_SYMBOL_GPL(dpm_suspend_start); I haven't hibernated yet... I'll try now. [...] Done two cycles, no crash. Got two photos for comparison to the logs. I will also mark in this sucessfull log the point of failure in the failed session I posted earlier. http://susepaste.org/70504465 http://susepaste.org/25591075 <5.4> 2012-07-06 19:51:27 Telcontar pm-utils - - - Hibernating (1)... <5.4> 2012-07-06 19:51:29 Telcontar pm-utils - - - Hibernating (95)... These are markers I put in the scripts. <5.4> 2012-07-06 19:51:33 Telcontar router - - - (Thawing 1) Logging the current IP= 83.37.57.120 <0.7> 2012-07-06 19:51:34 Telcontar kernel - - - [ 787.702275] PM: Marking nosave pages: 000000000009f000 - 0000000000100000 <0.7> 2012-07-06 19:51:34 Telcontar kernel - - - [ 787.702280] PM: Marking nosave pages: 00000000bff90000 - 0000000100000000 <0.7> 2012-07-06 19:51:34 Telcontar kernel - - - [ 787.702907] PM: Basic memory bitmaps created <0.4> 2012-07-06 19:51:35 Telcontar kernel - - - [ 788.398380] Syncing filesystems ... done. <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.014785] Freezing user space processes ... (elapsed 0.01 seconds) done. <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.027339] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done. <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.039878] I was here: 1.0 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.040392] I was here: 1.1 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.040908] I was here: 1.2 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.041424] I was here: 2.0 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.041927] I was here: 2.1 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.042427] PM: Preallocating image memory... done (allocated 718651 pages) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.321245] PM: Allocated 2874604 kbytes in 0.27 seconds (10646.68 MB/s) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.321730] I was here: 2.2 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.322233] I was here: 2.3 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.322720] I was here: 2.4 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.323203] I was here: 4.0 <== dpm_suspend_start() <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.323679] I was here: 4.1 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.324161] I was here: 5.0 <== dpm_prepare() <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.324637] I was here: 5.1 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.325111] I was here: 5.2 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.325877] I was here: 5.10 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.326094] I was here: 5.11 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.326519] I was here: 4.2 <0.5> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.327314] sd 5:0:1:0: [sdd] Synchronizing SCSI cache <0.5> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.327333] sd 5:0:0:0: [sdc] Synchronizing SCSI cache <0.5> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.327428] sd 4:0:1:0: [sdb] Synchronizing SCSI cache <0.5> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.327437] sd 4:0:0:0: [sda] Synchronizing SCSI cache <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.327641] i8042 kbd 00:08: wake-up capability enabled by ACPI <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.327833] serial 00:06: disabled Is that the serial port being disabled? <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.328067] HDA Intel 0000:03:00.0: PCI INT A disabled <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.328069] pata_jmicron 0000:05:00.1: PCI INT B disabled <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.328079] [drm] nouveau 0000:01:00.0: Disabling fbcon acceleration... <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.328081] [drm] nouveau 0000:01:00.0: Unpinning framebuffer(s)... <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.328120] [drm] nouveau 0000:01:00.0: Evicting buffers... <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.328125] ACPI handle has no context! <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.328268] ahci 0000:04:00.0: PCI INT A disabled <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.328392] ata_piix 0000:00:1f.5: PCI INT B disabled <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.328524] ahci 0000:05:00.0: PCI INT A disabled <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.328619] pata_jmicron 0000:04:00.1: PCI INT B disabled <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.329045] ACPI handle has no context! <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.350337] [drm] nouveau 0000:01:00.0: Idling channels... <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.350521] [drm] nouveau 0000:01:00.0: Suspending GPU objects... <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.453240] ata_piix 0000:00:1f.2: PCI INT B disabled <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.794931] [drm] nouveau 0000:01:00.0: And we're gone! At this point, as can be seen in a video, the process stops for about 3 seconds. Then the display blinks, and lines start appearing again, but not from that exact point, but earlier (around 4.2). The line above "And we're gone!" is in the photo. Then it starts with things about the ata, which doesn't show in the log below. Instead of the "4.3" point, the photo shows "2.0, 2.9, 1.3, 1.4, and back to s2disk (steps 1.3) Suspicion: the next part of the log might be from restore cycle, in the middle of messages of the hibernation cycle. Section not in the photo. <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.796287] PM: freeze of devices complete after 469.327 msecs <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.796290] I was here: 4.3 <=== end of "dpm_suspend_start()". <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.796292] I was here: 2.5 <=== hibernation_snapshot() <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.796851] PM: late freeze of devices complete after 0.557 msecs <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.796854] Disabling non-boot CPUs ... <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.810385] CPU 1 is now offline <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.826253] CPU 2 is now offline <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.842112] CPU 3 is now offline <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.842116] SMP alternatives: switching to UP code <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.846997] PM: Creating hibernation image: <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.847000] PM: Need to copy 718329 pages <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.847000] PM: Normal pages needed: 718329 + 1024, available pages: 1378451 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.847000] microcode: CPU0 updated to revision 0xa0b, date = 2010-09-28 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.847000] Enabling non-boot CPUs ... <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.848017] SMP alternatives: switching to SMP code <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.852554] Booting Node 0 Processor 1 APIC 0x1 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.955229] NMI watchdog enabled, takes one hw-pmu counter. <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.956266] microcode: CPU1 updated to revision 0xa0b, date = 2010-09-28 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.962017] CPU1 is up <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 789.962197] Booting Node 0 Processor 2 APIC 0x2 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.065373] NMI watchdog enabled, takes one hw-pmu counter. <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.066467] microcode: CPU2 updated to revision 0xa0b, date = 2010-09-28 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.071055] CPU2 is up <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.071241] Booting Node 0 Processor 3 APIC 0x3 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.174467] NMI watchdog enabled, takes one hw-pmu counter. <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.175554] microcode: CPU3 updated to revision 0xa0b, date = 2010-09-28 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.180016] CPU3 is up <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.182044] pci 0000:00:00.0: restoring config space at offset 0x1 (was 0x20900006, writing 0x900006) <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.182457] ata_piix 0000:00:1f.2: restoring config space at offset 0x1 (was 0x2b00003, writing 0x2b00007) <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.182496] ata_piix 0000:00:1f.5: restoring config space at offset 0x1 (was 0x2b00003, writing 0x2b00007) <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.182628] HDA Intel 0000:03:00.0: restoring config space at offset 0x1 (was 0x2900006, writing 0x2900002) <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.182680] ahci 0000:04:00.0: restoring config space at offset 0x1 (was 0x100003, writing 0x100007) <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.182736] pata_jmicron 0000:04:00.1: restoring config space at offset 0x1 (was 0x100001, writing 0x100005) <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.182787] ahci 0000:05:00.0: restoring config space at offset 0x1 (was 0x100003, writing 0x100007) <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.182842] pata_jmicron 0000:05:00.1: restoring config space at offset 0x1 (was 0x100001, writing 0x100005) <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.182890] r8169 0000:06:00.0: restoring config space at offset 0x1 (was 0x100007, writing 0x100407) <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.182959] r8169 0000:07:00.0: restoring config space at offset 0x1 (was 0x100007, writing 0x100407) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.183162] PM: early restore of devices complete after 1.176 msecs <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.183186] I was here: 2.6 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.308754] I was here: 2.7 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.308923] uhci_hcd 0000:00:1a.0: setting latency timer to 64 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.308927] pci 0000:00:06.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.308930] pci 0000:00:06.0: setting latency timer to 64 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.308936] uhci_hcd 0000:00:1a.1: setting latency timer to 64 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.308976] usb usb4: root hub lost power or was reset <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.308978] usb usb3: root hub lost power or was reset <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.308992] uhci_hcd 0000:00:1a.2: setting latency timer to 64 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.308994] ehci_hcd 0000:00:1a.7: setting latency timer to 64 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309027] usb usb5: root hub lost power or was reset <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309045] pci 0000:00:1c.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309051] pci 0000:00:1c.0: setting latency timer to 64 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309058] pci 0000:00:1c.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309064] pci 0000:00:1c.2: setting latency timer to 64 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309071] pci 0000:00:1c.3: PCI INT D -> GSI 19 (level, low) -> IRQ 19 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309073] usb usb1: root hub lost power or was reset <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309079] pci 0000:00:1c.3: setting latency timer to 64 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309087] pci 0000:00:1c.4: PCI INT A -> GSI 17 (level, low) -> IRQ 17 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309092] pci 0000:00:1c.4: setting latency timer to 64 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309099] pci 0000:00:1c.5: PCI INT B -> GSI 16 (level, low) -> IRQ 16 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309104] pci 0000:00:1c.5: setting latency timer to 64 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309112] uhci_hcd 0000:00:1d.0: setting latency timer to 64 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309152] uhci_hcd 0000:00:1d.1: setting latency timer to 64 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309153] usb usb6: root hub lost power or was reset <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309167] uhci_hcd 0000:00:1d.2: setting latency timer to 64 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309203] usb usb7: root hub lost power or was reset <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309217] usb usb8: root hub lost power or was reset <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309225] ehci_hcd 0000:00:1d.7: setting latency timer to 64 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309235] pci 0000:00:1e.0: setting latency timer to 64 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309247] ata_piix 0000:00:1f.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309253] ata_piix 0000:00:1f.2: setting latency timer to 64 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309255] usb usb2: root hub lost power or was reset <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309323] ata_piix 0000:00:1f.5: PCI INT B -> GSI 19 (level, low) -> IRQ 19 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309330] ata_piix 0000:00:1f.5: setting latency timer to 64 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309343] [drm] nouveau 0000:01:00.0: We're back, enabling device... <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309352] nouveau 0000:01:00.0: setting latency timer to 64 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309357] [drm] nouveau 0000:01:00.0: POSTing device... <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.309360] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 0 at offset 0xD2F4 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.310048] pci 0000:02:00.0: setting latency timer to 64 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.310091] HDA Intel 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.310125] ahci 0000:04:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.310135] ahci 0000:04:00.0: setting latency timer to 64 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.310150] pata_jmicron 0000:04:00.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.310157] pata_jmicron 0000:04:00.1: setting latency timer to 64 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.310174] ahci 0000:05:00.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.310181] ahci 0000:05:00.0: setting latency timer to 64 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.310197] pata_jmicron 0000:05:00.1: PCI INT B -> GSI 16 (level, low) -> IRQ 16 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.310205] pata_jmicron 0000:05:00.1: setting latency timer to 64 <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.313127] ehci_hcd 0000:00:1d.7: cache line size of 32 is not supported <0.5> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.313278] sd 4:0:0:0: [sda] Starting disk <0.5> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.313316] sd 4:0:1:0: [sdb] Starting disk <0.5> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.313331] sd 5:0:0:0: [sdc] Starting disk <0.5> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.313345] sd 5:0:1:0: [sdd] Starting disk <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.314012] ehci_hcd 0000:00:1a.7: cache line size of 32 is not supported <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.314012] ehci_hcd 0000:00:1a.7: cache line size of 32 is not supported <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.315916] serial 00:06: activated <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.317319] i8042 kbd 00:08: wake-up capability disabled by ACPI <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.319053] r8169 0000:06:00.0: eth0: link down <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.319085] r8169 0000:07:00.0: eth1: link down <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.338135] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 1 at offset 0xD741 <0.5> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.366075] firewire_core: skipped bus generations, destroying all nodes <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.398021] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 2 at offset 0xE791 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.398049] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 3 at offset 0xE8B2 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.400764] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 4 at offset 0xEB22 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.400768] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table at offset 0xEB87 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.421071] [drm] nouveau 0000:01:00.0: Reinitialising engines... <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.422786] [drm] nouveau 0000:01:00.0: Restoring GPU objects... <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.437704] [drm] nouveau 0000:01:00.0: Restoring mode... The photo continues from here, next to the " And we're gone!" message above. But messages from hibernation and thawing can bot be mixed in the written log, which is confusing. (You can also compare the log below here to the bad quality photo: it can not be read, but lines can be identified as the same.) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.618039] ata3: SATA link down (SStatus 0 SControl 300) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.618285] ata1: SATA link down (SStatus 0 SControl 300) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.618723] ata2: SATA link down (SStatus 0 SControl 300) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.619198] ata4: SATA link down (SStatus 0 SControl 300) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.632642] ata7: SATA link down (SStatus 0 SControl 300) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.772051] ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.791125] ata8.00: ACPI cmd ef/03:45:00:00:00:a0 (SET FEATURES) filtered out <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.791327] ata8.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) filtered out <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.791769] ata8.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.798143] ata8.00: configured for UDMA/100 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.801022] usb 3-1: reset low speed USB device using uhci_hcd and address 2 <0.5> 2012-07-06 19:54:07 Telcontar kernel - - - [ 790.867036] firewire_core: rediscovered device fw0 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.076084] ata5.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.076307] ata5.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.079122] ata5.01: ACPI cmd ef/03:45:00:00:00:b0 (SET FEATURES) filtered out <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.079329] ata5.01: ACPI cmd ef/03:0c:00:00:00:b0 (SET FEATURES) filtered out <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.079951] ata5.01: ACPI cmd c6/00:10:00:00:00:b0 (SET MULTIPLE MODE) succeeded <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.080237] ata5.01: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.086111] ata6.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.086326] ata6.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.086742] ata5.00: ACPI cmd ef/03:45:00:00:00:a0 (SET FEATURES) filtered out <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.087071] ata5.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) filtered out <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.087605] ata5.00: ACPI cmd c6/00:10:00:00:00:a0 (SET MULTIPLE MODE) succeeded <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.087804] ata5.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.089123] ata6.01: ACPI cmd ef/03:45:00:00:00:b0 (SET FEATURES) filtered out <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.089313] ata6.01: ACPI cmd ef/03:0c:00:00:00:b0 (SET FEATURES) filtered out <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.089765] ata6.01: ACPI cmd c6/00:10:00:00:00:b0 (SET MULTIPLE MODE) succeeded <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.089950] ata6.01: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.096128] ata6.00: ACPI cmd ef/03:45:00:00:00:a0 (SET FEATURES) filtered out <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.096316] ata6.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) filtered out <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.096782] ata6.00: ACPI cmd c6/00:10:00:00:00:a0 (SET MULTIPLE MODE) succeeded <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.096971] ata6.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.097582] ata5.00: configured for UDMA/133 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.103314] ata5.01: configured for UDMA/133 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.106304] ata6.00: configured for UDMA/133 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.112341] ata6.01: configured for UDMA/133 (this is the last but one line in the bad photo of crashed session; next two are missing) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.521080] usb 8-2: reset low speed USB device using uhci_hcd and address 2 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 791.902051] usb 2-5: reset high speed USB device using ehci_hcd and address 2 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 792.002714] r8169 0000:06:00.0: eth0: link up (and that ^ is the line in the crashed session, the point of failure) <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 792.291119] usb 2-5.4: reset high speed USB device using ehci_hcd and address 4 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 792.953016] snd-usb-audio 2-5.4:1.2: no reset_resume for driver snd-usb-audio? <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 792.953214] snd-usb-audio 2-5.4:1.3: no reset_resume for driver snd-usb-audio? <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 792.955419] PM: restore of devices complete after 2646.655 msecs <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 793.139858] ALSA sound/usb/clock.c:233: current rate 16877 is different from the runtime rate 8000 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 793.318604] ALSA sound/usb/clock.c:233: current rate 16777215 is different from the runtime rate 11025 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 793.497350] ALSA sound/usb/clock.c:233: current rate 3750848 is different from the runtime rate 16000 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 793.676092] ALSA sound/usb/clock.c:233: current rate 0 is different from the runtime rate 22050 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 793.854963] ALSA sound/usb/clock.c:233: current rate 10748160 is different from the runtime rate 24000 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 794.033708] ALSA sound/usb/clock.c:233: current rate 13 is different from the runtime rate 32000 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 794.212455] ALSA sound/usb/clock.c:233: current rate 11469056 is different from the runtime rate 44100 <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 794.391200] ALSA sound/usb/clock.c:233: current rate 11469056 is different from the runtime rate 48000 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 794.577733] I was here: 2.8 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 794.577937] I was here: 2.9 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 794.578371] I was here: 1.3 <0.6> 2012-07-06 19:54:07 Telcontar kernel - - - [ 794.578797] I was here: 1.4 here the photo shows ", 2, 4" and step 1.3, 1.4 of s2disk, and the image is saved to disk. The next is from the thawing section. <0.4> 2012-07-06 19:54:07 Telcontar kernel - - - [ 794.601890] Restarting tasks ... done. <0.7> 2012-07-06 19:54:07 Telcontar kernel - - - [ 794.667017] PM: Basic memory bitmaps freed <5.4> 2012-07-06 19:54:07 Telcontar pm-utils - - - Thawing (95)... <5.4> 2012-07-06 19:54:07 Telcontar pm-utils - - - Thawing (1)... Another interesting "feature" seen in the photos is that in the second one is missing a section of the left side. This happened since it switched to the nouveau driver (it is noticiably slower, I see artifacts in the xterm as I type this). I think, seeing the log, that I need more printks inside dpm_suspend(). Or maybe not... I'm about lost. /** * dpm_suspend - Execute "suspend" callbacks for all non-sysdev devices. * @state: PM transition of the system being carried out. */ static int dpm_suspend(pm_message_t state) { struct list_head list; ktime_t starttime = ktime_get(); int error = 0; INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); pm_transition = state; async_error = 0; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.prev); get_device(dev); mutex_unlock(&dpm_list_mtx); error = device_suspend(dev); mutex_lock(&dpm_list_mtx); if (error) { pm_dev_err(dev, state, "", error); put_device(dev); break; } if (!list_empty(&dev->power.entry)) list_move(&dev->power.entry, &list); put_device(dev); if (async_error) break; } list_splice(&list, dpm_list.prev); mutex_unlock(&dpm_list_mtx); async_synchronize_full(); if (!error) error = async_error; if (!error) dpm_show_time(starttime, state, NULL); return error; } - -- Cheers, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) iEYEARECAAYFAk/3NhwACgkQtTMYHG2NR9VmvACfRce0tPUY+ZPquZBwvdIgSJ4R /skAn3Ob1R8/XpVpHY30HLN2Yute95GS =Adi6 -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

12 Jul 12 Jul

18:48

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Friday, 2012-07-06 at 21:01 +0200, Carlos E. R. wrote:

...

On Friday, 2012-07-06 at 14:57 +0100, Dave Howorth wrote:

...
Ugh! Yet more function calls.

Yet more printk I suppose :(

Yes... I compiled them already, a few there and a few more in dpm_prepare().

New problem: after several days, the machine refused to hibernate: Running hook /usr/lib/pm-utils/sleep.d/75modules hibernate hibernate: Unloading kernel module button...FATAL: Module nouveau is in use. /usr/lib/pm-utils/functions: line 89: log: command not found Failed. Unloading kernel module button...FATAL: Module nouveau is in use. /usr/lib/pm-utils/functions: line 89: log: command not found Failed. Would that be a new bug? Two, actually, one with the button thing, another with non-existing log function. I did say that I had to use the nvidia proprietary driver because of problems with nouveau and hibernation. Here goes one... :-( I had to halt the machine instead. More waiting days till next crash and report. - -- Cheers, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) iEYEARECAAYFAk//G+4ACgkQtTMYHG2NR9WeVQCfbgoc163lMc2tkekJSpnFcHSw T44An0RjZOD+Nl4LytOeRrcDR+hZacQp =l7IG -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Dave Howorth

13 Jul 13 Jul

10:23

Carlos E. R. wrote:

...

New problem: after several days, the machine refused to hibernate:

Running hook /usr/lib/pm-utils/sleep.d/75modules hibernate hibernate: Unloading kernel module button...FATAL: Module nouveau is in use. /usr/lib/pm-utils/functions: line 89: log: command not found Failed. Unloading kernel module button...FATAL: Module nouveau is in use. /usr/lib/pm-utils/functions: line 89: log: command not found Failed.

Would that be a new bug? Two, actually, one with the button thing, another with non-existing log function.

I did say that I had to use the nvidia proprietary driver because of problems with nouveau and hibernation. Here goes one... :-(

Well, at least it's an opportunity to bug report nouveau to get it improved :) It looks like the bug is upstream because there seem to be identical symptoms described here: https://bbs.archlinux.org/viewtopic.php?id=143461 -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

11:51

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Content-ID: <alpine.LNX.2.00.1207131345190.18250@Telcontar.valinor> On Friday, 2012-07-13 at 11:23 +0100, Dave Howorth wrote:

...

Well, at least it's an opportunity to bug report nouveau to get it improved :)

Heh! But it will not be as I'm reporting for 11.4. <https://bugzilla.novell.com/show_bug.cgi?id=771363> Bug 771363 - Hibernation fails if nouveau is in use. and: <https://bugzilla.novell.com/show_bug.cgi?id=771366> Bug 771366 - /usr/lib/pm-utils/sleep.d/75modules calls log, which does not exist. The weird thing is that was not happening the previous days. The only thing that changed was some unrelated (I think) updates: INSTALLTIME BUILDTIME Tue Jul 10 2012 Tue Feb 22 2011 xcircuit 3.6.132-2.1 Tue Jul 10 2012 Tue Jun 19 2012 php5 5.3.5-341.1 Tue Jul 10 2012 Wed Jun 20 2012 python-crypto 2.3-6.1 Tue Jul 10 2012 Tue Jun 19 2012 krb5-32bit 1.8.3-16.49.1 Tue Jul 10 2012 Mon Jul 02 2012 pm-utils 1.4.1-5.32.1 Tue Jul 10 2012 Tue Jun 19 2012 krb5 1.8.3-16.49.1 Tue Jul 10 2012 Tue Jun 19 2012 clamav 0.97.5-10.1 Tue Jul 10 2012 Wed Jun 20 2012 libtiff3 3.9.4-28.1 Tue Jul 10 2012 Wed Jun 20 2012 libtiff3-32bit 3.9.4-28.1 Tue Jul 10 2012 Wed Jun 20 2012 tiff 3.9.4-28.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-bz2 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-ctype 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-mcrypt 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-pdo 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-iconv 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-json 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-mbstring 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-xmlwriter 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-zlib 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 krb5-devel 1.8.3-16.49.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-gd 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-hash 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 krb5-client 1.8.3-16.49.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-dom 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 clamav-db 0.97.5-10.1 Tue Jul 10 2012 Tue Jun 19 2012 apache2-mod_php5 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-tokenizer 5.3.5-341.1 Tue Jul 10 2012 Wed Jun 20 2012 libtiff-devel 3.9.4-28.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-mysql 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-sqlite 5.3.5-341.1 Tue Jul 10 2012 Tue Jun 19 2012 php5-xmlreader 5.3.5-341.1 I'm fearing that one of these loaded the module "button". Ah... pm-utils, yes! I should have known. I hacked the /usr/lib/pm-utils/sleep.d/75modules script - fortunately suspend is an area still using scripts, not like systemd ;-) - like this: suspend_modules() { [ -z "$SUSPEND_MODULES" ] && return $NA for x in $SUSPEND_MODULES ; do if ! test $x = button ; then printf "Unloading kernel module %s..." "$x" modunload $x && echo Done. || { touch $INHIBIT; echo Failed.; } fi done return 0 } so that the module is not unloaded and hibernate runs. It did run, and it crashed - but it is the usual crash. I hibernated today with the same hack and it succeded, so the hack is not involved. I will post the photo to Bugzilla, https://bugzilla.novell.com/attachment.cgi?id=498542 https://bugzilla.novell.com/attachment.cgi?id=498543 https://bugzilla.novell.com/attachment.cgi?id=498544

...

It looks like the bug is upstream because there seem to be identical symptoms described here:

https://bbs.archlinux.org/viewtopic.php?id=143461

Indeed. If as Jeff Mahoney said the culprit is a driver, and he suspected the audio usb driver, that I will try that way. I'm hibernating now this way: # echo ; date ; rcalsasound stop; pm-hibernate ; rcalsasound start ; date I got this when tested that today: +++···························· Telcontar:~ # lsmod | grep usb snd_usb_audio 116145 0 snd_pcm 102140 4 snd_pcm_oss,snd_hda_intel,snd_hda_codec,snd_usb_audio snd_hwdep 7708 2 snd_hda_codec,snd_usb_audio snd_usbmidi_lib 22933 1 snd_usb_audio snd_rawmidi 26712 1 snd_usbmidi_lib snd 82988 13 snd_pcm_oss,snd_mixer_oss,snd_seq,snd_hda_codec_ca0110,snd_hda_intel,snd_hda_codec,snd_usb_audio,snd_pcm,snd_timer,snd_hwdep,snd_usbmidi_lib,snd_rawmidi,snd_seq_device Telcontar:~ # rcalsasound stop Shutting down sound driver done Telcontar:~ # lsmod | grep usb Telcontar:~ # echo ; date ; rcalsasound stop ; pm-hibernate ; rcalsasound start ; date Fri Jul 13 11:57:37 CEST 2012 resuming... Starting sound driver: hda-intel done Fri Jul 13 11:58:41 CEST 2012 Telcontar:~ # lsmod | grep usb Telcontar:~ # ····························++- Notice that the usb drivers are not reloaded. - -- Cheers, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) iEYEARECAAYFAlAAC9EACgkQtTMYHG2NR9UX+wCeIVHqTaGR/6a7FNT+6WSiErKR xFwAn0DeDP3cRi1O8GW3opgV+DF9uY/X =L/Jf -----END PGP SIGNATURE-----

Carlos E. R.

14 Jul 14 Jul

22:41

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Friday, 2012-07-13 at 13:51 +0200, Carlos E. R. wrote: ...

...

If as Jeff Mahoney said the culprit is a driver, and he suspected the audio usb driver, that I will try that way. I'm hibernating now this way:

# echo ; date ; rcalsasound stop; pm-hibernate ; rcalsasound start ; date

It crashed at the end, same point. I'll try also stopping network first. This is a nightmare, I'm not getting anywhere :-/ - -- Cheers, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) iEYEARECAAYFAlAB9YsACgkQtTMYHG2NR9XwpACcDbZx6w3+ulq9XjK0u/DlSCIg NJsAnRGIZiP18khfAWrDpIP5ipU62bs1 =mlYl -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Dave Howorth

17 Jul 17 Jul

10:36

Carlos E. R. wrote:

...

...
If as Jeff Mahoney said the culprit is a driver, and he suspected the audio usb driver, that I will try that way. I'm hibernating now this way:

# echo ; date ; rcalsasound stop; pm-hibernate ; rcalsasound start ; date

It crashed at the end, same point.

I'll try also stopping network first. This is a nightmare, I'm not getting anywhere :-/

I read your conversation on opensuse-kernel. Depressing! :( At least somebody else is interested, and you've eliminated one possible source of the problem. I can't suggest much to help apart from the obvious ways to localise the problem: (1) see which other modules you can remove, either permanently or temporarily (an expert guesser like Jeff would be useful :) (2) keep adding more printk's to find the place where it's actually crashing. In some ways it's good that all the printks haven't changed the behaviour. It makes it seem more likely the problem is some data that has been corrupted whilst you're working, rather than something that is caused specifically when you hibernate. I wonder if there are any tools that can audit kernel data structures, or compile-time options to add more asserts() or other debugging options? -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

12:55

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-07-17 12:36, Dave Howorth wrote:

...

Carlos E. R. wrote:

...
...
If as Jeff Mahoney said the culprit is a driver, and he suspected the

...

I read your conversation on opensuse-kernel. Depressing! :(

Yes... I thought we had finally found something, but no, we are at the same point.

...

At least somebody else is interested, and you've eliminated one possible source of the problem. I can't suggest much to help apart from the obvious ways to localise the problem:

(1) see which other modules you can remove, either permanently or temporarily (an expert guesser like Jeff would be useful :)

I'm out of ideas myself.

...

(2) keep adding more printk's to find the place where it's actually crashing.

At this point I'm running the standard kernel, not my hacked one, as I was getting nowhere, and the path of removing modules is different and did not need those messages.

...

In some ways it's good that all the printks haven't changed the behaviour. It makes it seem more likely the problem is some data that has been corrupted whilst you're working, rather than something that is caused specifically when you hibernate. I wonder if there are any tools that can audit kernel data structures, or compile-time options to add more asserts() or other debugging options?

If I knew what program triggered the bad behaviour... What I use normally is xterms, thunderbird, firefox. I normally use the nvidia driver, but it crashes equally with nouveau. I also normally use vmplayer, but it crashes equally without: the two weeks or so that I used my modified kernel I was using noveau and refrained from starting vmplayer (neither the service nor the images). Libreoffice Y normally used, but also it crashes without using it. Add asserts to the kernel, yes, but I don't know what to check. Maybe I have to use an experimental kernel from the 3.x series instead, if that is possible with 11.4; I know the evergreen people wanted to do that. - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAlAFYLIACgkQIvFNjefEBxqDOwCgi2GJQVTLItiq1jQ1yBNJO0u2 DN0An1IzTitlkTAWf4pNU9VoZkZThYld =XZV1 -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

30 Jun 30 Jun

21:20

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Friday, 2012-06-29 at 15:53 +0100, Dave Howorth wrote:

...

OK, but as long as you use the nvidia driver, you will have to isolate exactly what is failing in the kernel and provide a method to reproduce the problem on hardware that can run an untainted kernel. Then the devs will look at it.

At the moment, my system is using the nouveau driver, and it is not my doing. It hapened after I compiled my own kernel, no idea why. Telcontar:~ # lsmod | grep "nou\|nv" nouveau 598669 2 ttm 72646 1 nouveau drm_kms_helper 35889 1 nouveau drm 230064 4 nouveau,ttm,drm_kms_helper i2c_algo_bit 6016 1 nouveau video 15451 1 nouveau button 6754 1 nouveau Telcontar:~ # And I got a crash on restore, instead than on hibernation. - -- Cheers, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) iEYEARECAAYFAk/vbY8ACgkQtTMYHG2NR9WxpgCgkaa92waENiOcX6C36kRD6kjf 1iwAn1z/Fj2aREhIHi57Z6/+ucijTjve =CIIK -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

1 Jul 1 Jul

00:15

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-06-30 23:20, Carlos E. R. wrote:

...

On Friday, 2012-06-29 at 15:53 +0100, Dave Howorth wrote:

...
OK, but as long as you use the nvidia driver, you will have to isolate exactly what is failing in the kernel and provide a method to reproduce the problem on hardware that can run an untainted kernel. Then the devs will look at it.

At the moment, my system is using the nouveau driver, and it is not my doing. It hapened after I compiled my own kernel, no idea why.

New problem: Thunderbird and firefox now each use 70% CPU! Previously they kept under 5%. - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/vlrEACgkQIvFNjefEBxoFGACeP1HRS44pWei4Hc4VmsPApms2 G0IAoIrnHbi3J+OOjBfcjWaMPP1T6wBH =PupO -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

00:29

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-07-01 02:15, Carlos E. R. wrote:

...

On 2012-06-30 23:20, Carlos E. R. wrote:

...

New problem:

Thunderbird and firefox now each use 70% CPU! Previously they kept under 5%.

If I don't use it for a while, it goes back to 1%, but as soon as I click somewhere, it goes to 70%. I don't understand it at all. - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/vmdoACgkQIvFNjefEBxpEEgCgweFd6KK5SiGSvuNed33DBzoh zgYAn3LkWzszZ/pvk3VPyBWtnBz3z80M =3WoM -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

00:58

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-07-01 02:29, Carlos E. R. wrote:

...

On 2012-07-01 02:15, Carlos E. R. wrote:

...
On 2012-06-30 23:20, Carlos E. R. wrote:

...
New problem:

...
Thunderbird and firefox now each use 70% CPU! Previously they kept under 5%.

If I don't use it for a while, it goes back to 1%, but as soon as I click somewhere, it goes to 70%. I don't understand it at all.

And now I hibernated four times in succession, and both firefox and thunderbird remains at 1% cpu or less even when typing. Something weird with this kernel... - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/voMUACgkQIvFNjefEBxoHAwCfYMa3zRiB83cHdA9t1ZYQ067g gXwAoIfk1D0l7ml4Y/IAlpxrkYXpff4c =Me10 -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

01:37

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-07-01 02:58, Carlos E. R. wrote:

...

On 2012-07-01 02:29, Carlos E. R. wrote:

...
On 2012-07-01 02:15, Carlos E. R. wrote:

...
On 2012-06-30 23:20, Carlos E. R. wrote:

...
...
New problem:

...
...
Thunderbird and firefox now each use 70% CPU! Previously they kept under 5%.

...
If I don't use it for a while, it goes back to 1%, but as soon as I click somewhere, it goes to 70%. I don't understand it at all.

And now I hibernated four times in succession, and both firefox and thunderbird remains at 1% cpu or less even when typing. Something weird with this kernel...

There are weird reports related to the leap second correction at 00:UTC and servers getting very hot at that hour, corrected by rebooting. Is it possible? Then it was not related to my compilation of the kernel. One thing less to worry about... - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/vqcUACgkQIvFNjefEBxrWggCfXA4VixJgm9X49s0tYp47myJv I3QAoJagI35soppop5X+o+DfnCvvItC+ =tvY6 -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Per Jessen

29 Jun 29 Jun

19:26

Carlos E. R. wrote:

...

...
(a) copy your kernel output either to your serial port (to a terminal or file on another system) or over the network to another machine's kernel log - you may see some messages which don't make it to the machine's own screen

No possible, I already asked in the bugzilla if this could be done. s2disk doesn't output to serial port. During this phase there is only a text console working, messages are printed there, and when the problem happens nothing prints any message there. There is nothing nowhere.

Running a serial console might help you capture some output.

...

I have the knowledge to hack at my own s2disk, but I failed to create rpms for it, thus nobody else can try my hacks unless a dev creates them.

Well, everyone else could just use your hacks to begin with. -- Per Jessen, Zürich (26.9°C) -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

20:01

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-06-29 21:26, Per Jessen wrote:

...

Carlos E. R. wrote:

...

Running a serial console might help you capture some output.

The serial port does not capture more text than goes to the terminal, I have used it. It is of use to see that output when you are using a graphical session, or to capture on file what can not be saved on the local disk because it crashes. I have used it, I have the hardware. And the terminal sees no output, the serial port is useless - or they would have told me when I asked.

...

...
I have the knowledge to hack at my own s2disk, but I failed to create rpms for it, thus nobody else can try my hacks unless a dev creates them.

Well, everyone else could just use your hacks to begin with.

True, but I can not create rpms for it. And no dev has said anything on the bugzilla. Some looked, but no one comments on it. Why? - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/uCbAACgkQIvFNjefEBxre1wCffCzi38TPVemEGRk5yxcWXtdr 2hgAoMkSb0Fq9I6nI0mjE8MlqDsXGIOO =FdZn -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Per Jessen

30 Jun 30 Jun

07:23

Carlos E. R. wrote:

...

...
...
I have the knowledge to hack at my own s2disk, but I failed to create rpms for it, thus nobody else can try my hacks unless a dev creates them.

Well, everyone else could just use your hacks to begin with.

True, but I can not create rpms for it. And no dev has said anything on the bugzilla. Some looked, but no one comments on it. Why?

The usual - too busy or not interested. Lots of reports lie around for months. -- Per Jessen, Zürich (23.3°C) -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

David Haller

29 Jun 29 Jun

19:41

Hello, On Fri, 29 Jun 2012, Carlos E. R. wrote:

...

Linux Telcontar 2.6.37.6-0.20-desktop #1 SMP PREEMPT 2011-12-19 23:39:38 +0100 x86_64 x86_64 x86_64 GNU/Linux

Have you tried a -default kernel, esp. one not PREEMPT? -dnh -- How do I type "for i in *.dvi; do xdvi $i; done" in a GUI? -- Discussion in comp.os.linux.misc on the intuitiveness of interfaces. -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

23:06

On 2012-06-29 21:41, David Haller wrote:

...

Hello,

On Fri, 29 Jun 2012, Carlos E. R. wrote:

...
Linux Telcontar 2.6.37.6-0.20-desktop #1 SMP PREEMPT 2011-12-19 23:39:38 +0100 x86_64 x86_64 x86_64 GNU/Linux

Have you tried a -default kernel, esp. one not PREEMPT?

No, I haven't. Why, do you you think that would be different? :-? -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar)

David Haller

30 Jun 30 Jun

02:35

Hello, On Sat, 30 Jun 2012, Carlos E. R. wrote:

...

On 2012-06-29 21:41, David Haller wrote:

...
On Fri, 29 Jun 2012, Carlos E. R. wrote:

...
Linux Telcontar 2.6.37.6-0.20-desktop #1 SMP PREEMPT 2011-12-19 23:39:38 +0100 x86_64 x86_64 x86_64 GNU/Linux

Have you tried a -default kernel, esp. one not PREEMPT?

No, I haven't.

Why, do you you think that would be different? :-?

ISTR, that I read something somewhere to the effect with PREEMPT kernel. It just "rang a bell" when I read your mail... And testing the -default kernel should not be much trouble, esp. if you have 'multiversion' enabled in /etc/zypp/zypp.conf. -dnh -- Oh, I am a C programmer and I'm okay I muck with indices and structs all day And when it works, I shout hoo-ray Oh, I am a C programmer and I'm okay [BSD fortune file] -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

Carlos E. R.

11:00

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 2012-06-30 04:35, David Haller wrote:

...

Hello,

...

...
...
Have you tried a -default kernel, esp. one not PREEMPT?

No, I haven't.

Why, do you you think that would be different? :-?

ISTR, that I read something somewhere to the effect with PREEMPT kernel. It just "rang a bell" when I read your mail...

And testing the -default kernel should not be much trouble, esp. if you have 'multiversion' enabled in /etc/zypp/zypp.conf.

Right, but the thing is, if it doesn't crash immediately I may have to wait two weeks till I can confirm it works. Or, I could use my kernel with messages, but that setting changed. - -- Cheers / Saludos, Carlos E. R. (from 11.4 x86_64 "Celadon" at Telcontar) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk/u3GoACgkQIvFNjefEBxqC6ACfQ1K89aoIWYt7iP07h/nN1jqC xiwAnR3e2jxuLNYFy1GhZ3PpN4rC+KY9 =yuAM -----END PGP SIGNATURE----- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org

4535

Age (days ago)

4554

Last active (days ago)

List overview

Download

50 comments

9 participants

participants (9)

Carlos E. R.
Carlos E. R.
Carlos E. R.
Dave Howorth
David Haller
Malcolm
Patrick Shanahan
Per Jessen
Robert Rawlinson