[opensuse] Kernel crash on multiple file write on reiserfs GPT partition.
Hi,

I am hitting a kernel crash and BUG when doing repeated file writes on a GPT partition, made as reiserfs. The machine locks up completely; even the keyboard LEDs do not change state.

The code is this (trimmed for clarity):

    for Z in `seq 1 3000`; do
        dd if=/dev/zero of=/mnt/test/fichero_$Z count=1 bs=1M conv=fdatasync >> logfile 2>&1
    done
    ls -lh /mnt/test/* > /dev/null 2>&1
    rm /mnt/test/fichero* 2>&1 | tee -a some_log

What I see on screen is that it apparently writes the 3000 files, deletes them (all the operations are timed), and then this, hand-copied from a photo of the screen:

*************************************
[62148.7840471] BUG: unable to handle kernel paging request at ffffc90019d54250
[62148.7840454] IP: [<ffffffff8105e7a9>] get_next_timer_interrupt+0xa9/0x270
[62148.7840456] PGD 23f027067 PUD 23f028067 PMD 19b255067 PTE 0
[62148.7840457] Oops: 0000 [#1] PREEMPT SMP
*************************************

I will post the photo later (it is not fully clear which digits are '8' and which are '0').

The crash does not happen if I remove the "conv=fdatasync" from the 'dd' line.
What I see on the "logfile" is (last lines):

+++,,,,,,,,,,,,,,,,,,,,,,,,,,
1048576 bytes (1.0 MB) copied, 0.040832 s, 25.7 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0407661 s, 25.7 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0575303 s, 18.2 MB/s
........................................................................
........................................................................
,,,,,,,,,,,,,,,,,,,,,,,,,,++-

What I see on the "some_log" is (last lines):

+++,,,,,,,,,,,,,,,,,,,,,,,,,,
---------------------------------------------------------------
**** Prueba escritura miles ficheros pequeños en la particion 16
**** Inicio a 2014-05-23 05:50:33.701240101+02:00
==== Escritos 3000 ficheros en 141 segundos (Part 16)
Listando particion 16
==== Listados 3000 ficheros en 0 segundos (Part 16)
Borrando 3000 ficheros
==== Borrados 3000 ficheros en 2 segundos (Part 16)
**** Fin a 2014-05-23 05:52:56.309504061+02:00
---------------------------------------------------------------
**** Prueba escritura miles ficheros pequeños en la particion 17
**** Inicio a 2014-05-23 05:52:56.404005507+02:00
,,,,,,,,,,,,,,,,,,,,,,,,,,++-

It appears that the last entries on the logs get corrupted, and do not match completely what gets displayed on the screen.

What I see on /var/log/messages is:

+++,,,,,,,,,,,,,,,,,,,,,,,,,,
<0.5> 2014-05-23 05:50:33 Telcontar kernel - - - [61860.598210] REISERFS (device sde16): found reiserfs format "3.6" with standard journal
<0.5> 2014-05-23 05:50:33 Telcontar kernel - - - [61860.598218] REISERFS (device sde16): using ordered data mode
<0.4> 2014-05-23 05:50:33 Telcontar kernel - - - [61860.598220] reiserfs: using flush barriers
<0.5> 2014-05-23 05:50:33 Telcontar kernel - - - [61860.622403] REISERFS (device sde16): journal params: device sde16, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
<0.5> 2014-05-23 05:50:33 Telcontar kernel - - - [61860.622712] REISERFS (device sde16): checking transaction log (sde16)
<0.5> 2014-05-23 05:50:33 Telcontar kernel - - - [61860.643359] REISERFS (device sde16): Using r5 hash to sort names
<0.5> 2014-05-23 05:52:56 Telcontar kernel - - - [62003.293181] REISERFS (device sde17): found reiserfs format "3.6" with standard journal
<0.5> 2014-05-23 05:52:56 Telcontar kernel - - - [62003.293189] REISERFS (device sde17): using ordered data mode
<0.4> 2014-05-23 05:52:56 Telcontar kernel - - - [62003.293191] reiserfs: using flush barriers
<0.5> 2014-05-23 05:52:56 Telcontar kernel - - - [62003.315677] REISERFS (device sde17): journal params: device sde17, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
<0.5> 2014-05-23 05:52:56 Telcontar kernel - - - [62003.315994] REISERFS (device sde17): checking transaction log (sde17)
<0.5> 2014-05-23 05:52:56 Telcontar kernel - - - [62003.339970] REISERFS (device sde17): Using r5 hash to sort names
<3.6> 2014-05-23 05:53:01 Telcontar systemd 1 - - Starting Session 471 of user news.
......................................................................................2014-05-23 10:00:43+02:00 - Booting the system now ================================================================================
Linux Telcontar 3.11.10-11-desktop #1 SMP PREEMPT Mon May 12 13:37:06 UTC 2014 (3d22b5f) x86_64 x86_64 x86_64 GNU/Linux
<0.6> 2014-05-23 10:00:45 Telcontar kernel - - - [ 0.000000] Initializing cgroup subsys cpuset
,,,,,,,,,,,,,,,,,,,,,,,,,,++-

The detail about "Starting Session 471 of user news" may be important. It is a cronjob that starts leafnode nntp fetch, and the partition dedicated to news storage is a reiserfs one. I had the crash happen several times precisely at that point, but not all, apparently.

There is also some corruption in the log file of the fetchnews run, although the job itself succeeds:

+++,,,,,,,,,,,,,,,,,,,,,,,,,,
================> 2014-05-23 05:53:01.738753848+02:00 Start fetchnews session
WARNING: Make sure that syslog.conf captures news.debug logging
-------- and obtain your debug output from syslog.
WARNING: The screen output below is not sufficient. Check syslog!
leafnode 1.11.10: verbosity level is 1, debugmode is 1
try_lock(timeout=5), fqdn="Telcontar.valinor"
nntp.opensuse.org: connecting to port nntp...
nntp.opensuse.org: connected to 130.57.2.16:119, reply: 200
nntp.opensuse.org: connected.
nntp.opensuse.org: using STAT <message-ID> command.
nntp.opensuse.org: 0 articles posted.
nntp.opensuse.org: getting new newsgroups
nntp.opensuse.org: got 0 new newsgroups.
nntp.opensuse.org: conversation completed, disconnected.
nntp.novell.com: connecting to port nntp...
nntp.novell.com: connected to 130.57.2.15:119, reply: 200
nntp.novell.com: connected.
nntp.novell.com: using STAT <message-ID> command.
nntp.novell.com: 0 articles posted.
nntp.novell.com: getting new newsgroups
nntp.novell.com: got 0 new newsgroups.
nntp.novell.com: conversation completed, disconnected.
wrote active file with 342 lines
Started process to update overview data in the background.
Network activity has finished.
================> 2014-05-23 05:53:22.127932178+02:00 End fetchnews session
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@================> 2014-05-23 10:03:01.349875385+02:00 Start fetchnews session
,,,,,,,,,,,,,,,,,,,,,,,,,,++-

But it finished correctly before the crash, it is only the log (on an ext4 partition) that gets corrupted (the "^@^@^" chars)

+++,,,,,,,,,,,,,,,,,,,,,,,,,,
<7.7> 2014-05-23 05:53:22 Telcontar fetchnews 22699 - - <211 3064 2 3083 opensuse.org.help.virtualization
<7.6> 2014-05-23 05:53:22 Telcontar fetchnews 22699 - - opensuse.org.help.virtualization: no new articles
<7.7> 2014-05-23 05:53:22 Telcontar fetchnews 22699 - - >QUIT
<7.6> 2014-05-23 05:53:22 Telcontar fetchnews 22699 - - wrote active file with 342 lines
<7.7> 2014-05-23 05:53:22 Telcontar fetchnews 23145 - - Process forked.
<7.6> 2014-05-23 05:53:22 Telcontar fetchnews 22699 - - child has process ID 23145
<7.7> 2014-05-23 05:53:22 Telcontar fetchnews 23145 - - Process done.
<7.7> 2014-05-23 10:03:01 Telcontar fetchnews 4153 - - config: debugmode is 1
<7.7> 2014-05-23 10:03:01 Telcontar fetchnews 4153 - - config: maxage is 0
,,,,,,,,,,,,,,,,,,,,,,,,,,++-

The code I'm running is a script, that I can post later if wanted, that does a sequence of tests on a 3TB hard disk with 19 GPT partitions:

    formats all partitions
    repeat 3 times
        run hdparm -tT on all partitions
        creates and deletes 3 * 4G files on all partitions
        creates and deletes 3000 * 1M files on all partitions

This is done for xfs, ext4, btrfs, and reiserfs. It only crashes on reiserfs, randomly, on the small file test, on a different partition each time, and on any of the 3 runs.

I will now repeat the test with reiserfs only, leaving tty10 active, in the hope of capturing the complete Oops text.

--
Cheers
Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
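For anyone who wants to try reproducing this, the trimmed loop from the report can be wrapped in a small function. This is only a sketch under assumptions: the name write_small_files and its three arguments are made up for illustration, and the full script (formatting, mounting, timing) was not posted.

```shell
#!/bin/bash
# Sketch of the small-file write test from the report.
# write_small_files is a hypothetical helper name, not part of the original script.
# Usage: write_small_files DIR COUNT LOGFILE
write_small_files() {
    local dir=$1 count=$2 log=$3 z
    for z in $(seq 1 "$count"); do
        # conv=fdatasync makes dd flush the file data to disk before it exits;
        # per the report, the crash only shows up with this option present.
        dd if=/dev/zero of="$dir/fichero_$z" count=1 bs=1M conv=fdatasync >> "$log" 2>&1
    done
    # List and then delete the files, as the original loop does.
    ls -lh "$dir"/* > /dev/null 2>&1
    rm "$dir"/fichero_* 2>&1 | tee -a "$log"
}
```

Called as `write_small_files /mnt/test 3000 logfile`, it would correspond to one pass of the small-file test on an already mounted partition.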
On Fri, 23 May 2014 11:32:53 +0200 (CEST) "Carlos E. R." <carlos.e.r@opensuse.org> wrote:
Hi,
I am hitting a kernel crash and BUG when doing repeated file writes on a GPT partition, made as reiserfs.
Did you report it to bugzilla? -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org
On 2014-05-23 19:00, Andrey Borzenkov wrote:
On Fri, 23 May 2014 11:32:53 +0200 (CEST) "Carlos E. R." <carlos.e.r@opensuse.org> wrote:
Hi,
I am hitting a kernel crash and BUG when doing repeated file writes on a GPT partition, made as reiserfs.
Did you report it to bugzilla?
Not yet. First, because I was waiting for comments, and second, because I'm running the test again, with the display permanently on tty10, and

    setterm -blank 0 -store
    setterm -powerdown 0
    setterm -powersave off

to prevent it from blanking. I hope to capture a longer kernel oops message.

I did it before, but the screen blanked, and with the keyboard not responding, I could not get it back and read it. So I repeated with the above trick, went for a siesta, and... it did not crash. So I'm running it again. Sigh.

--
Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On 23/05/14 16:31, Carlos E. R. wrote:
On 2014-05-23 19:00, Andrey Borzenkov wrote:
On Fri, 23 May 2014 11:32:53 +0200 (CEST) "Carlos E. R." <carlos.e.r@opensuse.org> wrote:
Hi,
I am hitting a kernel crash and BUG when doing repeated file writes on a GPT partition, made as reiserfs.
Did you report it to bugzilla?
Not yet. First, because I was waiting for comments,
Whenever userspace triggers a kernel crash it is a bug that needs to be fixed. The only exception is when a particular feature is designed to purposely crash the kernel ;-)

--
Cristian
"I don't know the key to success, but the key to failure is trying to please everybody."
On 2014-05-23 22:38, Cristian Rodríguez wrote:
On 23/05/14 16:31, Carlos E. R. wrote:
On 2014-05-23 19:00, Andrey Borzenkov wrote:
On Fri, 23 May 2014 11:32:53 +0200 (CEST) "Carlos E. R." <> wrote:
Did you report it to bugzilla?
Not yet. First, because I was waiting for comments,
Whenever userspace triggers a kernel crash it is a bug that needs to be fixed. The only exception is when a particular feature is designed to purposely crash the kernel ;-)
I got the crash and error dump on tty10. While I was looking at it, I noticed that the mouse (gpm) still responded. Then I got a new page of more messages, and the screen locked hard. I cannot get at the first error messages :-/

If you know of a way to continuously dump a copy of kernel messages to another machine via network, that doesn't die too fast, I'm listening.

I'll make a photo of what I have and post it. It mentions a kernel panic "watchdog detected hard lockup on cpu 2", then "Shutting down cpus with NMI", and another one about smp.c and update_process_times. Then some messages, and it locks.

--
Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On Saturday 24 of May 2014 00:07:00 Carlos E. R. wrote:
If you know of a way to continuously dump a copy of kernel messages to another machine via network, that doesn't die too fast, I'm listening.
Perhaps you would give kdump a try, which allows to store the crash log and the memory image to another computer or even the same, depending on how serious the crash is.

--
Regards,
Peter
On 2014-05-24 00:18, auxsvr@gmail.com wrote:
On Saturday 24 of May 2014 00:07:00 Carlos E. R. wrote:
If you know of a way to continuously dump a copy of kernel messages to another machine via network, that doesn't die too fast, I'm listening.
Perhaps you would give kdump a try, which allows to store the crash log and the memory image to another computer or even the same, depending on how serious the crash is.
The crash is complete and final.

I need something that dumps data while the crash is happening with no user intervention whatsoever, to another computer, and as low level as possible, as I don't know what still works at that instant. Maybe nothing. Years ago I would consider serial port, but the second machine has none.

kdump has no man page, so that's a dead end.

+++······················.
KDUMP(8)          System Manager's Manual          KDUMP(8)

NAME
       kdump - This is just a placeholder until real man page has been written
······················.++-

--
Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On Saturday 24 of May 2014 00:31:08 Carlos E. R. wrote:
Perhaps you would give kdump a try, which allows to store the crash log and the memory image to another computer or even the same, depending on how serious the crash is.
The crash is complete and final.
I need something that dumps data while the crash is happening with no user intervention whatsoever, to another computer, and as low level as possible, as I don't know what still works at that instant. Maybe nothing. Years ago I would consider serial port, but the second machine has none.
Try http://doc.opensuse.org/documentation/html/openSUSE_114/opensuse-tuning/cha.....

Kdump is a mechanism to kexec a new kernel, which is stored in memory explicitly reserved for it, when the running kernel hangs. After that, it should send the logs over the network or save them to a local filesystem, but the latter is risky if the filesystem causes the crash.

Last time I used it, I recall that the default amount of memory reserved was too low for a modern initrd. If you don't set it high enough in YaST, the kdump image fails to start without displaying any error.

--
Regards,
Peter
On 2014-05-24 00:07, Carlos E. R. wrote:
I'll make a photo of what I have and post it.
<http://susepaste.org/76184129>

--
Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On 23/05/14 18:50, Carlos E. R. wrote:
On 2014-05-24 00:07, Carlos E. R. wrote:
I'll make a photo of what I have and post it.
Your kernel says:

P D O .. that means:

"P" --> proprietary module loaded, developers will most likely ignore your report if it comes in this form.

"D" --> the kernel has oopsed before, that means what you are showing in the picture is a secondary oops, not the actual problem.

"O" -> "Out of tree module" is loaded, good luck with getting that fixed.

--
Cristian
"I don't know the key to success, but the key to failure is trying to please everybody."
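The same three flags can be read from a running system without waiting for an oops. A minimal sketch, assuming the usual bit positions in /proc/sys/kernel/tainted (bit 0 for P, bit 7 for D, bit 12 for O); decode_taint is a hypothetical helper name and it deliberately ignores every other taint bit:

```shell
#!/bin/bash
# decode_taint VALUE: print the letters for the three taint flags discussed here.
# decode_taint is a hypothetical helper; all other taint bits are ignored.
decode_taint() {
    local value=$1 flags=""
    (( value & 1 ))    && flags+="P "   # bit 0: proprietary module loaded
    (( value & 128 ))  && flags+="D "   # bit 7: kernel has oopsed before
    (( value & 4096 )) && flags+="O "   # bit 12: out-of-tree module loaded
    echo "${flags% }"                   # drop the trailing space
}
```

On the affected machine this could be run as `decode_taint "$(cat /proc/sys/kernel/tainted)"`; an empty result would mean none of the three flags is set.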
On 2014-05-24 01:02, Cristian Rodríguez wrote:
On 23/05/14 18:50, Carlos E. R. wrote:
On 2014-05-24 00:07, Carlos E. R. wrote:
I'll make a photo of what I have and post it.
Your kernel says:
P D O .. that means:
"P" --> proprietary module loaded, developers will most likely ignore your report if it comes in this form.
Well, there is the nvidia driver. But I'm running in text mode, graphics were not loaded. I'll try blacklisting nvidia.
"D" --> the kernel has oopsed before, that means what you are showing in the picture is a secondary oops, not the actual problem.
It scrolled off the top of the picture. I told you I needed something to capture those messages on another computer before it dies, automatically, in about 5 seconds. One suggestion is kdump, but besides being very complicated I'm not sure it can be used for just the messages.
"O" -> "Out of tree module" is loaded, good luck with getting that fixed.
No idea what is that. The kernel is entirely from openSUSE. -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On 23/05/14 19:14, Carlos E. R. wrote:
It scrolled off the top of the picture. I told you I needed something to capture those messages on another computer before it dies, automatically, in about 5 seconds. One suggestion is kdump, but besides being very complicated I'm not sure it can be used for just the messages.
See if the information was written to the journal or not.. you could use "netconsole". or a serial port to capture the actual oops if does not show up in the logs.
"O" -> "Out of tree module" is loaded, good luck with getting that fixed.
No idea what is that. The kernel is entirely from openSUSE.
Probably the nvidia module too.

--
Cristian
"I don't know the key to success, but the key to failure is trying to please everybody."
On 2014-05-24 01:19, Cristian Rodríguez wrote:
On 23/05/14 19:14, Carlos E. R. wrote:
See if the information was written to the journal or not.. you could use "netconsole". or a serial port to capture the actual oops if does not show up in the logs.
Nothing is in the logs, the filesystem dies. Can you point to a document that explains how to use netconsole, please? There is nothing in our wiki search.
"O" -> "Out of tree module" is loaded, good luck with getting that fixed.
No idea what is that. The kernel is entirely from openSUSE.
Probably the nvidia module too.
Ok, how can I block nvidia from loading?

I renamed /usr/src/kernel-modules/nvidia-331.67-desktop/nvidia.ko to /usr/src/kernel-modules/nvidia-331.67-desktop/nvidia.ko.nousar, and in /etc/modprobe.d/50-blacklist.conf I added

    blacklist nvidia

I rebooted and it still loads, in text mode!
Telcontar:~ # locate nvidia.ko
/lib/modules/3.11.10-11-desktop/weak-updates/updates/nvidia.ko
/lib/modules/3.11.10-7-desktop/weak-updates/updates/nvidia.ko
/lib/modules/3.11.6-4-desktop/updates/nvidia.ko
/root/tmp/expand/lib/modules/3.11.10-7-desktop/weak-updates/updates/nvidia.ko
/usr/share/doc/nvidia/NVIDIA-Linux-x86-1.0-7664-pkg1/usr/src/nv/nvidia.ko
/usr/src/kernel-modules/nvidia-331.49-desktop/.nvidia.ko.cmd
/usr/src/kernel-modules/nvidia-331.49-desktop/nvidia.ko
/usr/src/kernel-modules/nvidia-331.67-desktop/.nvidia.ko.cmd
/usr/src/kernel-modules/nvidia-331.67-desktop/nvidia.ko
Telcontar:~ # l /lib/modules/3.11.10-11-desktop/weak-updates/updates/nvidia.ko
lrwxrwxrwx 1 root root 47 May 22 04:42 /lib/modules/3.11.10-11-desktop/weak-updates/updates/nvidia.ko -> /lib/modules/3.11.6-4-desktop/updates/nvidia.ko
Telcontar:~ # l /lib/modules/3.11.6-4-desktop/updates/nvidia.ko
ls: cannot access /lib/modules/3.11.6-4-desktop/updates/nvidia.ko: No such file or directory
Telcontar:~ #
--
Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On 23/05/14 19:35, Carlos E. R. wrote:
On 2014-05-24 01:19, Cristian Rodríguez wrote:
On 23/05/14 19:14, Carlos E. R. wrote:
See if the information was written to the journal or not.. you could use "netconsole". or a serial port to capture the actual oops if does not show up in the logs.
Nothing is in the logs, the filesystem dies.
Can you point to a document that explains how to use netconsole, please? There is nothing in our wiki search.
https://www.kernel.org/doc/Documentation/networking/netconsole.txt
I rebooted and it still loads, in text mode!
it is in the initrd then, mkinitrd .

--
Cristian
"I don't know the key to success, but the key to failure is trying to please everybody."
On 2014-05-24 01:46, Cristian Rodríguez wrote:
On 23/05/14 19:35, Carlos E. R. wrote:
Can you point to a document that explains how to use netconsole, please? There is nothing in our wiki search.
https://www.kernel.org/doc/Documentation/networking/netconsole.txt
Ah. Ok, it appears to be the same as in "/usr/share/doc/packages/netconsole-tools/netlogging.txt"

Ok, on the destination, I opened TCP and UDP port 6666 on the firewall, and have running:

    netcat -u -l 6666 | tee remote_log

But on the problem machine, it does not work:

Telcontar:~ # insmod netconsole netconsole=6666@192.168.1.15/
insmod: can't read 'netconsole': No such file or directory
Telcontar:~ #

I have installed "netconsole-tools-20030909-152.1.2.noarch", and it contains:

/sbin/netconsole-server
/usr/share/doc/packages/netconsole-tools
/usr/share/doc/packages/netconsole-tools/netlogging.txt

I have the correct package, and I'm using the documented syntax. What is wrong?
I rebooted and it still loads, in text mode!
it is in the initrd then, mkinitrd .
That was it! I should have thought of that. I now get:

Telcontar:~ # lsmod | grep nv
nvidiafb               49594  0
fb_ddc                 12525  1 nvidiafb
i2c_algo_bit           13413  1 nvidiafb
vgastate               16826  1 nvidiafb
Telcontar:~ #

But I see no message in the log about tainting, so it should be correct.

--
Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
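Putting the two steps from this exchange together: the blacklist entry alone was not enough because the module was also inside the initrd, so the initrd has to be rebuilt afterwards. A rough sketch follows; blacklist_module is a made-up helper name, and the file path in the usage note is the one mentioned earlier in the thread.

```shell
#!/bin/bash
# blacklist_module MODULE CONF_FILE
# Hypothetical helper: append "blacklist MODULE" to CONF_FILE unless that
# exact line is already there, so running it twice does not duplicate it.
blacklist_module() {
    local module=$1 conf=$2
    grep -qx "blacklist $module" "$conf" 2>/dev/null \
        || echo "blacklist $module" >> "$conf"
}
```

As root this would be used roughly as `blacklist_module nvidia /etc/modprobe.d/50-blacklist.conf && mkinitrd`, followed by a reboot.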
On 2014-05-24 02:14, Carlos E. R. wrote:
On 2014-05-24 01:46, Cristian Rodríguez wrote:
I have the correct package, and I'm using the documented syntax. What is wrong?
Tried a different way:

Telcontar:~ # modprobe netconsole netconsole="6666@192.168.1.15/"
FATAL: Error inserting netconsole (/lib/modules/3.11.10-11-desktop/kernel/drivers/net/netconsole.ko): Operation not permitted
Telcontar:~ #

--
Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On 2014-05-24 02:21, Carlos E. R. wrote:
On 2014-05-24 02:14, Carlos E. R. wrote:
On 2014-05-24 01:46, Cristian Rodríguez wrote:
I have the correct package, and I'm using the documented syntax. What is wrong?
Tried different way:
Telcontar:~ # modprobe netconsole netconsole="6666@192.168.1.15/"
FATAL: Error inserting netconsole (/lib/modules/3.11.10-11-desktop/kernel/drivers/net/netconsole.ko): Operation not permitted
Telcontar:~ #
And I see this in the log:
<0.6> 2014-05-24 02:21:45 Telcontar kernel - - - [ 1668.278609] netpoll: netconsole: couldn't parse config at ''!
<0.3> 2014-05-24 02:21:45 Telcontar kernel - - - [ 1668.278612] netconsole: cleaning up
<0.6> 2014-05-24 02:22:23 Telcontar kernel - - - [ 1706.737995] netpoll: netconsole: couldn't parse config at '192.168.1.15'!
<0.3> 2014-05-24 02:22:23 Telcontar kernel - - - [ 1706.737998] netconsole: cleaning up
I tried other combinations:
Telcontar:~ # modprobe netconsole netconsole="6666@192.168.1.14, 6666@192.168.1.15"
FATAL: Error inserting netconsole (/lib/modules/3.11.10-11-desktop/kernel/drivers/net/netconsole.ko): Operation not permitted
Telcontar:~ # modprobe netconsole netconsole="6666@192.168.1.14,6666@192.168.1.15"
FATAL: Error inserting netconsole (/lib/modules/3.11.10-11-desktop/kernel/drivers/net/netconsole.ko): Operation not permitted
Telcontar:~ # modprobe netconsole netconsole="6666@192.168.1.14/eth0,6666@192.168.1.15"
FATAL: Error inserting netconsole (/lib/modules/3.11.10-11-desktop/kernel/drivers/net/netconsole.ko): Operation not permitted
Telcontar:~ # modprobe netconsole netconsole="666@/eth0,6666@192.168.1.15"
FATAL: Error inserting netconsole (/lib/modules/3.11.10-11-desktop/kernel/drivers/net/netconsole.ko): Operation not permitted
Telcontar:~ # modprobe netconsole netconsole="@/,6666@192.168.1.15"
FATAL: Error inserting netconsole (/lib/modules/3.11.10-11-desktop/kernel/drivers/net/netconsole.ko): Operation not permitted
Telcontar:~ #
I don't understand :-(

--
Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On 2014-05-24 02:28, Carlos E. R. wrote:
I don't understand :-(
The documentation is obsolete. The correct syntax appears to be:

    modprobe netconsole 6666@192.168.1.14/eth0,6666@192.168.1.15

which I got from "http://www.cyberciti.biz/tips/linux-netconsole-log-management-tutorial.html", not

    modprobe netconsole netconsole="...

syslog confirms:

<3.6> 2014-05-24 02:48:01 Telcontar systemd 1 - - Starting Session 23 of user news.
<0.4> 2014-05-24 02:48:10 Telcontar kernel - - - [ 3253.473093] netconsole: unknown parameter '6666@192' ignored
<0.6> 2014-05-24 02:48:10 Telcontar kernel - - - [ 3253.479327] console [netcon0] enabled
<0.6> 2014-05-24 02:48:10 Telcontar kernel - - - [ 3253.485343] netconsole: network logging started

Maybe... But nothing appears on the remote machine. I plugged in a USB stick, I saw messages on tty10, but nothing on the remote machine. I confirmed with ethereal that nothing at all was sent to 192.168.1.15

I'm stuck. I tried:

Telcontar:~ # modprobe netconsole "@,6666@192.168.1.15"

the module is loaded, but it does not accept the parameters, so it does not send anything. See syslog:

<0.4> 2014-05-24 03:02:22 Telcontar kernel - - - [ 4105.288427] netconsole: unknown parameter '@,6666@192' ignored
<0.6> 2014-05-24 03:02:22 Telcontar kernel - - - [ 4105.295999] console [netcon0] enabled
<0.6> 2014-05-24 03:02:22 Telcontar kernel - - - [ 4105.296000] netconsole: network logging started

So.... How exactly do I do it? Someone has done this and knows how to do it in openSUSE, no hearsay?

I have also tried, from another machine:

cer@minas-tirith:~> echo hello > netcat -u 192.168.1.15 6666
cer@minas-tirith:~>

and nothing appears on the listening netcat. So that part is also wrong!

--
Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
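A possible reading of the failures above, offered as an untested sketch rather than a confirmed fix: the format given in the kernel's netconsole.txt is netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr], so the target half also ends in a "/" even when the MAC address is left out, and the "couldn't parse config at '192.168.1.15'" message is consistent with that slash being missing. The helper name netconsole_param below is made up for illustration.

```shell
#!/bin/bash
# netconsole_param SRC_PORT SRC_IP DEV TGT_PORT TGT_IP [TGT_MAC]
# Hypothetical helper: build the module parameter in the documented form
#   netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# The "/" after the target IP is always emitted, even with an empty MAC.
netconsole_param() {
    local src_port=$1 src_ip=$2 dev=$3 tgt_port=$4 tgt_ip=$5 tgt_mac=${6:-}
    printf 'netconsole=%s@%s/%s,%s@%s/%s\n' \
        "$src_port" "$src_ip" "$dev" "$tgt_port" "$tgt_ip" "$tgt_mac"
}
```

It would be used as `modprobe netconsole "$(netconsole_param 6666 192.168.1.14 eth0 6666 192.168.1.15)"`. Separately, the netcat check from the other machine needs a pipe: `echo hello | netcat -u 192.168.1.15 6666`. With `>` the shell writes "hello" to a local file named netcat and nothing is sent.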
On 23/05/14 21:45, Carlos E. R. wrote:
So.... How exactly do I do it? Someone has done this and knows how to do it in openSUSE, no hearsay?
Try with the section "dynamic configuration" from the netconsole.txt doc.

This reminded me I should try adding netconsole support to systemd-networkd so it is easy to set up in the future..

--
Cristian
"I don't know the key to success, but the key to failure is trying to please everybody."
On 2014-05-24 04:22, Cristian Rodríguez wrote:
On 23/05/14 21:45, Carlos E. R. wrote:
So.... How exactly do I do it? Someone has done this and knows how to do it in openSUSE, no hearsay?
Try with the section "dynamic configuration" from the netconsole.txt doc.
Telcontar:~ # modprobe netconsole
Telcontar:~ # cd /sys/kernel/config/netconsole/
Telcontar:/sys/kernel/config/netconsole # ls
Telcontar:/sys/kernel/config/netconsole # mkdir target1
Telcontar:/sys/kernel/config/netconsole # ls
target1
Telcontar:/sys/kernel/config/netconsole # cd target1/
Telcontar:/sys/kernel/config/netconsole/target1 # ls
dev_name enabled local_ip local_mac local_port remote_ip remote_mac remote_port
Telcontar:/sys/kernel/config/netconsole/target1 # cat local_
local_ip local_mac local_port
Telcontar:/sys/kernel/config/netconsole/target1 # cat local_ip
0.0.0.0
Telcontar:/sys/kernel/config/netconsole/target1 # echo 192.168.1.14 > local_ip
Telcontar:/sys/kernel/config/netconsole/target1 # echo 6666 > local_port

but

Telcontar:/sys/kernel/config/netconsole/target1 # echo "00:21:85:16:2D:0B" > local_mac
-bash: local_mac: Permission denied
Telcontar:/sys/kernel/config/netconsole/target1 # cat local_mac
ff:ff:ff:ff:ff:

weird.

Telcontar:/sys/kernel/config/netconsole/target1 # echo "00:03:0D:05:17:FC" > remote_mac
Telcontar:/sys/kernel/config/netconsole/target1 # echo 6666 > remote_port
Telcontar:/sys/kernel/config/netconsole/target1 # echo 192.168.1.15 > remote_ip
Telcontar:/sys/kernel/config/netconsole/target1 # cat dev_name
eth0
Telcontar:/sys/kernel/config/netconsole/target1 # echo 1 > enabled

It is apparently started:

Telcontar:/sys/kernel/config/netconsole/target1 # tail /var/log/messages
<3.6> 2014-05-24 13:23:01 Telcontar systemd 1 - - Starting Session 78 of user news.
<3.6> 2014-05-24 13:25:01 Telcontar systemd 1 - - Starting Session 79 of user news.
<3.6> 2014-05-24 13:28:01 Telcontar systemd 1 - - Starting Session 80 of user news.
<0.6> 2014-05-24 13:28:06 Telcontar kernel - - - [10768.603827] netpoll: netconsole: local port 6666
<0.6> 2014-05-24 13:28:06 Telcontar kernel - - - [10768.609384] netpoll: netconsole: local IPv4 address 192.168.1.14
<0.6> 2014-05-24 13:28:06 Telcontar kernel - - - [10768.614762] netpoll: netconsole: interface 'eth0'
<0.6> 2014-05-24 13:28:06 Telcontar kernel - - - [10768.620095] netpoll: netconsole: remote port 6666
<0.6> 2014-05-24 13:28:06 Telcontar kernel - - - [10768.625373] netpoll: netconsole: remote IPv4 address 192.168.1.15
<0.6> 2014-05-24 13:28:06 Telcontar kernel - - - [10768.630545] netpoll: netconsole: remote ethernet address 00:03:0d:05:17:fc
<0.6> 2014-05-24 13:28:06 Telcontar kernel - - - [10768.635653] netconsole: network logging started

On the receiving computer, I have:

netcat -u -l 6666 | tee -a remote_log

I plugged a usb stick, and got the messages on the remote, so good!

Now I go for testing and crashing the machine again. Nvidia is not in the list. Wish me luck!

-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
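The manual steps in the transcript above can be collected into one function. This is only a sketch: setup_netconsole_target is a made-up name, the attribute file names are the ones listed by "ls" in the transcript, and the base directory is a parameter so the function can be tried against a scratch directory before it is pointed at configfs.

```shell
#!/bin/sh
# setup_netconsole_target is a hypothetical helper, not an existing tool.
# It mirrors the echo commands from the transcript, one file per setting.
setup_netconsole_target() {
    base=$1        # on the real system: /sys/kernel/config/netconsole
    name=$2        # target directory name, e.g. target1
    local_ip=$3
    remote_ip=$4
    remote_mac=$5
    port=$6        # used for both local_port and remote_port

    target="$base/$name"
    mkdir -p "$target" || return 1

    echo "$local_ip"   > "$target/local_ip"    || return 1
    echo "$port"       > "$target/local_port"  || return 1
    echo "$remote_ip"  > "$target/remote_ip"   || return 1
    echo "$port"       > "$target/remote_port" || return 1
    echo "$remote_mac" > "$target/remote_mac"  || return 1
    # local_mac is not written: the transcript shows that write refused.
    # Enable last, once every other setting is in place.
    echo 1 > "$target/enabled"
}

# Dry run against a scratch directory, so nothing is really configured.
demo=$(mktemp -d)
setup_netconsole_target "$demo" target1 \
    192.168.1.14 192.168.1.15 00:03:0D:05:17:FC 6666
ls "$demo/target1"
```

On the real machine the first argument would be /sys/kernel/config/netconsole, run as root after "modprobe netconsole", assuming configfs is mounted at that path as it is in the transcript.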
On 2014-05-24 13:42, Carlos E. R. wrote:
On 2014-05-24 04:22, Cristian Rodríguez wrote:
Now I go for testing and crashing the machine again. Nvidia is not in the list. Wish me luck!
Got it!

[18485.930327] REISERFS (device sda9): checking transaction log (sda9)
[18485.942553] REISERFS (device sda9): Using r5 hash to sort names
[18621.632751] REISERFS (device sda10): found reiserfs format "3.6" with standard journal
[18621.640661] REISERFS (device sda10): using ordered data mode
[18621.648604] reiserfs: using flush barriers
[18621.661794] REISERFS (device sda10): journal params: device sda10, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
[18621.662161] REISERFS (device sda10): checking transaction log (sda10)
[18621.678283] REISERFS (device sda10): Using r5 hash to sort names
[18756.821718] REISERFS (device sda11): found reiserfs format "3.6" with standard journal
[18756.829626] REISERFS (device sda11): using ordered data mode
[18756.837575] reiserfs: using flush barriers
[18756.847409] REISERFS (device sda11): journal params: device sda11, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
[18756.847737] REISERFS (device sda11): checking transaction log (sda11)
[18756.865756] REISERFS (device sda11): Using r5 hash to sort names
[18892.976044] BUG: unable to handle kernel paging request at ffffc90012825250
[18892.977001] IP: [<ffffffff8105e7a9>] get_next_timer_interrupt+0xa9/0x270
[18892.977001] PGD 23f027067 PUD 23f028067 PMD 22b52e067 PTE 0
[18892.977001] Oops: 0000 [#1] PREEMPT SMP
[18892.977001] Modules linked in: netconsole md5 nfnetlink_log nfnetlink bluetooth rfkill usb_storage configfs xt_tcpudp xt_pkttype xt_LOG xt_limit xt_owner nfsd lockd nfs_acl auth_rpcgss sunrpc oid_registry af_packet ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_ftp nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables binfmt_misc adi ns558 gameport joydev f71882fg xts gf128mul dm_crypt reiserfs raid456 async_raid6_recov async_pq async_xor async_memcpy async_tx uvcvideo videobuf2_core videodev videobuf2_vmalloc videobuf2_memops iTCO_wdt gpio_ich iTCO_vendor_support coretemp kvm_intel kvm pcspkr serio_raw sr_mod cdrom snd_hda_codec_ca0110 i2c_i801 firewire_ohci snd_hda_intel firewire_core crc_itu_t lpc_ich mfd_core snd_hda_codec snd_hwdep snd_pcm_oss acpi_cpufreq r8169 mii mperf snd_pcm snd_page_alloc sh p chp floppy button sg snd_seq snd_timer snd_seq_device snd_mixer_oss snd soundcore dm_mod autofs4 xfs btrfs raid6_pq zlib_deflate xor libcrc32c nvidiafb fb_ddc i2c_algo_bit vgastate processor thermal_sys scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_dh ata_generic ata_piix pata_jmicron [last unloaded: netconsole]
[18893.025652] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.10-11-desktop #1
[18893.025652] Hardware name: MICRO-STAR INTERNATIONAL CO.,LTD MS-7516/MS-7516, BIOS V1.5 10/10/2008
[18893.025652] task: ffff880234d58680 ti: ffff880234d5a000 task.ti: ffff880234d5a000
[18893.025652] RIP: 0010:[<ffffffff8105e7a9>] [<ffffffff8105e7a9>] get_next_timer_interrupt+0xa9/0x270
[18893.025652] RSP: 0018:ffff880234d5be30 EFLAGS: 00010006
[18893.025652] RAX: ffffc90012825238 RBX: 00000001011bb4d0 RCX: 00000000000000d1
[18893.025652] RDX: ffff880234d88e98 RSI: 00000000000000e7 RDI: 00000001011bb4d1
[18893.025652] RBP: ffff880234d5be78 R08: 0000000000000001 R09: 0000000000000000
[18893.025652] R10: ffff88023fc80000 R11: 0000000000000000 R12: 00000001011bb4d0
[18893.025652] R13: 00000001411bb4cf R14: ffff880234d88000 R15: 00000001011bb4d0
[18893.025652] FS: 0000000000000000(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000
[18893.025652] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[18893.025652] CR2: ffffc90012825250 CR3: 0000000233647000 CR4: 00000000000407e0
[18893.025652] Stack:
[18893.025652] 0000000009ffa608 0000000000000000 ffff880234d58680 ffffffff8107beb4
[18893.025652] 0000000000000001 00000001011bb4d0 ffff88023fc8ce40 0000000000000000
[18893.025652] ffff88023fc8df60 0000112edd1fd341 ffffffff810ab948 00000001011bb4d0
[18893.025652] Call Trace:
[18893.025652] [<ffffffff810ab948>] __tick_nohz_idle_enter+0x2d8/0x4a0
[18893.025652] [<ffffffff810abb44>] tick_nohz_idle_enter+0x34/0x60
[18893.025652] [<ffffffff810a16b5>] cpu_startup_entry+0x85/0x2f0
[18893.025652] [<ffffffff8102d8e8>] start_secondary+0x218/0x2c0
[18893.025652] Code: 18 49 8b 7e 10 48 39 df 49 89 df 78 ca 40 0f b6 cf 89 ce 48 63 c6 48 c1 e0 04 49 8d 14 06 48 8b 42 28 48 83 c2 28 48 39 d0 74 0e <f6> 40 18 01 74 21 48 8b 00 48 39 d0 75 f2 83 c6 01 40 0f b6 f6
[18893.025652] RIP [<ffffffff8105e7a9>] get_next_timer_interrupt+0xa9/0x270
[18893.025652] RSP <ffff880234d5be30>
[18893.025652] CR2: ffffc90012825250
[18893.025652] ---[ end trace d391c95166177944 ]---
[18893.025652] Kernel panic - not syncing: Attempted to kill the idle task!
[18893.327672] ------------[ cut here ]------------
[18893.328671] WARNING: CPU: 1 PID: 0 at /home/abuild/rpmbuild/BUILD/kernel-desktop-3.11.10/linux-3.11/arch/x86/kernel/smp.c:124 update_process_times+0x5a/0x70()
[18893.328671] Modules linked in: netconsole md5 nfnetlink_log nfnetlink bluetooth rfkill usb_storage configfs xt_tcpudp xt_pkttype xt_LOG xt_limit xt_owner nfsd lockd nfs_acl auth_rpcgss sunrpc oid_registry af_packet ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_ftp nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables binfmt_misc adi ns558 gameport joydev f71882fg xts gf128mul dm_crypt reiserfs raid456 async_raid6_recov async_pq async_xor async_memcpy async_tx uvcvideo videobuf2_core videodev videobuf2_vmalloc videobuf2_memops iTCO_wdt gpio_ich iTCO_vendor_support coretemp kvm_intel kvm pcspkr serio_raw sr_mod cdrom snd_hda_codec_ca0110 i2c_i801 firewire_ohci snd_hda_intel firewire_core crc_itu_t lpc_ich mfd_core snd_hda_codec snd_hwdep snd_pcm_oss acpi_cpufreq r8169 mii mperf snd_pcm snd_page_alloc sh p chp floppy button sg snd_seq snd_timer snd_seq_device snd_mixer_oss snd soundcore dm_mod autofs4 xfs btrfs raid6_pq zlib_deflate xor libcrc32c nvidiafb fb_ddc i2c_algo_bit vgastate processor thermal_sys scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_dh ata_generic ata_piix pata_jmicron [last unloaded: netconsole]
[18893.328671] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G D 3.11.10-11-desktop #1
[18893.328671] Hardware name: MICRO-STAR INTERNATIONAL CO.,LTD MS-7516/MS-7516, BIOS V1.5 10/10/2008
[18893.328671] 0000000000000009 ffffffff815a0252 0000000000000000 ffffffff81050eb2
[18893.328671] ffff880234d58680 0000000000000000 0000000000000001 ffff88023fc8d780
[18893.328671] ffff88023fc83f68 ffffffff8105e9ca ffff880234d5baa8 0000112ef21584c2
[18893.328671] Call Trace:
[18893.328671] [<ffffffff81004a28>] dump_trace+0x88/0x310
[18893.328671] [<ffffffff81004d80>] show_stack_log_lvl+0xd0/0x1d0
[18893.328671] [<ffffffff810061bc>] show_stack+0x1c/0x50
[18893.328671] [<ffffffff815a0252>] dump_stack+0x50/0x89
[18893.328671] [<ffffffff81050eb2>] warn_slowpath_common+0x72/0x90
[18893.328671] [<ffffffff8105e9ca>] update_process_times+0x5a/0x70
[18893.328671] [<ffffffff810ab53b>] tick_sched_handle.isra.15+0x1b/0x60
[18893.328671] [<ffffffff810ab5b7>] tick_sched_timer+0x37/0x60
[18893.328671] [<ffffffff81075174>] __run_hrtimer+0x64/0x270
[18893.328671] [<ffffffff81075a21>] hrtimer_interrupt+0xf1/0x230
[18893.328671] [<ffffffff8102f866>] smp_apic_timer_interrupt+0x36/0x50
[18893.328671] [<ffffffff815aec9d>] apic_timer_interrupt+0x6d/0x80
[18893.328671] [<ffffffff8159c41f>] panic+0x18e/0x1d2
[18893.328671] [<ffffffff81053952>] do_exit+0x9b2/0xa90
[18893.328671] [<ffffffff815a805a>] oops_end+0x9a/0xe0
[18893.328671] [<ffffffff8159bde1>] no_context+0x257/0x264
[18893.328671] [<ffffffff815aa356>] __do_page_fault+0x316/0x550
[18893.328671] [<ffffffff815a7538>] page_fault+0x28/0x30
[18893.328671] [<ffffffff8105e7a9>] get_next_timer_interrupt+0xa9/0x270
[18893.328671] [<ffffffff810ab948>] __tick_nohz_idle_enter+0x2d8/0x4a0
[18893.328671] [<ffffffff810abb44>] tick_nohz_idle_enter+0x34/0x60
[18893.328671] [<ffffffff810a16b5>] cpu_startup_entry+0x85/0x2f0
[18893.328671] [<ffffffff8102d8e8>] start_secondary+0x218/0x2c0
[18893.328671] ---[ end trace d391c95166177945 ]---

And syslog, from a remote xterm, till it crashed:

<0.4> 2014-05-24 15:38:58 Telcontar kernel - - - [18621.648604] reiserfs: using flush barriers
<0.5> 2014-05-24 15:38:58 Telcontar kernel - - - [18621.661794] REISERFS (device sda10): journal params: device sda10, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
<0.5> 2014-05-24 15:38:58 Telcontar kernel - - - [18621.662161] REISERFS (device sda10): checking transaction log (sda10)
<0.5> 2014-05-24 15:38:58 Telcontar kernel - - - [18621.678283] REISERFS (device sda10): Using r5 hash to sort names
<3.6> 2014-05-24 15:39:01 Telcontar systemd 1 - - Starting Session 168 of user news.
<9.6> 2014-05-24 15:39:01 Telcontar 2024 - - (news) CMD (/var/lib/news/bin/cronscriptparafetchnews)
<3.6> 2014-05-24 15:40:01 Telcontar systemd 1 - - Starting Session 169 of user cer.
<3.6> 2014-05-24 15:40:01 Telcontar systemd 1 - - Starting Session 170 of user news.
<9.6> 2014-05-24 15:40:01 Telcontar 3438 - - (news) CMD (/var/lib/news/bin/cronscriptparafetchnews)
<3.6> 2014-05-24 15:41:01 Telcontar systemd 1 - - Starting Session 171 of user news.
<9.6> 2014-05-24 15:41:01 Telcontar 4889 - - (news) CMD (/var/lib/news/bin/cronscriptparafetchnews)
<0.5> 2014-05-24 15:41:13 Telcontar kernel - - - [18756.821718] REISERFS (device sda11): found reiserfs format "3.6" with standard journal
<0.5> 2014-05-24 15:41:13 Telcontar kernel - - - [18756.829626] REISERFS (device sda11): using ordered data mode
<0.4> 2014-05-24 15:41:13 Telcontar kernel - - - [18756.837575] reiserfs: using flush barriers
<0.5> 2014-05-24 15:41:14 Telcontar kernel - - - [18756.847409] REISERFS (device sda11): journal params: device sda11, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
<0.5> 2014-05-24 15:41:14 Telcontar kernel - - - [18756.847737] REISERFS (device sda11): checking transaction log (sda11)
<0.5> 2014-05-24 15:41:14 Telcontar kernel - - - [18756.865756] REISERFS (device sda11): Using r5 hash to sort names
<3.6> 2014-05-24 15:42:01 Telcontar systemd 1 - - Starting Session 172 of user news.
<9.6> 2014-05-24 15:42:01 Telcontar 6296 - - (news) CMD (/var/lib/news/bin/cronscriptparafetchnews)
<3.6> 2014-05-24 15:43:01 Telcontar systemd 1 - - Starting Session 173 of user news.
<9.6> 2014-05-24 15:43:01 Telcontar 7623 - - (news) CMD (/var/lib/news/bin/cronscriptparafetchnews)

I can not correlate, yet, if the crash happened at the same time as the news cron job ran, I'll check that later, hopefully, when I reboot the machine. But that will have to wait some hours, I have to go out in minutes.

-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
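For a bug report, the headline lines of a dump like the one above are usually the first thing to quote. A minimal sketch of pulling them out of a capture file with grep and sed follows; oops_summary and faulting_function are made-up helper names, the patterns are written against the line prefixes seen in this dump only, and the sample file stands in for the remote_log capture.

```shell
#!/bin/sh
# oops_summary and faulting_function are hypothetical helper names.

# Print only the headline lines of an oops from a capture file.
oops_summary() {
    grep -E 'BUG: |IP: \[|Oops: |Kernel panic|Not tainted|Tainted: ' "$1"
}

# Print the function name from the "IP: [<addr>] func+off/len" line.
faulting_function() {
    sed -n 's/.* IP: \[<[0-9a-f]*>] \([^+]*\)+.*/\1/p' "$1"
}

# A few lines copied from the dump above, standing in for remote_log.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[18892.976044] BUG: unable to handle kernel paging request at ffffc90012825250
[18892.977001] IP: [<ffffffff8105e7a9>] get_next_timer_interrupt+0xa9/0x270
[18892.977001] PGD 23f027067 PUD 23f028067 PMD 22b52e067 PTE 0
[18892.977001] Oops: 0000 [#1] PREEMPT SMP
[18893.025652] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.10-11-desktop #1
[18893.025652] Kernel panic - not syncing: Attempted to kill the idle task!
EOF

summary=$(oops_summary "$sample")
func=$(faulting_function "$sample")
rm -f "$sample"

printf '%s\n' "$summary"
printf 'faulting function: %s\n' "$func"
```

The "Not tainted" match is included because the taint state is what the follow-up message cites when reporting the bug.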
On 2014-05-24 15:52, Carlos E. R. wrote:
On 2014-05-24 13:42, Carlos E. R. wrote:
On 2014-05-24 04:22, Cristian Rodríguez wrote:
Now I go for testing and crashing the machine again. Nvidia is not in the list. Wish me luck!
Got it!
[18892.976044] BUG: unable to handle kernel paging request at ffffc90012825250 [18892.977001] IP: [<ffffffff8105e7a9>] get_next_timer_interrupt+0xa9/0x270 [18892.977001] PGD 23f027067 PUD 23f028067 PMD 22b52e067 PTE 0 [18892.977001] Oops: 0000 [#1] PREEMPT SMP
Well, it has been reported as Bug 879778, and the kernel is not tainted.

My duty has been fulfilled, and my intention is now to format that disk for usage with XFS, I think, so I will no longer be able to run this crash test again. I'll wait some days (I'll try not to update anything), and if I hear nothing, I'll go ahead. If somebody wants me to run some test on it, speak soon :-)

(I mention this because sometimes I get the first response to a Bugzilla after 8 months of my initial report. I can not wait that long with the disk idle).

-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On Sat, 24 May 2014 00:07:00 +0200, "Carlos E. R." <carlos.e.r@opensuse.org> wrote:
On 2014-05-23 22:38, Cristian Rodríguez wrote:
On 23/05/14 16:31, Carlos E. R. wrote:
On 2014-05-23 19:00, Andrey Borzenkov wrote:
On Fri, 23 May 2014 11:32:53 +0200 (CEST), "Carlos E. R." <> wrote:
Did you report it to bugzilla?
Not yet. First, because I was waiting for comments,
Whenever userspace triggers a kernel crash it is a bug that needs to be fixed. The only exception is when a particular feature is designed to purposely crash the kernel ;-)
I got the crash and error dump on tty10. While I was looking at it, I noticed that the mouse (gpm) still responded. Then I got a new page of more messages, and the screen locked hard. I can not get at the first error messages :-/
If you know of a way to continuously dump a copy of kernel messages to another machine via network, that doesn't die too fast, I'm listening.
Did you try netconsole? You have to understand that dumping over network requires fair amount of working kernel code, so you are likely to lose latest information anyway. I think debug console over firewire is also possible.
On 2014-05-25 15:16, Andrey Borzenkov wrote:
On Sat, 24 May 2014 00:07:00 +0200, "Carlos E. R." <> wrote:
If you know of a way to continuously dump a copy of kernel messages to another machine via network, that doesn't die too fast, I'm listening.
Did you try netconsole? You have to understand that dumping over network requires fair amount of working kernel code, so you are likely to lose latest information anyway.
You have missed some messages here, because that is just what I did, with help from Cristian. The documentation about netconsole is wrong, by the way. I filed a Bugzilla report on this (on my issue, not on netconsole). I got the Oops message from the kernel, and it is not tainted.
I think debug console over firewire is also possible.
Ah, I didn't know about that, but I only have one computer with that connector (disconnected, I believe) and no interconnect cable. I don't have any gadget using that connector, so I have no use for it.

-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
participants (6)
- Andrey Borzenkov
- auxsvr@gmail.com
- Carlos E. R.
- Carlos E. R.
- Carlos E. R.
- Cristian Rodríguez