[opensuse] Xserver crashed unexpectedly
Can anyone help me understand what this crash was exactly and what might have caused it? On the line timed 18:17:36 you can see that the X server terminated under KDM. The screen went blank, X restarted, and I was back at the login screen. I logged in and it was business as usual.

I've used this card for a year, and looking back in the system logs shows these temps as far back as Jan 2008. Am I reading those temps correctly? What are the 190, 194, 195 etc. before the Celsius reading? I don't think the card overheated. The slow-down threshold on my video card is factory set at 105 degrees C. I just checked the fan on the video card and it is purring along as usual. I cleaned all the dust bunnies out of the computer a month or so back.

The video card is an Nvidia 8600GT. The operating system is openSUSE 10.3 Linux-x86_64 and the Nvidia driver version is 169.09. What else is needed to help with an answer?

Thanks for any and all help on this.

Dec 3 15:09:44 linux-t9t6 kernel: sd 10:0:0:3: [sdf] Attached SCSI removable disk
Dec 3 15:09:44 linux-t9t6 kernel: sd 10:0:0:3: Attached scsi generic sg6 type 0
Dec 3 15:09:44 linux-t9t6 kernel: usb-storage: device scan complete
Dec 3 15:09:45 linux-t9t6 hald: mounted /dev/sdc1 on behalf of uid 1000
Dec 3 15:31:15 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage Attribute: 190 Temperature_Celsius changed from 63 to 61
Dec 3 15:31:15 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage Attribute: 194 Temperature_Celsius changed from 37 to 39
Dec 3 15:31:41 linux-t9t6 syslog-ng[2326]: STATS: dropped 0
Dec 3 15:36:11 linux-t9t6 hald: unmounted /dev/sdc1 from '/media/NIKON D300' on behalf of uid 1000
Dec 3 16:01:14 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage Attribute: 190 Temperature_Celsius changed from 61 to 62
Dec 3 16:01:14 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage Attribute: 194 Temperature_Celsius changed from 39 to 38
Dec 3 16:07:16 linux-t9t6 kernel: sr0: CDROM not ready. Make sure there is a disc in the drive.
Dec 3 16:07:16 linux-t9t6 kernel: sr0: CDROM not ready. Make sure there is a disc in the drive.
Dec 3 16:31:14 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage Attribute: 190 Temperature_Celsius changed from 62 to 59
Dec 3 16:31:14 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage Attribute: 194 Temperature_Celsius changed from 38 to 41
Dec 3 16:31:41 linux-t9t6 syslog-ng[2326]: STATS: dropped 0
Dec 3 17:31:41 linux-t9t6 syslog-ng[2326]: STATS: dropped 0
Dec 3 18:00:51 linux-t9t6 kernel: sr0: CDROM not ready. Make sure there is a disc in the drive.
Dec 3 18:00:51 linux-t9t6 kernel: sr0: CDROM not ready. Make sure there is a disc in the drive.
Dec 3 18:17:22 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 00005097 000015e0 00000000 00000100
Dec 3 18:17:22 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 00005097 000015e0 00000000 00000000
Dec 3 18:17:26 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 0000502d 00000214 00008000 00000100
Dec 3 18:17:26 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 0000502d 00000214 00008000 00000000
Dec 3 18:17:30 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 0000502d 00000214 00008000 00000100
Dec 3 18:17:30 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 0000502d 00000214 00008000 00000000
Dec 3 18:17:34 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 0000502d 00000214 00008000 00000100
Dec 3 18:17:34 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 0000502d 00000214 00008000 00000000
Dec 3 18:17:36 linux-t9t6 kdm[2517]: X server for display :0 terminated unexpectedly
Dec 3 18:31:14 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage Attribute: 190 Temperature_Celsius changed from 59 to 61
Dec 3 18:31:14 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage Attribute: 194 Temperature_Celsius changed from 41 to 39
Dec 3 18:31:41 linux-t9t6 syslog-ng[2326]: STATS: dropped 0
Dec 3 19:01:14 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 77 to 76
Dec 3 19:31:14 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage Attribute: 190 Temperature_Celsius changed from 61 to 62
Dec 3 19:31:14 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage Attribute: 194 Temperature_Celsius changed from 39 to 38
Dec 3 19:31:41 linux-t9t6 syslog-ng[2326]: STATS: dropped 0
Dec 3 19:57:23 linux-t9t6 sudo: leeross : TTY=pts/4 ; PWD=/home/leeross ; USER=root ; COMMAND=/opt/kde3/bin/kdesu_stub -
Dec 3 19:57:23 linux-t9t6 sudo: leeross : TTY=pts/4 ; PWD=/home/leeross ; USER=root ; COMMAND=/opt/kde3/bin/kdesu_stub -
Dec 3 20:14:40 linux-t9t6 kernel: sd 10:0:0:0: [sdc] 4001760 512-byte hardware sectors (2049 MB)
Dec 3 20:14:40 linux-t9t6 kernel: sd 10:0:0:0: [sdc] Write Protect is off
Dec 3 20:14:40 linux-t9t6 kernel: sd 10:0:0:0: [sdc] Mode Sense: 02 00 00 00
Dec 3 20:14:40 linux-t9t6 kernel: sd 10:0:0:0: [sdc] Assuming drive cache: write through
Dec 3 20:14:40 linux-t9t6 kernel: sd 10:0:0:0: [sdc] 4001760 512-byte hardware sectors (2049 MB)
Dec 3 20:14:40 linux-t9t6 kernel: sd 10:0:0:0: [sdc] Write Protect is off
Dec 3 20:14:40 linux-t9t6 kernel: sd 10:0:0:0: [sdc] Mode Sense: 02 00 00 00
Dec 3 20:14:40 linux-t9t6 kernel: sd 10:0:0:0: [sdc] Assuming drive cache: write through
Dec 3 20:14:40 linux-t9t6 kernel: sdc: sdc1
Dec 3 20:14:41 linux-t9t6 hald: mounted /dev/sdc1 on behalf of uid 1000
Dec 3 20:15:20 linux-t9t6 hald: unmounted /dev/sdc1 from '/media/NIKON D300' on behalf of uid 1000
Dec 3 20:31:41 linux-t9t6 syslog-ng[2326]: STATS: dropped 0

--
Lee Ross
Anchorage, Ak

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Thursday, 2008-12-04 at 00:04 -0900, Lee Ross wrote:
I logged in and it was business as usual. I've used this card for a year and looking back in the system logs shows these temps way back in Jan 2008. Am I reading those temps correctly? What are the 190, 194, 195 etc. before the Celsius reading? I don't think the card overheated.
You mean:
Dec 3 15:31:15 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage Attribute: 190 Temperature_Celsius changed from 63 to 61
Dec 3 15:31:15 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage Attribute: 194 Temperature_Celsius changed from 37 to 39
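That's the hard disk SMART monitor (smartd) reporting on /dev/sda, not the video card. The 190, 194 and 195 are SMART attribute ID numbers. Look up "man smartctl". These other messages might be related to video: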
Dec 3 18:17:22 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 00005097 000015e0 00000000 00000100
Dec 3 18:17:22 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 00005097 000015e0 00000000 00000000
...
Dec 3 18:17:36 linux-t9t6 kdm[2517]: X server for display :0 terminated unexpectedly
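For anyone wanting to pull the smartd readings quoted earlier out of the log, here is a minimal Python sketch. It shows that the number right after "SMART Usage Attribute:" is the attribute ID and the last two numbers are the old and new values. The function name `parse_smartd` is made up for illustration. The "100 minus temperature" reading of attribute 190 is an assumption about this particular drive, not something smartd states; the values in the log happen to be consistent with it (63 + 37 = 100, 61 + 39 = 100).

```python
import re

# Matches the "changed from X to Y" reports that smartd writes to syslog.
SMARTD_RE = re.compile(
    r"SMART Usage Attribute: (?P<id>\d+) (?P<name>\S+) "
    r"changed from (?P<old>\d+) to (?P<new>\d+)"
)


def parse_smartd(line):
    """Return (attribute_id, name, old, new), or None for any other log line."""
    m = SMARTD_RE.search(line)
    if m is None:
        return None
    return int(m["id"]), m["name"], int(m["old"]), int(m["new"])


# Two lines copied from the log in this thread.
log = [
    "Dec 3 15:31:15 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage "
    "Attribute: 190 Temperature_Celsius changed from 63 to 61",
    "Dec 3 15:31:15 linux-t9t6 smartd[4180]: Device: /dev/sda, SMART Usage "
    "Attribute: 194 Temperature_Celsius changed from 37 to 39",
]

readings = {}
for line in log:
    parsed = parse_smartd(line)
    if parsed:
        attr_id, name, old, new = parsed
        readings[attr_id] = (old, new)
        print(f"attribute {attr_id} ({name}): {old} -> {new}")

# ASSUMPTION: this drive stores attribute 190 as 100 minus the temperature.
implied = tuple(100 - value for value in readings[190])
print("temperature implied by attribute 190:", implied)
print("attribute 194 as logged:             ", readings[194])
```

If the assumption holds, both attributes describe the same hard disk temperature in the high 30s C, and neither says anything about the video card.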
- -- Cheers, Carlos E. R. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (GNU/Linux) iEYEARECAAYFAkk3wHAACgkQtTMYHG2NR9XAsACfU4+ISVAZmRd9nru26f/pRSmo qUsAn1LAF74DNhT9ktZphpxHKvFROwY1 =6JKp -----END PGP SIGNATURE-----
On Thursday 04 December 2008, Lee Ross wrote:
Dec 3 18:17:22 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 00005097 000015e0 00000000 00000100
Dec 3 18:17:22 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 00005097 000015e0 00000000 00000000
Dec 3 18:17:26 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 0000502d 00000214 00008000 00000100
Dec 3 18:17:26 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 0000502d 00000214 00008000 00000000
Dec 3 18:17:30 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 0000502d 00000214 00008000 00000100
Dec 3 18:17:30 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 0000502d 00000214 00008000 00000000
Dec 3 18:17:34 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 0000502d 00000214 00008000 00000100
Dec 3 18:17:34 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, 0001 00000000 0000502d 00000214 00008000 00000000
I started getting messages like this on my previous machine. My machine started freezing, forcing me to use the reset button. Also got some kernel panics. I looked into it pretty deeply and I believe the cause to be that my video card developed some bad memory. After about 5 years of 24/7 service, that should not be surprising, I suppose.

You might want to keep an eye on your video card. If this sort of thing keeps happening, it may be time to replace it.

Joop
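One way to keep an eye on the card, as suggested above, is to count how often the NVRM Xid messages turn up in the system log. The following is only a rough Python sketch: the function name `count_xids` is hypothetical, and the regular expression is based solely on the line format seen in this thread. As far as I know, NVIDIA's Xid documentation lists code 13 as a graphics engine exception, which can come from the driver, an application, or the hardware, so a count by itself does not prove that the card is failing.

```python
import re
from collections import Counter

# Matches kernel lines such as "NVRM: Xid (0005:00): 13, 0001 ...".
XID_RE = re.compile(r"NVRM: Xid \((?P<dev>[^)]+)\): (?P<code>\d+),")


def count_xids(lines):
    """Count Xid reports per (device, Xid code) pair."""
    counts = Counter()
    for line in lines:
        m = XID_RE.search(line)
        if m:
            counts[(m["dev"], int(m["code"]))] += 1
    return counts


# Sample lines copied from the log in this thread.
sample = [
    "Dec 3 18:17:22 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, "
    "0001 00000000 00005097 000015e0 00000000 00000100",
    "Dec 3 18:17:26 linux-t9t6 kernel: NVRM: Xid (0005:00): 13, "
    "0001 00000000 0000502d 00000214 00008000 00000100",
    "Dec 3 18:17:36 linux-t9t6 kdm[2517]: X server for display :0 "
    "terminated unexpectedly",
]

counts = count_xids(sample)
for (dev, code), n in sorted(counts.items()):
    print(f"device {dev}: Xid {code} reported {n} time(s)")
```

To run it against a real log, pass an open file object instead of `sample`, for example `count_xids(open("/var/log/messages"))`. That path is an assumption about where syslog-ng writes on this system, and reading it usually needs root.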
participants (3)
- Carlos E. R.
- Joop Beris
- Lee Ross