Howdy, all.
TLDR; what log files are important when troubleshooting an issue in TW.
I'm investigating an openSUSE TW KDE machine that's locking up daily while idle. Screen frozen, keyboard lights not responding, can only force off by holding power button. Even the instant reset button on the case doesn't work.
Systemd journal has seemingly nothing of significance. In a hunt to find other logs, I found in the docs a list of log files [0]. However /var/log/warn doesn't seem to exist in TW, and the others don't seem relevant, so I'm continuing a search.
If there was a similar page to the Leap logs page [0] for TW, maybe on the wiki, that'd be useful. Maybe I can make one with the results of this thread.
Thank you in advance.
[0] https://doc.opensuse.org/documentation/leap/startup/html/book-startup/cha-tr...
--
-James
James Pain composed on 2021-11-08 22:15 (UTC):
TLDR; what log files are important when troubleshooting an issue in TW.
journalctl
You may not discover anything useful if the problem is a failing power supply, motherboard, or RAM. Run memtest86 (not memtest86+) if you have DDR4 to ensure RAM is OK, at least 4 full passes (preferably 8 hours or more). If your power supply is out of warranty, open it up, make sure it's clean, and visually inspect for leaky and/or swollen electrolytic capacitors[1]. If the motherboard is older, it too may have bad caps. Most newer boards have polys instead of electrolytics for the important ones.
[1] http://www.badcaps.net/
--
Evolution as taught in public schools is, like religion, based on faith, not based on science.
Team OS/2 ** Reg. Linux User #211409 ** a11y rocks!
Felix Miata
On 08/11/2021 23.15, James Pain wrote:
Howdy, all.
TLDR; what log files are important when troubleshooting an issue in TW.
I'm investigating an openSUSE TW KDE machine that's locking up daily while idle. Screen frozen, keyboard lights not responding, can only force off by holding power button. Even the instant reset button on the case doesn't work.
Systemd journal has seemingly nothing of significance. In a hunt to find other logs, I found in the docs a list of log files [0]. However /var/log/warn doesn't seem to exist in TW, and the others don't seem relevant, so I'm continuing a search.
/var/log/warn content will be in the systemd journal. You can activate traditional syslog if you want, but the contents should be the same.
Sadly, in such a crash you most often find nothing in the logs because the machine went down before anything could be written. For instance, the disk may become inaccessible (kernel crashed), or the hard disk had no time to actually write the log.
Thus, to investigate such issues, one method is to have the kernel write directly to the serial port, which is connected to another machine. This works best with an actual hardware RS-232 serial port, not one via USB, because a true serial port is directly accessible at a very low level. Like when looking at the human mind functions which are described as the reptilian mind. Deep inside :-D
Things to investigate are video issues, heat issues, memory issues - not in any particular order.
--
Cheers / Saludos,
Carlos E. R. (from oS Leap 15.2 x86_64 (Minas Tirith))
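Since the /var/log/warn content lives in the systemd journal, one way to get a comparable view is to ask journalctl for warning-and-worse messages from the boot that ended in the lockup. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the journal is stored persistently (otherwise the previous boot is not available) and that the lockup left anything behind at all.

```python
import subprocess


def journal_command(boot_offset=-1, priority="warning"):
    """Build a journalctl command line (hypothetical helper).

    boot_offset=-1 selects the previous boot, i.e. the one that froze.
    priority="warning" shows warnings and anything more severe.
    """
    return [
        "journalctl",
        "--no-pager",
        f"--boot={boot_offset}",
        f"--priority={priority}",
        "--output=short-iso",
    ]


def read_journal(boot_offset=-1, priority="warning"):
    """Run journalctl and return its output as a list of lines."""
    result = subprocess.run(
        journal_command(boot_offset, priority),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.splitlines()
```

Printing the last few dozen lines of `read_journal()` after a forced power-off shows what the system managed to record just before it stopped responding.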
On 09.11.2021 02:31, Carlos E. R. wrote:
On 08/11/2021 23.15, James Pain wrote:
Howdy, all.
TLDR; what log files are important when troubleshooting an issue in TW.
I'm investigating an openSUSE TW KDE machine that's locking up daily while idle. Screen frozen, keyboard lights not responding, can only force off by holding power button. Even the instant reset button on the case doesn't work.
Systemd journal has seemingly nothing of significance. In a hunt to find other logs, I found in the docs a list of log files [0]. However /var/log/warn doesn't seem to exist in TW, and the others don't seem relevant, so I'm continuing a search.
/var/log/warn content will be in the systemd journal. You can activate traditional syslog if you want, but the contents should be the same.
Sadly, in such a crash you most often find nothing in the logs because the machine went down before anything could be written. For instance, the disk may become inaccessible (kernel crashed), or the hard disk had no time to actually write the log.
Thus to investigate issues one method was to have the kernel write directly to the serial port, which goes to another machine. Works best if it is an actual hardware rs232 serial port, not one via usb - because the true serial port is directly accessible at very low level. Like when looking at the human mind functions which are described as the reptilian mind. Deep inside :-D
If there is a physical Ethernet port (not wireless), netconsole may be an option.
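For the netconsole route, the kernel on the freezing machine sends its messages as UDP packets to another host, so the second machine only needs something listening on the chosen port. Below is a rough sketch of such a receiver. The function names and the log file name are illustrative assumptions, port 6666 is assumed to be the target port configured on the sending side, and the netconsole setup on the TW machine itself is not shown (see the kernel's netconsole documentation for that).

```python
import socket
import time


def open_receiver(host="0.0.0.0", port=6666):
    """Open a UDP socket to receive netconsole packets (port is an assumption)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_once(sock, log_path, timeout=None):
    """Wait for one packet, append it to log_path with a timestamp.

    Returns the logged line, or None if nothing arrived within timeout.
    """
    sock.settimeout(timeout)
    try:
        data, (sender, _port) = sock.recvfrom(65535)
    except socket.timeout:
        return None
    text = data.decode("utf-8", errors="replace").rstrip("\n")
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
    entry = f"{stamp} {sender} {text}"
    # The file lives on the receiving machine, so it survives the lockup.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(entry + "\n")
    return entry
```

On the receiving machine this would be run in a loop, for example `sock = open_receiver()` followed by repeated calls to `receive_once(sock, "netconsole.log")`. Whatever the kernel prints in the moments before the freeze then ends up on the other host.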
Things to investigate are video issues, heat issues, memory issues - not in any particular order.
On Tuesday 09 November 2021, James Pain wrote:
Howdy, all.
TLDR; what log files are important when troubleshooting an issue in TW.
I'm investigating an openSUSE TW KDE machine that's locking up daily while idle. Screen frozen, keyboard lights not responding, can only force off by holding power button. Even the instant reset button on the case doesn't work.
Systemd journal has seemingly nothing of significance. In a hunt to find other logs, I found in the docs a list of log files [0]. However /var/log/warn doesn't seem to exist in TW, and the others don't seem relevant, so I'm continuing a search.
If there was a similar page to the Leap logs page [0] for TW, maybe on the wiki, that'd be useful. Maybe I can make one with the results of this thread.
Thank you in advance.
[0] https://doc.opensuse.org/documentation/leap/startup/html/book-startup/cha-tr...
I would agree with the other replies: if the hardware is pulling the system down, the system may not get an opportunity to write to the onboard log.
However, in the event of such a crash, I would attempt to see if I could still ssh to the machine in case just X11 has hung. I would also check:
/home/foo/.local/share/sddm/xorg-session.log (replacing foo with whoever was logged in at the time)
/var/log/Xorg.0.log
/var/log/Xorg.0.log.old
If you lack a serial port, you might try a remote ssh session and leave it tailing the logs. You could also use an ssh session to run top/htop or similar to see if anything is pushing the machine's thermals while it idles. In one past non-crashing example, someone found that the KDE clock on the screen saver was burning the CPU; disabling the clock option in settings fixed that one.
Cheers,
Michael
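As an alternative to watching top/htop by hand, the thermals and load can be written to a file at intervals, so the last entry before a freeze shows how hot and how busy the machine was. This is only a sketch under some assumptions: it assumes the kernel exposes temperatures under /sys/class/thermal in millidegrees Celsius, and the function names and log format are invented for illustration.

```python
import glob
import os
import time


def read_thermal_zones(sysfs_root="/sys/class/thermal"):
    """Return {zone_name: degrees_celsius} for every readable thermal zone."""
    readings = {}
    for zone in sorted(glob.glob(os.path.join(sysfs_root, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "temp")) as f:
                # sysfs reports millidegrees Celsius
                readings[os.path.basename(zone)] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # skip zones that cannot be read or parsed
    return readings


def append_heartbeat(log_path, readings):
    """Append one timestamped line and force it to disk before returning."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
    fields = " ".join(f"{name}={temp:.1f}C" for name, temp in sorted(readings.items()))
    line = f"{stamp} load={os.getloadavg()[0]:.2f} {fields}".rstrip()
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(line + "\n")
        log.flush()
        os.fsync(log.fileno())  # do not leave the line in a write cache
    return line
```

Calling `append_heartbeat("heartbeat.log", read_thermal_zones())` every few seconds, ideally from an ssh session with the output also visible on the remote side, gives a timestamp for when the machine stopped and the temperatures just before that. The `fsync` call is there because, as noted earlier in the thread, a hard lockup can prevent buffered data from reaching the disk.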
On 2021-11-08 23:50, Michael Hamilton wrote:
However, in the event of such a crash, I would attempt to see if I could still ssh to the machine in case just X11 has hung.
+1
BTDT in ancient of days. It reassured me that the kernel and network services were running.
I net booted in text mode and ran the server in text mode. I found it ran reliably in text mode.
Well, that wolf-fenced the problem to the GUI/X11 side of things. That fix wasn't urgent, and the server was needed as a server, not as a test-bed.
Life's like that!
--
“Reality is so complex, we must move away from dogma, whether it’s conspiracy theories or free-market,” -- James Glattfelder. http://jth.ch/jbg
Thank you all for your suggestions. I'll be referring back to them. In the meantime, I've updated to the latest snapshot and, like all the best issues, the problem has magically disappeared. Thanks again! -- -James
On 08.11.21 at 23:15, James Pain wrote:
Howdy, all.
TLDR; what log files are important when troubleshooting an issue in TW.
I'm investigating an openSUSE TW KDE machine that's locking up daily while idle. Screen frozen, keyboard lights not responding, can only force off by holding power button. Even the instant reset button on the case doesn't work.
Systemd journal has seemingly nothing of significance. In a hunt to find other logs, I found in the docs a list of log files [0]. However /var/log/warn doesn't seem to exist in TW, and the others don't seem relevant, so I'm continuing a search.
If there was a similar page to the Leap logs page [0] for TW, maybe on the wiki, that'd be useful. Maybe I can make one with the results of this thread.
Thank you in advance.
[0] https://doc.opensuse.org/documentation/leap/startup/html/book-startup/cha-tr...
If it's an AMD Ryzen processor, there are known bugs that crash or lock up the system when idle under Linux (not Windows). Search the net or reply here. You have to set some options manually in the BIOS, and you may also have to update the BIOS. Most boards then run stably.
--
www.becherer.de
On 2021-11-08 17:15, James Pain wrote:
However /var/log/warn doesn't seem to exist in TW,
And it may not be relevant.
As far as I have been able to tell, /var/log/warn, like /var/log/mail.warn, only logs SYSLOG code in programs that have that setting, for example on the command line.
No, I don't think /var/log/warn is for global 'warning'. I do see many 'warnings' that don't seem to be SYSLOG in /var/log/messages. For example:
2021-11-01T08:02:25.989599-04:00 main kernel: [42014.619443] traps: chrome[12677] general protection fault ip:7fa9d5be4af4 sp:7ffca7708390 error:0 in libc-2.26.so[7fa9d5baa000+1b1000]
2021-11-01T09:42:10.377761-04:00 main dbus-daemon[1801]: [session uid=501 pid=1801] Activated service 'org.freedesktop.Notifications' failed: Process org.freedesktop.Notifications exited with status 1
Might things like that be more significant in your opinion?
--
“Reality is so complex, we must move away from dogma, whether it’s conspiracy theories or free-market,” -- James Glattfelder. http://jth.ch/jbg
On 2021-11-08 17:15, James Pain wrote:
TLDR; what log files are important when troubleshooting an issue in TW.
If this was other than TW I'd ask if it was kept up to date, the latest 'zypper up'. I suppose I could ask whatever is relevant for TW, but being TW how would it matter?
I've just bitten the bullet and done a proper upgrade 15.1 -> 15.2 because many of the 15.1 repositories were vanishing or failing to be maintained, and many programs and libraries were differing and crashing or failing to load because of this disparity. I was getting fearful because things like kernel_cars and other things I thought were related to kernel updates were getting through but kernel updates themselves were not.
How are you updating the kernel on that system?
--
“Reality is so complex, we must move away from dogma, whether it’s conspiracy theories or free-market,” -- James Glattfelder. http://jth.ch/jbg
participants (7)
- Andrei Borzenkov
- Anton Aylward
- Carlos E. R.
- Felix Miata
- James Pain
- Michael Hamilton
- Simon Becherer