[opensuse] Workstation randomly freezing!
Hello All,
I have a brand new HP Z240 workstation running Leap and it randomly freezes, becoming fully unresponsive both locally and over the network, and I'm forced to power cycle the system. This has been happening multiple times a day with no rhyme or reason: sometimes it's sitting idle after a reboot, sometimes I'm reading email, whatever. The most frustrating part is that there's never anything in the logs to indicate a reason for this.
It's the default installation: BTRFS partitioning scheme, XFS for /home, KDE5, nothing crazy at all.
I've installed and configured kdump in the hope that it will provide some insight after the next occurrence, but given that there are no kernel crash messages in the logs, it's hard to say whether it will help. I guess time will tell.
Does anyone have any thoughts on further debugging this type of issue?
-- Later, Darin
-- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org
On 15/04/2016 12:45 AM, Darin Perusich wrote:
Does anyone have any thoughts further debugging this type of an issue?
Start disabling things. I'd start with the power saving settings: sleep, hibernate, dim screen, etc.
-- Lindsay Mathieson
On Thu, 14 Apr 2016 16:45:31 +0200, Darin Perusich wrote:
Does anyone have any thoughts further debugging this type of an issue?
It's a Skylake system, right? Skylake graphics is weakly supported by the Leap kernel. Try the latest kernel in the OBS Kernel:openSUSE-42.1 repo, at least. I guess 4.1.20 was released in the update repo recently, too, so that should be OK as well.
Also, it might be safer to disable the intel_idle and intel_pstate drivers, e.g. pass the options
    intel_idle.max_cstate=0 intel_pstate=disable
In any case, feel free to open a bug report. Don't forget to attach the output of "hwinfo --all" and the kernel messages (dmesg output) for a while after a fresh boot.
Takashi
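After adding the two options suggested above to the boot loader entry and rebooting, it is worth confirming that they actually reached the running kernel by looking at /proc/cmdline. The following is only a minimal sketch of such a check; the helper names cmdline_has and report_params are made up for illustration and are not part of any openSUSE tool.

```sh
#!/bin/sh
# Succeed if the kernel command line given as $1 contains the exact parameter $2.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print "present" or "missing" for each parameter listed after the command line.
report_params() {
    cmdline=$1
    shift
    for param in "$@"; do
        if cmdline_has "$cmdline" "$param"; then
            echo "present: $param"
        else
            echo "missing: $param"
        fi
    done
}

# Check the running kernel for the two options from this thread.
if [ -r /proc/cmdline ]; then
    report_params "$(cat /proc/cmdline)" intel_idle.max_cstate=0 intel_pstate=disable
fi
```

If either option is reported as missing, the freeze test did not actually run with that driver disabled, so the result says nothing about it.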
On Thu, Apr 14, 2016 at 11:21 AM, Takashi Iwai <tiwai@suse.de> wrote:
It's a Skylake system, right? Skylake graphics is weakly supported by Leap kernel. Try the latest kernel in OBS Kernel:openSUSE-42.1 repo, at least. I guess 4.1.20 was released in the update repo recently, too, so this should be OK, too.
Yes, this is a Skylake system and it's running 4.1.20-11-default, and I see that 4.1.21-2.1 is the version in OBS Kernel:openSUSE-42.1.
Also, it might be safer to disable intel_idle and intel_pstate drivers, e.g. pass options intel_idle.max_cstate=0 intel_pstate=disable
I'll give these options a try before installing the newer kernel.
In anyway, feel free to open the bug report. Don't forget to attach the output of "hwinfo --all" and the kernel messages (dmesg output) for a while after a fresh boot.
I was planning on this but wanted to wait until I had a few work-arounds.
On Thu, Apr 14, 2016 at 12:19 PM, Darin Perusich <darin@darins.net> wrote:
Also, it might be safer to disable intel_idle and intel_pstate drivers, e.g. pass options intel_idle.max_cstate=0 intel_pstate=disable
I'll give these options a try before installing the newer kernel.
Setting these kernel parameters had no effect; the system just stopped responding and needed a power cycle. Unfortunately, kdump didn't capture anything.
-- Later, Darin
On Thu, 14 Apr 2016 22:15:40 +0200, Darin Perusich wrote:
Setting these kernel parameters had no effect, the system just stopped responding and needed a power cycle. Unfortunately kdump didn't capture anything.
You may also try the i915-quickfix KMP in the OBS home:tiwai:bnc974884 repo. It has a couple of backports. Just install i915-quickfix-kmp-default.rpm from that repo, reboot and retest.
Speaking of kdump: you should test kdump manually beforehand, just to check whether it works at all. The YaST2 kdump setup is often too tight and fails when KMS is used. Run
    echo -n c > /proc/sysrq-trigger
and see whether you get a proper kdump.
Takashi
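Because the sysrq test crashes the machine on purpose, it can save a wasted reboot to check first that a crash kernel is loaded at all. A minimal sketch of such a pre-check follows; it assumes the standard /sys/kernel/kexec_crash_loaded flag and a crashkernel= boot parameter, and the function name kdump_ready is hypothetical.

```sh
#!/bin/sh
# kdump_ready LOADED_FLAG CMDLINE
# Succeeds only if a crash kernel is loaded (flag is "1") and a
# crashkernel= reservation appears on the kernel command line.
kdump_ready() {
    [ "$1" = "1" ] || { echo "no crash kernel loaded"; return 1; }
    case " $2 " in
        *" crashkernel="*) ;;
        *) echo "no crashkernel= reservation on the kernel command line"; return 1 ;;
    esac
    echo "kdump looks ready"
}

# Check the running system; "|| true" keeps the script's exit status clean.
if [ -r /sys/kernel/kexec_crash_loaded ] && [ -r /proc/cmdline ]; then
    kdump_ready "$(cat /sys/kernel/kexec_crash_loaded)" "$(cat /proc/cmdline)" || true
fi

# Only once this reports ready, and with all work saved, trigger the
# test crash as root:
#   echo c > /proc/sysrq-trigger
```

A passing check does not guarantee that the dump will succeed, since the reserved memory may still be too small; it only rules out the case where kdump was never armed.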
On Thu, Apr 14, 2016 at 4:27 PM, Takashi Iwai <tiwai@suse.de> wrote:
You may try also i915-quickfix KMP in OBS home:tiwai:bnc974884 repo. It has a couple of backports. Just install i915-quickfix-kmp-default.rpm from that repo, reboot and retest.
Installed and rebooted. I'll add any comments to bugzilla.
Speaking of kdump: you should test kdump manually beforehand, just to check whether it works at all. YaST2 kdump setup is often too tight and fails when KMS is used. Run echo -n c > /proc/sysrq-trigger and see whether you get a proper kdump.
The kdump failed due to insufficient memory, so I'll tweak the values.
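Before changing the reservation, it helps to see what is currently in effect. The sketch below prints the crashkernel= values from the kernel command line and the size the kernel actually reserved, read from /sys/kernel/kexec_crash_size (in bytes). The helper names are invented for this example, and no particular size is recommended here, since the thread does not say which values ended up working.

```sh
#!/bin/sh
# Print every crashkernel= value found in the kernel command line given as $1.
crashkernel_values() {
    for word in $1; do      # deliberate word splitting on whitespace
        case $word in
            crashkernel=*) echo "${word#crashkernel=}" ;;
        esac
    done
}

# Convert a byte count to whole MiB.
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}

# Show what was requested at boot and what the kernel reserved.
if [ -r /proc/cmdline ]; then
    crashkernel_values "$(cat /proc/cmdline)"
fi
if [ -r /sys/kernel/kexec_crash_size ]; then
    echo "reserved: $(bytes_to_mib "$(cat /sys/kernel/kexec_crash_size)") MiB"
fi
```

After raising the value and rebooting, rerunning the sysrq test is the only real confirmation that the new reservation is large enough.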
On Thu, Apr 14, 2016 at 4:27 PM, Takashi Iwai <tiwai@suse.de> wrote:
You may try also i915-quickfix KMP in OBS home:tiwai:bnc974884 repo. It has a couple of backports. Just install i915-quickfix-kmp-default.rpm from that repo, reboot and retest.
This KMP did not resolve the issue; the system stopped responding sometime last night. I will file a bug report.
On Fri, 15 Apr 2016 15:31:46 +0200, Darin Perusich wrote:
These KMP did not resolve the issue, the system stopped responding sometime last night. Will file a bug report.
OK, I didn't expect much from it, as the patches are only relevant with DP MST. In any case, it would be helpful if you could get the crash dump or, at least, the kernel messages at the time of the crash. Otherwise it's difficult to know where to start.
Takashi
On Fri, Apr 15, 2016 at 9:36 AM, Takashi Iwai <tiwai@suse.de> wrote:
OK, I haven't expected it much, as the patches are only relevant with DP MST. In anyway, it'd be helpful if you can get the crash dump, at least, the kernel message at crash. Otherwise it's difficult to know where to start from.
https://bugzilla.opensuse.org/show_bug.cgi?id=975780
We had that happen with a server once, and what we ended up doing was leaving it logged in as root and running with the console 10 screen up (Ctrl+Alt+F10). That way we could see what the last messages were when it hung.
It turned out that the server would randomly spike in memory usage on a particular process and run out of memory, but it wouldn't write anything to the logs about that. So after adding more memory, all was fine and dandy :)
Chris
On 04/14/2016 08:23 AM, Christopher Myers wrote:
We had that happen with a server once, and what we ended up doing was leaving it logged in as root and running with the console 10 screen up (ctrl+alt+F10.) That way we could see what the last messages were when it hung.
Turned out that the server would randomly spike up in memory usage on a particular process, and would run out of memory, but it wouldn't log anything to the logs about that. So after adding more memory, all was fine and dandy :)
Chris
You would see that happening. It wouldn't be a sudden freeze. Your drive would be thrashing madly, your swap would be filling, and if it couldn't handle an allocation call, something would crash. It would slowly freeze, but some things would still work: Caps Lock and Num Lock would still indicate the proper state as you pressed those keys, etc.
Memory usage spikes are not random. There were some other underlying problems. Adding memory is just kicking that can down the road a ways.
-- After all is said and done, more is said than done.
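One way to tell the two cases discussed here apart (gradual memory exhaustion versus a sudden hard freeze) is to record available memory at short intervals and flush each sample to disk, so that the last entries survive a power cycle. This is only a rough sketch: the function names and the log path /var/log/memwatch.log are hypothetical, and it assumes a kernel that reports MemAvailable in /proc/meminfo.

```sh
#!/bin/sh
# Print the MemAvailable value (in kB) from meminfo-formatted text on stdin.
mem_available_kb() {
    awk '$1 == "MemAvailable:" { print $2 }'
}

# Append one timestamped sample to the log file $1 and flush it to disk.
log_sample() {
    printf '%s MemAvailable_kB=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" \
        "$(mem_available_kb < /proc/meminfo)" >> "$1"
    sync
}

# Example loop, to be started by hand as root (hypothetical log path):
#   while :; do log_sample /var/log/memwatch.log; sleep 10; done
```

If the final samples before a freeze show plenty of available memory, an out-of-memory condition is an unlikely explanation; a steady decline towards zero would point the other way.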
On different hardware with a similar problem, I found that KDE was unable to deal with the new hardware. When I switched to GNOME 3.16 from the install DVD, the freezes stopped.
Allen
--- It is the thoughts we don't have that get us in life.
On 04/14/2016 07:45 AM, Darin Perusich wrote:
I have a brand new HP Z240 workstation running Leap and it randomly freezes, becomes fully unresponsive, both locally and over the network, and I'm forced to power cycle the system.
More symptoms please.
Heavy disk activity leading up to the crash?
Do you have a swap file?
Any flashing keyboard lights? Kernel panic?
If no flashing lights, does the Caps Lock key still turn on the Caps Lock light?
Does the machine have a sleep setting in the BIOS that de-powers the NIC?
-- After all is said and done, more is said than done.
On Thu, Apr 14, 2016 at 3:05 PM, John Andersen <jsamyth@gmail.com> wrote:
More symptoms please.
Heavy disk activity leading up to the crash?
None; the last time it happened, the system was idle.
Do you have a swap file?
Yes, it's 2 GB.
Any flashing keyboard lights? Kernel Panic?
No flashing lights or kernel panic, and Caps Lock cannot be toggled on or off after it freezes.
If no flashing lights, does caps-lock key still turn on the caps lock light?
Does the machine have a sleep setting in bios that de-powers the NIC?
I haven't checked, but I will.
I had something similar on new "skylake" hardware. Look at this thread:
https://forums.opensuse.org/showthread.php/514265-Graphics-freeze-computer-u...
It has a solution that worked for me. Also, this bug might be related:
https://bugzilla.novell.com/show_bug.cgi?id=971695
-- Met vriendelijke groet / Best regards, Wilfred van Velzen
participants (7)
- Allen Wilkinson
- Christopher Myers
- Darin Perusich
- John Andersen
- Lindsay Mathieson
- Takashi Iwai
- Wilfred van Velzen