On 12/01/2020 12:13, Per Jessen wrote:
Anton Aylward wrote:
On 12/01/2020 11:51, Per Jessen wrote:
jdd@dodin.org wrote:
When the system starts to run out of memory, the OOM killer will kick in and try to identify who is gobbling up the memory. If it picks an innocent process, it could be seen as the system crashing.
That is possible with the default setting, yes.
However you can also configure it so the OOM_killer targets the process that caused the OOM condition and only that.
Care to share an example?
The details of the what and how are in the long document on the virtual memory settings that I've mentioned here a number of times before: you have control via the VM settings over what happens in OOM conditions. Sadly the default is to scan ALL processes for candidates to kill, or to default to a PANIC. You can, if you read through the docco I referred to,

https://www.kernel.org/doc/Documentation/sysctl/vm.txt

alter that.

==============================================================
oom_kill_allocating_task

This enables or disables killing the OOM-triggering task in
out-of-memory situations.

If this is set to zero, the OOM killer will scan through the entire
tasklist and select a task based on heuristics to kill.  This normally
selects a rogue memory-hogging task that frees up a large amount of
memory when killed.

If this is set to non-zero, the OOM killer simply kills the task that
triggered the out-of-memory condition.  This avoids the expensive
tasklist scan.

If panic_on_oom is selected, it takes precedence over whatever value
is used in oom_kill_allocating_task.

The default value is 0.
==============================================================

and

=============================================================
panic_on_oom

This enables or disables panic on out-of-memory feature.

If this is set to 0, the kernel will kill some rogue process,
called oom_killer.  Usually, oom_killer can kill rogue processes and
system will survive.

If this is set to 1, the kernel panics when out-of-memory happens.
However, if a process limits using nodes by mempolicy/cpusets,
and those nodes become memory exhaustion status, one process
may be killed by oom-killer.  No panic occurs in this case.
Because other nodes' memory may be free.  This means system total
status may be not fatal yet.

If this is set to 2, the kernel panics compulsorily even on the
above-mentioned.  Even oom happens under memory cgroup, the whole
system panics.

The default value is 0.
1 and 2 are for failover of clustering.  Please select either
according to your policy of failover.
panic_on_oom=2+kdump gives you very strong tool to investigate
why oom happens.  You can get snapshot.
=============================================================

More to the point here, you have settings that let you ANALYSE why the OOM occurred.
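For the archives, here is roughly what that looks like in practice -- a minimal sketch, assuming you want the allocating task killed rather than the full tasklist scan. The sysctl names are straight out of vm.txt; the file name under /etc/sysctl.d/ is just my choice:

=============================================================
# take effect immediately
sysctl -w vm.oom_kill_allocating_task=1

# or, if you would rather have a panic you can post-mortem with kdump
sysctl -w vm.panic_on_oom=2

# make it stick across reboots
echo "vm.oom_kill_allocating_task = 1" > /etc/sysctl.d/90-oom.conf
=============================================================

You can check the current values with "sysctl vm.oom_kill_allocating_task vm.panic_on_oom".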
In my experience, the oom killer kills the _best_ process, which is the one that uses the most memory.
NOT! Suppose you have a memory-intensive analytic program that goes backwards and forwards over rows & columns of data and it's been running for a couple of weeks ... Then you start up a small interactive program with a memory leak. There's not much memory available, since the big analytic one has most of it, so this one leaks what it can get, which isn't much by comparison, and causes the OOM condition. You do not want the biggest, that analysis program, to be the one that is killed! You might wonder at the interactive one dying on you, but you definitely don't want to lose all that work!
There is a way of influencing the choice, I'm not sure how.
See above
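Since the question of influencing the choice keeps coming up: the usual knob is /proc/<pid>/oom_score_adj, which runs from -1000 (never pick this one) to +1000 (pick this one first). A rough sketch, with made-up PIDs:

=============================================================
# protect the long-running analytic job
echo -1000 > /proc/12345/oom_score_adj

# make the small leaky interactive program the preferred victim
echo 1000 > /proc/23456/oom_score_adj
=============================================================

The older /proc/<pid>/oom_adj interface still works but has been deprecated in favour of oom_score_adj.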
I had a situation over Christmas where a customer's webserver essentially died (I may have mentioned this situation before) due to trying to serve too many requests. Loads of apache threads gobbled up the memory and the OOM killer decided to get rid of mysql (an innocent, but important, victim).
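One way to keep an important service like mysql off the OOM killer's menu is the same mechanism, wired through systemd. A sketch only -- the unit name and drop-in path are guesses, on some systems the unit is called mariadb.service:

=============================================================
# /etc/systemd/system/mysql.service.d/oom.conf
[Service]
OOMScoreAdjust=-900
=============================================================

followed by "systemctl daemon-reload" and a restart of the service. Of course that doesn't fix the underlying Apache problem, it just changes who gets shot.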
In days of old the front-end Apache started up some threads, just a few, and had them waiting for requests. After servicing a request the thread died and Apache started a new one, so the downstream code, perhaps a Perl application, died with it, and any memory leaks it had accumulated went away too. Then some genius decided that the overhead of startup was too much, so let's have long-lived processes. Never mind if they have memory leaks.

If your Apache is continuously spawning threads then you have a problem. It may be a configuration problem; IIRC there is a setting which determines how many threads there can be (see the sketch below). Or it may be that you are being overwhelmed, either by real traffic or by a Denial-of-Service attack, and the way you have Apache set up it is not throttling, so it services network connections as fast as they come in, regardless of the rate of service.
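For reference, these are the directives I had in mind -- a rough sketch for the prefork MPM, with numbers plucked out of the air. If memory serves, on openSUSE they live in /etc/apache2/server-tuning.conf; your layout may differ.

=============================================================
<IfModule prefork.c>
    StartServers             5
    MinSpareServers          5
    MaxSpareServers         10
    ServerLimit            150
    MaxRequestWorkers      150     # called MaxClients in Apache 2.2 and earlier
    MaxConnectionsPerChild 10000   # recycle children so a leak cannot grow forever
</IfModule>
=============================================================

MaxRequestWorkers caps how many requests Apache will service at once; anything beyond that waits in the listen queue instead of spawning yet more children, which is exactly the throttling that was missing.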