Re: [opensuse-kernel] Problem with memory allocation / management on SLES 11 SP4
On Thursday, 5 April 2018 11:13:31 CEST, Michal Hocko wrote:
On Wed 04-04-18 15:12:33, Lukáš Krejza wrote:
Dear list,
We are seeing strange behaviour regarding memory allocation for mysqld (MySQL community release 5.7.21) on SLES 11 SP4.
We have a database server (tested on both virtualized and bare-metal hardware) with 24 GB of RAM and 8 GB of swap. This mysqld is configured for our special needs (not many connections, but there can be 1 - 2 transactions that need big caches and a 512 MB in-memory tmp table, ...). We are hosting it on RHEL 7 and SLES 11 SP4. On RHEL, everything works smoothly, but on SLES we constantly get out-of-memory messages from mysqld (not from the kernel) and mysqld crashes.
Is there any trace in the kernel log? Can you trace the allocation failure from mysql?
No, there is not a single message regarding memory allocation in either dmesg or /var/log/messages. Only in mysqld.log.
During the investigation, I noticed that RHEL apparently allows allocating more memory than is actually available (mysqld uses roughly 20 - 40% of the allocated memory). I first thought that this was a feature enabled in RHEL and disabled in SLES (I found out that the feature is called "memory overcommit"), but it is set exactly the same on RHEL and on SLES, so the issue must be elsewhere. (I nevertheless tried setting vm.overcommit_ratio = 100 and vm.overcommit_memory = 1, but without any change.)
Hmm, overcommit_memory = 1 should rule overcommit management out of the picture completely.
Thanks for the confirmation.
So, I would like to ask anyone with knowledge of Linux kernel memory management and of the SLES 11 kernel or mysqld to give us any advice on why SLES is acting differently from RHEL.
Some debug info:
top, free, mysqld.log, /proc/meminfo and ulimit -a shortly before the crash: https://paste.opensuse.org/24626549
What are ulimits for the mysql process? /proc/pid/limits should tell you. Maybe there are different configurations depending on the detected OS?
https://paste.opensuse.org/56447994

There could be a problem. Why do I get "unlimited" from "ulimit -a" when switched to the mysql user, but in /proc/PID/limits I get "Max address space 26853785600"?

This led me to the following investigation. I started launching mysqld in various ways (we are using the openais clustering solution):

1) First I tried to launch mysqld directly as the mysql user. The issue was gone and /proc/PID/limits showed Max address space: "unlimited".
2) Then I tried to launch it via a "service". The issue was again "fixed", with the same symptoms as in 1).
3) Then I tried to launch it again via crm (openais). /proc/PID/limits now showed Max address space: "26853785600"!

So I revised our openais configuration and noticed that on RHEL we are using "primitive service:mysql", which in turn invokes "service mysql start", but on SLES we are using "primitive mysql", which launches mysqld (mysqld_safe) directly. This most probably (if my assumption is correct) meant that mysqld inherited its limits from the openais user "hacluster", which are the defaults.

After changing "primitive mysql" to "primitive service:mysql" in the openais configuration, everything started working correctly. Currently, mysqld is running with over 48 GB of RAM allocated: https://paste.opensuse.org/37596511

And everything runs smooth and fast.

So, the main advice that led me to the resolution was that I should look at "/proc/PID/limits" directly, instead of running "ulimit -a" as the mysql user: "ulimit -a" reports the limits of the shell it runs in, not those of the already running mysqld process.

Thank you very much!
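For anyone who finds this thread later, here is a minimal Python sketch of the check described above: reading the limits of a running process from /proc/PID/limits rather than trusting "ulimit -a" in a fresh shell. The helper names parse_limits and read_limits are made up for this illustration, and the parsing assumes the usual column layout of that file (limit name, soft limit, hard limit, optional units).

```python
import re

# One data row of /proc/<pid>/limits: the limit name is padded with several
# spaces, followed by the soft limit, the hard limit and (optionally) units.
_ROW = re.compile(
    r"^(?P<name>.+?)\s{2,}(?P<soft>\S+)\s+(?P<hard>\S+)(?:\s+(?P<units>\S+))?\s*$"
)


def parse_limits(text):
    """Parse the text of /proc/<pid>/limits into {name: (soft, hard, units)}.

    Values are kept as strings because a limit may be the word "unlimited".
    """
    limits = {}
    for line in text.splitlines()[1:]:  # the first line is the column header
        match = _ROW.match(line)
        if match:
            limits[match["name"]] = (match["soft"], match["hard"], match["units"])
    return limits


def read_limits(pid):
    """Read and parse the limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as handle:
        return parse_limits(handle.read())
```

With the PID of the running mysqld, read_limits(pid)["Max address space"] returns the soft and hard limit that process is actually running under, which is the value that differed between the crm-started and the manually started mysqld here.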
It would also be useful to capture an strace log of mysql at the time when the allocation fails.
Not needed anymore, the issue is solved :).
participants (1): Lukáš Krejza