[opensuse] Limiting memory usage via cgroups
Hi all,

I'm facing a strange problem with cgroups on my fully updated openSUSE 13.2 with kernel 3.16.7-7-desktop.

I run experiments with a tool that sometimes takes a lot of memory; for my purposes, I consider an experiment failed if it takes too much memory or too much time. To limit its memory consumption, on oS 13.1 I used to create a cgroup via the commands:

    mkdir /tmp/experiment
    mount -t cgroup -o memory memory /tmp/experiment
    mkdir /tmp/experiment/test
    echo 8G > /tmp/experiment/test/memory.limit_in_bytes
    echo 8G > /tmp/experiment/test/memory.memsw.limit_in_bytes
    echo experiment_shell_PID > /tmp/experiment/test/tasks

and then, from the shell with PID experiment_shell_PID, I could run my experiments on several files without problems.

Now, with oS 13.2, the memory.memsw.* part is missing, so when the tool requires more than 8 GB of memory, instead of being killed as in oS 13.1, it starts to swap until it takes the whole swap (20 GB) or it is killed by the timeout. As a workaround, I tried to disable the swap via "swapoff -a", but that works for at most 15 minutes, after which swap is re-enabled automatically.

How can I get the oS 13.1 behavior back in oS 13.2?

Best,
Andrea

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
On Fri, 16 Jan 2015 10:41:00 +0800 Andrea Turrini <andrea.turrini@gmail.com> wrote:
How can I get the oS 13.1 behavior back in oS 13.2?
As far as I know, it was always disabled by default and had to be explicitly enabled:

* Wed Sep 21 2011 mhocko@suse.cz
- Provide memory controller swap extension. Keep the feature disabled by default. Use swapaccount=1 kernel boot parameter for enabling it.
2015-01-16 11:27 GMT+08:00 Andrei Borzenkov <arvidjaar@gmail.com>:
As far as I know it was always disabled by default and had to be explicitly enabled
* Wed Sep 21 2011 mhocko@suse.cz - Provide memory controller swap extension. Keep the feature disabled by default. Use swapaccount=1 kernel boot parameter for enabling it.
Thanks. But in oS 13.1 it is not disabled, and swapaccount=1 is not on the kernel command line. Well, now I have added it to the kernel command line.

Best,
Andrea
On 01/16/2015 05:22 AM, Andrea Turrini wrote:
Thanks. But in oS 13.1 it is not disabled and swapaccount=1 is not in the kernel command line. Well, now I have added it to the kernel command line.
Best, Andrea
This commit explains it: http://kernel.suse.com/cgit/kernel-source/commit/?id=4b52d8ebf0bfa4f668e3f51...

"Disable CONFIG_MEMCG_SWAP_ENABLED because it got enabled by accident. The CONFIG_MEMCG_SWAP is enabled but the accounting has to be explicitly allowed by swap_account=1 kernel command line parameter"

Check whether the kernel parameter name is swap_account=1 or swapaccount=1.
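For reference, the upstream kernel documentation spells the parameter swapaccount=1 (no underscore), and Andrea confirms later in the thread that this spelling worked. A minimal sketch for checking a running system follows, assuming a cgroup v1 memory hierarchy; the function names and the default mount point /sys/fs/cgroup/memory are illustrative assumptions, not taken from the thread (with the setup from the original post, pass /tmp/experiment instead).

```shell
#!/bin/sh
# Report whether swap accounting was requested on the kernel command line.
# $1: file holding the command line (defaults to /proc/cmdline).
has_swapaccount() {
    cmdline_file="${1:-/proc/cmdline}"
    [ -r "$cmdline_file" ] || return 1
    # Put each parameter on its own line and match the whole word.
    tr ' ' '\n' < "$cmdline_file" | grep -qx 'swapaccount=1'
}

# Report whether the memory controller exposes the memsw files.
# $1: directory inside the memory hierarchy (default is an assumption).
has_memsw() {
    cgroup_dir="${1:-/sys/fs/cgroup/memory}"
    [ -e "$cgroup_dir/memory.memsw.limit_in_bytes" ]
}

if has_swapaccount; then
    echo "swapaccount=1 is on the kernel command line"
fi
if has_memsw; then
    echo "memory.memsw.* is available"
fi
```

The parameter only takes effect after a reboot, so the second check is the one that tells whether the memsw limits can actually be used.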
On 01/16/2015 03:41 AM, Andrea Turrini wrote:
I run some experiment with a tool that sometimes takes a lot of memory; for my purposes, I consider the experiment failed if it takes too much memory or too much time. To limit its memory consumption, on oS 13.1 I used to create a cgroup via the commands:
I've never worked with cgroups, so I'd use good old ulimit to restrict memory usage. For time restrictions, I recommend the timeout(1) program from the coreutils package:

    $ info '(coreutils) timeout invocation'

Have a nice day,
Berny
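The two suggestions can be combined in one wrapper. The following is only a sketch: run_limited is a hypothetical helper name, and "ulimit -v" caps virtual (address-space) memory, not resident memory.

```shell
#!/bin/sh
# Run a command with a virtual-memory cap and a wall-clock limit.
# $1: address-space limit in KiB (as taken by "ulimit -v")
# $2: time limit in seconds (as taken by timeout)
# remaining arguments: the command to run
run_limited() {
    vm_kib="$1"
    secs="$2"
    shift 2
    # The subshell keeps the lowered limit from sticking to the calling shell.
    (
        ulimit -v "$vm_kib" || exit 1
        exec timeout "$secs" "$@"
    )
}

# GNU timeout exits with status 124 when the time limit is hit.
run_limited 1000000 1 sleep 5
echo "exit status: $?"
```

Because the limit is set inside a subshell, the calling shell keeps its own limits and can start the next experiment with different values.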
2015-01-16 14:36 GMT+08:00 Bernhard Voelker <mail@bernhard-voelker.de>:
I've never worked with cgroups, so I'd use good old ulimit to restrict memory usage. For time restrictions, I recommend the timeout(1) program from the coreutils package:
The problem with ulimit is that it limits the virtual memory, not the resident one, and the variable ratio of virtual to resident memory makes it difficult to find the right value for "ulimit -v" so as to limit the resident memory to 8 GB. For the time limit, I also use timeout.

Best,
Andrea
$ info '(coreutils) timeout invocation'
Have a nice day, Berny
On 01/16/2015 09:11 AM, Andrea Turrini wrote:
The problem of ulimit is that it limits the virtual memory, not the resident one, and the variability of the ratio virtual/resident memory makes it difficult to find the right value to use with "ulimit -v" so to have the resident memory limited to 8Gb.
Then maybe "prlimit --rss=..." of the util-linux package is your friend. ;-)

Have a nice day,
Berny
2015-01-16 17:54 GMT+08:00 Bernhard Voelker <mail@bernhard-voelker.de>:
Then maybe "prlimit --rss=..." of the util-linux package is your friend. ;-)
I didn't know this command; as an experiment, I tried "prlimit --rss=500 <tool>", but the RSS value in "top" reached 1 GB before I killed the process. Since the page size is 4K, it seems prlimit fails to enforce the RSS limit. OK, the documentation says that prlimit tries to modify the value, so I cannot expect it to succeed every time...

Moreover, I want to limit the overall memory used by the tool, not just the resident part (that is why I needed the memsw part of the cgroups).

Best,
Andrea
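This observation matches the getrlimit(2) man page, which says that RLIMIT_RSS only has an effect on Linux 2.4.x kernels before 2.4.30. A small sketch that makes the behavior visible with the shell's own "ulimit -m" (the sizes are arbitrary illustration values):

```shell
#!/bin/sh
# Set a tiny RSS limit in a subshell, then hold far more memory than that.
# On current Linux kernels the limit is stored but not enforced.
(
    ulimit -m 2000      # resident set size limit, in KiB
    echo "RSS limit: $(ulimit -m) KiB"
    # Build an 8 MB string in the shell's own memory.
    big=$(head -c 8000000 /dev/zero | tr '\0' 'x')
    echo "allocated ${#big} bytes"
)
```

The shell reports the limit it was given and still ends up holding several times that amount, which is why a cgroup limit is needed to cap the memory actually used.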
On 01/16/2015 12:04 PM, Andrea Turrini wrote:
Moreover, I want to limit the overall memory used by the tool, not just the resident [...]
prlimit(1) is just a wrapper around prlimit(2), which in turn is just a newer syscall combining getrlimit(2) and setrlimit(2).

Did you mean the AS field then?

Example: "dd'ing by 700M chunks in memory"

    $ prlimit --as=800000000 dd if=/dev/zero of=/dev/null bs=700M count=2
    2+0 records in
    2+0 records out
    1468006400 bytes (1.5 GB) copied, 0.44007 s, 3.3 GB/s

vs. "dd'ing by 900M chunks in memory"

    $ prlimit --as=800000000 dd if=/dev/zero of=/dev/null bs=900M count=2
    dd: memory exhausted by input buffer of size 943718400 bytes (900 MiB)

Have a nice day,
Berny
2015-01-16 19:18 GMT+08:00 Bernhard Voelker <mail@bernhard-voelker.de>:
Did you mean the AS field then?
The AS field seems to affect the virtual memory, not the memory actually used. As an experiment, I ran a trivial Java test program: according to "ps uxa", its resident memory is 25808 while its virtual memory is 5587204. If I use "prlimit --as=1000000000 java test" (1 GB), the JVM fails with the message:

    Error occurred during initialization of VM
    Could not reserve enough space for object heap
    Error: Could not create the Java Virtual Machine.
    Error: A fatal exception has occurred. Program will exit.

You get a similar result using "ulimit -v 1000000", except that you then cannot raise the limit again in the same shell.

I don't care whether the virtual memory is very high; it is the memory actually used that I care about.

Best,
Andrea
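The gap between the two figures can also be read straight from /proc. A minimal sketch, where mem_summary is a hypothetical helper name; the VmSize and VmRSS fields are reported by the kernel in kB, the same unit "ps" uses for its VSZ and RSS columns:

```shell
#!/bin/sh
# Print virtual and resident memory of a process from its status file.
# $1: path to a file in /proc/<pid>/status format
mem_summary() {
    awk '/^VmSize:/ { vsz = $2 }
         /^VmRSS:/  { rss = $2 }
         END { printf "virtual=%s KiB resident=%s KiB\n", vsz, rss }' "$1"
}

# Example: the current shell (only where /proc is available).
if [ -r "/proc/$$/status" ]; then
    mem_summary "/proc/$$/status"
fi
```

A limit on the address space acts on the first number, while the memory cgroup limits act on what the process actually touches.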
On Sun, 18 Jan 2015 02:58, Andrea Turrini <andrea.turrini@...> wrote:
You can obtain a similar situation using "ulimit -v 1000000", except that then you can not increase it anymore in the same shell.
I don't care whether the virtual memory is very high, it is the actually used one I care of.
Best, Andrea
FYI: Note the remark in the ulimit section of "man bashbuiltins":

    ...
    -m     The maximum resident set size (many systems do not honor this limit)
    ...

The prlimit man page does not give that hint.

OTOH, the prlimit man page contains at least one error:

    "... separated by a semicolon (:), in ..."

Uh, tell the other one: (:) is a colon, (;) is a semicolon.

- Yamaban.
On 01/18/2015 03:24 AM, Yamaban wrote:
The AS field seems to affect the virtual memory, not the actual used one. As experiment, I tried to run a trivial test Java program: according to "ps uxa", resident memory is 25808 while virtual memory is 5587204. If I use "prlimit --as=1000000000 java test" (1GB), the JVM fails with the message:

    Error occurred during initialization of VM
    Could not reserve enough space for object heap
    Error: Could not create the Java Virtual Machine.
    Error: A fatal exception has occurred. Program will exit.
The Java VM has its own memory allocation system. If your program is a Java program, then you should probably use the -Xmx... option.
OTOH, the manpage of prlimit contains at least one error:
"... separated by a semicolon (:), in ..."
Uh, tell the other one, (:) is colon, (;) is semicolon.
That typo's already been fixed upstream last July, and also on openSUSE-13.2 (util-linux-2.25.1).

Have a nice day,
Berny
On 2015-01-18 14:57, Bernhard Voelker wrote:
On 01/18/2015 03:24 AM, Yamaban wrote:
OTOH, the manpage of prlimit contains at least one error:
"... separated by a semicolon (:), in ..."
Uh, tell the other one, (:) is colon, (;) is semicolon.
That typo's already been fixed upstream last July, and also on openSUSE-13.2 (util-linux-2.25.1).
What, then, is the correct separator, ":" or ";"?

--
Cheers / Saludos,
Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On Sun, 18 Jan 2015 15:05, Carlos E. R. <robin.listas@...> wrote:
On 2015-01-18 14:57, Bernhard Voelker wrote:
On 01/18/2015 03:24 AM, Yamaban wrote:
OTOH, the manpage of prlimit contains at least one error:
"... separated by a semicolon (:), in ..."
Uh, tell the other one, (:) is colon, (;) is semicolon.
That typo's already been fixed upstream last July, and also on openSUSE-13.2 (util-linux-2.25.1).
What, then, is the correct separator, ":" or ";"?
The colon (:) is the correct one; a semicolon used unprotected (not quoted) would signal the end of the command. And the bug was irritating because in the earliest incarnations of prlimit it was written correctly.

- Yamaban
On 2015-01-18 15:19, Yamaban wrote:
On Sun, 18 Jan 2015 15:05, Carlos E. R. <robin.listas@...> wrote:
What, then, is the correct separator, ":" or ";"?
The colon (:) is the correct one, a semicolon, used unprotected (not quoted), would signal the end-of-command.
Ah, yes, of course. I forgot.
And the bug was irritating because in the earliest incarnations of prlimit it was written correctly.
Things happen...

--
Cheers / Saludos,
Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
2015-01-18 21:57 GMT+08:00 Bernhard Voelker <mail@bernhard-voelker.de>:
The Java VM has its own memory allocation system. If your program is a Java program, then you should probably use the -Xmx... option.
I know -Xmx (and -Xms); I used a Java program to test whether AS limits the actual memory usage or the virtual one.

Best,
Andrea
On 2015-01-16 09:11, Andrea Turrini wrote:
The problem of ulimit is that it limits the virtual memory, not the resident one,
Not true. ulimit is a bash internal command, so man ulimit explains it:

    -m     The maximum resident set size (many systems do not honor this limit)

--
Cheers / Saludos,
Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On 01/16/2015 03:41 AM, Andrea Turrini wrote:
How can I get the oS 13.1 behavior back in oS 13.2?
Best, Andrea
Hi,

I think this is due to a kernel compile-time parameter change.

In 13.1:

    gunzip -c /proc/config.gz | grep -i memcg
    CONFIG_MEMCG=y
    CONFIG_MEMCG_SWAP=y
    CONFIG_MEMCG_SWAP_ENABLED=y
    # CONFIG_MEMCG_KMEM is not set

In 13.2:

    gunzip -c /proc/config.gz | grep -i memcg
    CONFIG_MEMCG=y
    CONFIG_MEMCG_SWAP=y
    # CONFIG_MEMCG_SWAP_ENABLED is not set
    # CONFIG_MEMCG_KMEM is not set

Either build a custom kernel or open a Bugzilla case to request that the parameter be set to "y" again.
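The same comparison can be scripted. This is a sketch for kernels of that era that still carry CONFIG_MEMCG_SWAP_ENABLED; memcg_swap_default is a hypothetical helper name, and it reads the config text from standard input so it works with /proc/config.gz or a saved config file.

```shell
#!/bin/sh
# Tell whether swap accounting is on by default in a kernel config.
# Reads the config text on stdin.
memcg_swap_default() {
    if grep -q '^CONFIG_MEMCG_SWAP_ENABLED=y'; then
        echo "enabled by default"
    else
        echo "disabled by default (boot with swapaccount=1)"
    fi
}

# Example: the running kernel, if it exposes its configuration.
if [ -r /proc/config.gz ]; then
    gunzip -c /proc/config.gz | memcg_swap_default
fi
```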
2015-01-18 23:30 GMT+08:00 Florian Gleixner <flo@redflo.de>:
Either build a custom kernel or open a bugzilla case to request the parameter set to "y" again.
It is easier, faster, and more effective to just add the kernel command line option swapaccount=1 (as I did for oS 13.2 via YaST, which brought back support for memory.memsw.*).

Best,
Andrea
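With swap accounting enabled, the setup from the original post can be wrapped so that a missing memory.memsw.* is reported instead of silently ignored. A sketch under the same cgroup v1 assumptions; limit_mem_and_swap is a hypothetical helper name, and the memory hierarchy is expected to be mounted already.

```shell
#!/bin/sh
# Cap memory and memory+swap for one process in a cgroup v1 memory hierarchy.
# $1: mount point of the memory hierarchy   $2: group name
# $3: limit (e.g. 8G)                       $4: PID to move into the group
limit_mem_and_swap() {
    group_dir="$1/$2"
    mkdir -p "$group_dir" || return 1
    if [ ! -e "$group_dir/memory.memsw.limit_in_bytes" ]; then
        echo "memory.memsw.* missing: boot with swapaccount=1" >&2
        return 2
    fi
    # The memory+swap limit can never be lower than the memory limit,
    # so when lowering both, the memory limit is written first.
    echo "$3" > "$group_dir/memory.limit_in_bytes" &&
    echo "$3" > "$group_dir/memory.memsw.limit_in_bytes" &&
    echo "$4" > "$group_dir/tasks"
}

# Usage (as root, with the hierarchy mounted as in the original post):
#   limit_mem_and_swap /tmp/experiment test 8G "$$"
```

Setting both limits to the same value leaves no room for swap inside the group, which gives the "killed instead of swapping" behavior described for oS 13.1.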
participants (6)
- Andrea Turrini
- Andrei Borzenkov
- Bernhard Voelker
- Carlos E. R.
- Florian Gleixner
- Yamaban