[opensuse-kernel] Bug in 11.4 kernel? - mmap() with MAP_NORESERVE failing [Chrome Google native client fails on Opensuse 11.4 64]
Hi (long post),

I am not sure this is the right list; my post last week on the openSUSE 64-bit forum has not drawn any comments so far. I am trying to resolve a failure of Google Native Client on openSUSE 11.4 64-bit, discussed in the Google Native Client forum:

[1] https://groups.google.com/forum/#!topic/native-client-discuss/7DUFfi_BxqM

To repeat the issue in a nutshell:
---------------------------------------------

Google Native Client (Chrome) works on recent versions of at least Ubuntu [1], but fails on openSUSE 11.4 (with all latest updates up to Nov 4). The failure can be reproduced in Chrome 14, 15 and 16 (from http://dl.google.com/linux/chrome/rpm/stable/x86_64) and verified by loading

[2] http://www.gonacl.com/dev/demos/sdk_examples/load_progress/load_progress.htm...

The problem / question for openSUSE (kernel?):
------------------------------------------------------------------

There is a long discussion in the above thread; to get to the point quickly: the Google developers identified an issue with mmap() with MAP_NORESERVE (see below). They believe it may be a bug or a kernel configuration issue(?)

A Chrome NaCl developer suggested that the following code should print "Success", but it fails in my testing:

    #include <stdio.h>
    #include <sys/mman.h>

    int main(void) {
      void *addr;

      printf("Hello world\nAllocating 29Gb...\n");
      addr = mmap((void *) NULL, 29 * (((size_t) 1) << 30), PROT_NONE,
                  MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0);  /* test 29 or other values */
      if (MAP_FAILED == addr) {
        printf("FAILED\n");
      } else {
        printf("Success: %p\n", addr);
      }
      return 0;
    }

This prints FAILED on openSUSE 11.4 64-bit.

I did some experiments. On my system:

    # cat /proc/meminfo
    MemTotal:       15948428 kB
    MemFree:        11270612 kB
    ....
    CommitLimit:    28945728 kB
    Committed_AS:    4918284 kB
    ...

From running the test program above, it looks like *CommitLimit* is clearly used as the upper limit of mmap(MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE), *no matter what vm.overcommit_memory* setting is used.

In concrete terms:

    mmap((void *) NULL, 29 * (((size_t) 1) << 30), PROT_NONE,
         MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0);  // always FAILS with a value of 29 or higher

    mmap((void *) NULL, 28 * (((size_t) 1) << 30), PROT_NONE,
         MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0);  // always SUCCEEDS with a value of 28 or lower

This holds no matter which setting I use:

    sysctl -w vm.overcommit_memory=0   # or 1 or 2

(This seems to be an openSUSE 11.4 bug in modes 0 and 1, as according to [4] anonymous private readonly should have 0 cost.)

Any comments or solutions or how to fix this?

Thanks,

Milan

[3] http://www.mjmwired.net/kernel/Documentation/filesystems/proc.txt
[4] http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting

=======================
PS: I am attaching a few pieces of info about my system that may be relevant:

The hardware is a 4-core AMD Athlon II X4 610e with 16 GB (sixteen) of memory, running very little (just the KDE desktop at this point).

    # swapon -s
    Filename        Type        Size      Used  Priority
    /dev/sda1       partition   20971516  0     -1

    # cat /proc/sys/vm/overcommit_memory
    0

    # ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 123980
    max locked memory       (kbytes, -l) 64
    max memory size         (kbytes, -m) 13556224
    open files                      (-n) 1024
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) 8192
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 123980
    virtual memory          (kbytes, -v) 29536000
    file locks                      (-x) unlimited

    # df
    Filesystem     1K-blocks     Used Available Use% Mounted on
    rootfs          82567856 15558248  62815408  20% /
    devtmpfs         7934744      244   7934500   1% /dev
    tmpfs            7974212     1592   7972620   1% /dev/shm
    /dev/sda2       82567856 15558248  62815408  20% /
    /dev/sda3      377510440 90623252 267710692  26% /home

--
To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-kernel+owner@opensuse.org
On 11/09/2011 01:50 PM, milan zimmermann wrote:
> Hi (long post)
>
> I am not sure this is the right list; my post last week on Opensuse 64 bit forum did not yield comments so far.

I used Chrome on openSUSE 11.4 64-bit successfully for a few weeks. Its quirks pushed me back to Firefox, though.

That said, what does ulimit -a tell you?

On my 11.4 64-bit system, I can't reproduce your failure. I see:

    jeffm@sled2:~> ./a.out
    Hello world
    Allocating 29Gb...
    Success: 0x7f7f79e57000
    jeffm@sled2:~> uname -a
    Linux sled2 2.6.37.6-0.9-desktop #1 SMP PREEMPT 2011-10-19 22:33:27 +0200 x86_64 x86_64 x86_64 GNU/Linux

-Jeff
--
Jeff Mahoney
SUSE Labs
inline ...

On Wed, Nov 9, 2011 at 2:10 PM, Jeff Mahoney <jeffm@suse.de> wrote:
> On 11/09/2011 01:50 PM, milan zimmermann wrote:
>> Hi (long post)
>>
>> I am not sure this is the right list; my post last week on Opensuse 64 bit forum did not yield comments so far.
>
> I used Chrome on openSUSE 11.4 64-bit successfully for a few weeks. Its quirks pushed me back to Firefox, though.
>
> That said, what does ulimit -a tell you?

The result is (slightly misformatted) at the end of my original email (along with a few other commands). It seems to show unlimited ... but here it is again:

    ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 123980
    max locked memory       (kbytes, -l) 64
    max memory size         (kbytes, -m) 13556224
    open files                      (-n) 1024
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) 8192
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 123980
    virtual memory          (kbytes, -v) 29536000
    file locks                      (-x) unlimited

Let me know if I should run anything else, thanks.
On 11/09/2011 03:21 PM, milan zimmermann wrote:
> inline ...
>
> On Wed, Nov 9, 2011 at 2:10 PM, Jeff Mahoney <jeffm@suse.de> wrote:
>> On 11/09/2011 01:50 PM, milan zimmermann wrote:
>>> Hi (long post)
>>>
>>> I am not sure this is the right list; my post last week on Opensuse 64 bit forum did not yield comments so far.
>>
>> I used Chrome on openSUSE 11.4 64-bit successfully for a few weeks. It's quirks pushed me back to Firefox, though.
>>
>> That said, what does ulimit -a tell you?
>
> result is (slightly misformatted) at the end of my original email (also a few other commands) - it seems to show unlimited ... but here it's again:

Ah. You're right. Sorry about that.

> virtual memory (kbytes, -v) 29536000

... but here is your problem.

-Jeff
--
Jeff Mahoney
SUSE Labs
On Wed, Nov 9, 2011 at 3:41 PM, Jeff Mahoney <jeffm@suse.de> wrote:
> virtual memory (kbytes, -v) 29536000
>
> ... but here is your problem.

I disagree. With the switches I used, and with

    sysctl -w vm.overcommit_memory=0   # or 1

according to the link I posted

http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting

the cost of allocation with these parameters should be 0, and the call

    mmap((void *) NULL, someSize * (((size_t) 1) << 30), PROT_NONE,
         MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0);

(where someSize is an integer number of gigabytes) should succeed no matter how much virtual memory I have, and no matter what size I am allocating.

So far I have no other explanation than that this is a bug (does anyone agree?). On the Chrome list, people indicated that the same issue was fixed in Ubuntu lately (an mmap of 84 GB succeeds with far less virtual memory). I only have 4 boxes with openSUSE ... I will also try openSUSE 12.1 and report.

thanks,
milan
On 11/10/2011 11:06 AM, milan zimmermann wrote:
> On Wed, Nov 9, 2011 at 3:41 PM, Jeff Mahoney <jeffm@suse.de> wrote:
>> virtual memory (kbytes, -v) 29536000
>>
>> ... but here is your problem.
>
> I disagree. With the switches I used, and with

Then test it. On my system, I see:

    jeffm@jetfire:~> ulimit -v
    unlimited
    jeffm@jetfire:~> ./a.out
    Hello world
    Allocating 29Gb...
    Success: 0x7fc7732d6000
    jeffm@jetfire:~> ulimit -v 28000000
    jeffm@jetfire:~> ./a.out
    Hello world
    Allocating 29Gb...
    FAILED

> sysctl -w vm.overcommit_memory=0 # or 1
>
> according to the link I posted
>
> http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting
>
> the cost of allocation with these parameters should be 0, and the call
>
> mmap((void *) NULL, someSize * (((size_t) 1) << 30), PROT_NONE, MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0); # someSize is integer with number of Gigs
>
> should succeed no matter what virtual memory, and no matter what size I am allocating.

As far as the kernel generally is concerned, yes. You're hitting user resource limits, not a kernel out-of-memory condition. If you had overcommit disabled, then you might run into that issue.
> So far I have no other explanation than this is a bug (anyone agrees?). On the chrome list, people indicated same issue was fixed in Ubuntu lately (mmap 84Gb succeeds with way less virtual memory). I only have 4 boxes with Opensuse ... will also try Opensuse 12.1 and report

I blame Windows for perpetuating this myth that "virtual memory" = "real memory + swap", since that's not at all what it means. Virtual memory is the address space. You're allocating part of the address space but you're not using it.

-Jeff
-- Jeff Mahoney SUSE Labs
-- To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-kernel+owner@opensuse.org
On Thu, Nov 10, 2011 at 10:15 AM, Jeff Mahoney <jeffm@suse.de> wrote:
On 11/10/2011 11:06 AM, milan zimmermann wrote:
On Wed, Nov 9, 2011 at 3:41 PM, Jeff Mahoney <jeffm@suse.de> wrote:
virtual memory (kbytes, -v) 29536000
... but here is your problem.
I disagree. With the switches I used, and with
Then test it. On my system, I see:
jeffm@jetfire:~> ulimit -v unlimited jeffm@jetfire:~> ./a.out Hello world Allocating 29Gb... Success: 0x7fc7732d6000
jeffm@jetfire:~> ulimit -v 28000000 jeffm@jetfire:~> ./a.out Hello world Allocating 29Gb... FAILED
You are right, that is how it works, I agree (my "disagree" was too harsh there :) ). I should have said that I am not sure ulimit should matter with these particular flags, as no memory should be allocated (reserved). But I am only basing that on a half-understanding of this: http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting (I think you are describing the difference below; I also have a question there)
sysctl -w vm.overcommit_memory=0 # or 1
according to the link I posted
http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting
the cost of allocation with these parameters should be 0, and the call
mmap((void *) NULL, someSize * (((size_t) 1) << 30), PROT_NONE, MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0); # someSize is integer with number of Gigs
should succeed no matter what virtual memory, and no matter what size I am allocating.
As far as the kernel generally is concerned, yes. You're hitting user resource limits, not a kernel out-of-memory condition. If you had overcommit disabled, then you might run into that issue.
So it seems that the user resource limit is checked first. Let me ask a speculative question (but I would appreciate a comment as it may help getting Nacl working on Linux, it seems their code is non-reserving 84Gb this way) :)

So the question: Do you think that on any Linux, with ulimit virtual memory set to X Gb, this call will always fail:

mmap((void *) NULL, (X+1) * (((size_t) 1) << 30), PROT_NONE, MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0);

Thanks (one more comment below)
So far I have no other explanation than that this is a bug (does anyone agree?). On the chrome list, people indicated the same issue was fixed in Ubuntu lately (mmap of 84Gb succeeds with way less virtual memory). I only have 4 boxes with Opensuse ... will also try Opensuse 12.1 and report
I blame Windows for perpetuating this myth that "virtual memory" = "real memory + swap" since that's not at all what it means. Virtual memory is the address space. You're allocating part of the address space but you're not using it.
That is why I thought the user limit should be checked at the point I actually use it (not during MAP_NORESERVE), but certainly have little support for that :) Thanks for your help and comments, milan
-Jeff
let me know if i should run anything else, thanks
On my 11.4 64-bit system, I can't reproduce your failure. I see:
jeffm@sled2:~> ./a.out Hello world Allocating 29Gb... Success: 0x7f7f79e57000 jeffm@sled2:~> uname -a Linux sled2 2.6.37.6-0.9-desktop #1 SMP PREEMPT 2011-10-19 22:33:27 +0200 x86_64 x86_64 x86_64 GNU/Linux
-Jeff
On 11/10/2011 02:43 PM, milan zimmermann wrote:
On Thu, Nov 10, 2011 at 10:15 AM, Jeff Mahoney <jeffm@suse.de> wrote: On 11/10/2011 11:06 AM, milan zimmermann wrote:
On Wed, Nov 9, 2011 at 3:41 PM, Jeff Mahoney <jeffm@suse.de> wrote:
>> virtual memory (kbytes, -v) 29536000
... but here is your problem.
I disagree. With the switches I used, and with
Then test it. On my system, I see:
jeffm@jetfire:~> ulimit -v unlimited jeffm@jetfire:~> ./a.out Hello world Allocating 29Gb... Success: 0x7fc7732d6000
jeffm@jetfire:~> ulimit -v 28000000 jeffm@jetfire:~> ./a.out Hello world Allocating 29Gb... FAILED
You are right, that is how it works, I agree (my "disagree" was too harsh there :) ).
I should have said that I am not sure ulimit should matter with these particular flags, as no memory should be allocated (reserved). But I am only basing that on a half-understanding of this:
http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting
(I think you are describing the difference below, also a question there)
sysctl -w vm.overcommit_memory=0 # or 1
according to the link I posted
http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting
I am, but there's another key difference here. There's a "max memory size" ulimit that controls actual allocations. The "virtual memory" one controls how much of the address space you can use. They're two separate things.

Here's a quick rundown.

Overcommit covers memory use that is backed by memory itself plus any swap space you have.

mmap can be used to assign portions of the address space to objects and not all of them are backed by memory+swap.

For example, your program executable itself, any libraries you load, and any files you mmap() read-only are all backed by the file on-disk unless they're modified by private mappings. The kernel knows that it can drop those pages and read them back in from the file, similar to how it can read swapped pages back from the swap space.

Take a look at /proc/<pid>/maps to see what I mean.

For your test case, I see this:

jeffm@jetfire:~> ./a.out
Hello world
Allocating 29Gb...
Success: 0x7f0ad3de2000 pid=3485

jeffm@jetfire:~> cat /proc/3485/maps
00400000-00401000 r-xp 00000000 fd:03 131888 /home/jeffm/a.out
00600000-00601000 r--p 00000000 fd:03 131888 /home/jeffm/a.out
00601000-00602000 rw-p 00001000 fd:03 131888 /home/jeffm/a.out
7f0ad3de2000-7f1213de2000 ---p 00000000 00:00 0 # Here's the test mmap
7f1213de2000-7f1213f67000 r-xp 00000000 fd:01 262980 /lib64/libc-2.14.1.so
7f1213f67000-7f1214167000 ---p 00185000 fd:01 262980 /lib64/libc-2.14.1.so
7f1214167000-7f121416b000 r--p 00185000 fd:01 262980 /lib64/libc-2.14.1.so
7f121416b000-7f121416c000 rw-p 00189000 fd:01 262980 /lib64/libc-2.14.1.so
7f121416c000-7f1214171000 rw-p 00000000 00:00 0
7f1214171000-7f1214191000 r-xp 00000000 fd:01 262991 /lib64/ld-2.14.1.so
7f1214372000-7f1214375000 rw-p 00000000 00:00 0
7f121438f000-7f1214391000 rw-p 00000000 00:00 0
7f1214391000-7f1214392000 r--p 00020000 fd:01 262991 /lib64/ld-2.14.1.so
7f1214392000-7f1214393000 rw-p 00021000 fd:01 262991 /lib64/ld-2.14.1.so
7f1214393000-7f1214394000 rw-p 00000000 00:00 0
7fff0588a000-7fff058ab000 rw-p 00000000 00:00 0 [stack]
7fff0593a000-7fff0593c000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

The columns are: address range, perms, offset, dev, ino, usage.

The perms are:
r - read
w - write
x - exec
p - private (its absence means shared)

So, we can see that the first three lines cover the test program itself.

The next line is the mmap, which with PROT_NONE is actually backed by nothing since it's inaccessible.

The next 4 lines are (obv) libc, etc.

So that's a bunch of address space used when in fact only the chunks with "w" and "p" in the permissions need to actually be backed by swap. It's actually a bit more complicated than that, but for the sake of a simple example, it's enough.

the cost of allocation with these parameters should be 0, and the call
mmap((void *) NULL, someSize * (((size_t) 1) << 30), PROT_NONE, MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0); # someSize is integer with number of Gigs
should succeed no matter what virtual memory, and no matter what size I am allocating.
As far as the kernel generally is concerned, yes. You're hitting user resource limits, not a kernel out-of-memory condition. If you had overcommit disabled, then you might run into that issue.
So it seems that the user resource limit is checked first. Let me ask a speculative question (but I would appreciate a comment as it may help getting Nacl working on Linux, it seems their code is non-reserving 84Gb this way)
:)
So the question: Do you think that on any Linux, with ulimit virtual memory set to X Gb, this call will always fail:
mmap((void *) NULL, (X+1) * (((size_t) 1) << 30), PROT_NONE, MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0);
Thanks (one more comment below)
There's no reason to think it will _always_ fail but it'd be easy enough for them to check the failure case and compare it to the ulimit. On my freshly installed 12.1-rc1 system, there's no limit for virtual memory set for my account.
So far I have no other explanation than that this is a bug (does anyone agree?). On the chrome list, people indicated the same issue was fixed in Ubuntu lately (mmap of 84Gb succeeds with way less virtual memory). I only have 4 boxes with Opensuse ... will also try Opensuse 12.1 and report
I blame Windows for perpetuating this myth that "virtual memory" = "real memory + swap" since that's not at all what it means. Virtual memory is the address space. You're allocating part of the address space but you're not using it.
That is why I thought the user limit should be checked at the point I actually use it (not during MAP_NORESERVE), but certainly have little support for that :)
Well, that can get tricky. It's certainly possible but you won't like how it will enforce that limit. Rather than return an error code, your process will get a SIGBUS and be killed.

-Jeff
Thanks for your help and comments,
milan
-Jeff
-- Jeff Mahoney SUSE Labs
Sorry to jump in late in the game. Just a few comments. On Thu 10-11-11 16:10:38, Jeff Mahoney wrote:
On 11/10/2011 02:43 PM, milan zimmermann wrote:
On Thu, Nov 10, 2011 at 10:15 AM, Jeff Mahoney <jeffm@suse.de> wrote: On 11/10/2011 11:06 AM, milan zimmermann wrote:
On Wed, Nov 9, 2011 at 3:41 PM, Jeff Mahoney <jeffm@suse.de> wrote: [...] You are right, that is how it works, I agree (my "disagree" was too harsh there :) ).
I should have said that I am not sure ulimit should matter with these particular flags, as no memory should be allocated (reserved). But I am only basing that on a half-understanding of this:
http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting
(I think you are describing the difference below, also a question there)
I am, but there's another key difference here. There's a "max memory size" ulimit that controls actual allocations. The "virtual memory" one controls how much of the address space you can use. They're two separate things.
Yes, have a look at may_expand_vm (mm/mmap.c). It is called from all functions that manipulate your address space (do_brk - for heap allocation, acct_stack_growth - from the stack expand routine, mmap_region - for mmap calls, vma_to_resize - for mremap). ulimit is per-process while overcommit is per system. [...]
Overcommit covers memory use that is backed by memory itself plus any swap space you have.
Overcommit basically says how much we are willing to risk at the virtual memory allocation time (not backed by any real memory at the moment) that things will blow up when we fault that memory in (make it resident in the RAM).

OVERCOMMIT_GUESS (0) - try to be intelligent and guess what could be dangerous (have a look at __vm_enough_memory for more details)
OVERCOMMIT_ALWAYS (1) - says "hey we don't care and we can take a risk of bail out (OOM killer) if things go wrong" - seems to be the current banking model ;)
OVERCOMMIT_NEVER (2) - be really strict that nobody allocates more than what is backed by a real memory (RAM+swap + a configurable percentage)

As Jeff already noted this affects only anonymous memory and private writable file mappings because those cannot be just dropped to be re-read from the disk later. [...]
Take a look at /proc/<pid>/maps to see what I mean.
/proc/<pid>/smaps is even more descriptive. It will tell you how much memory from the mapping is resident, swapped out etc... [...]

HTH

Regards
-- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic
Hi Jeff:

Thanks for the details and your time analysing this. I think I will now point the Nacl list (who provided the test program) to this thread, to see if they can find the right solution, as this all started with Nacl not working on Opensuse 11.4 (while it is claimed to work on, for example, Ubuntu).

I realize both Nacl and the test code now work on Opensuse 12.1 RC2 (as you pointed out and as I tried yesterday as well), but it hinges on the distro setting no "virtual memory" ulimit. So, if I understand correctly, if the Nacl code mmaps a huge space without checking the "virtual memory" ulimit, Nacl will always fail on systems with that ulimit set to a finite value smaller than the mapping. I'd like Nacl to work on any Linux distro and installation, in particular Opensuse :) and hope this thread will help in that direction...

There are a few comments I added inline below (although I am definitely in over my head on the details). I am happy with the explanation and hope it will be useful to others.

Thanks again for your analysis and explanation,

milan

On Thu, Nov 10, 2011 at 3:10 PM, Jeff Mahoney <jeffm@suse.de> wrote:
On 11/10/2011 02:43 PM, milan zimmermann wrote:
On Thu, Nov 10, 2011 at 10:15 AM, Jeff Mahoney <jeffm@suse.de> wrote: On 11/10/2011 11:06 AM, milan zimmermann wrote:
On Wed, Nov 9, 2011 at 3:41 PM, Jeff Mahoney <jeffm@suse.de> wrote:
>>> virtual memory (kbytes, -v) 29536000
... but here is your problem.
I disagree. With the switches I used, and with
Then test it. On my system, I see:
jeffm@jetfire:~> ulimit -v unlimited jeffm@jetfire:~> ./a.out Hello world Allocating 29Gb... Success: 0x7fc7732d6000
jeffm@jetfire:~> ulimit -v 28000000 jeffm@jetfire:~> ./a.out Hello world Allocating 29Gb... FAILED
You are right, that is how it works, I agree (my "disagree" was too harsh there :) ).
I should have said that I am not sure ulimit should matter with these particular flags, as no memory should be allocated (reserved). But I am only basing that on a half-understanding of this:
http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting
(I think you are describing the difference below, also a question there)
I am, but there's another key difference here. There's a "max memory size" ulimit that controls actual allocations. The "virtual memory" one controls how much of the address space you can use. They're two separate things.
yes, understand that, thanks
Here's a quick run down.
Overcommit covers memory use that is backed by memory itself plus any swap space you have.
mmap can be used to assign portions of the address space to objects and not all of them are backed by memory+swap.
yes
For example, your program executable itself, any libraries you load, and any files you mmap() read-only are all backed by the file on-disk unless they're modified by private mappings. The kernel knows that it can drop those pages and read them back in from the file, similar to how it can read swapped pages back from the swap space.
getting a bit hard for me but yes, makes sense
Take a look at /proc/<pid>/maps to see what I mean.
For your test case, I see this:
jeffm@jetfire:~> ./a.out Hello world Allocating 29Gb... Success: 0x7f0ad3de2000 pid=3485
jeffm@jetfire:~> cat /proc/3485/maps
00400000-00401000 r-xp 00000000 fd:03 131888 /home/jeffm/a.out
00600000-00601000 r--p 00000000 fd:03 131888 /home/jeffm/a.out
00601000-00602000 rw-p 00001000 fd:03 131888 /home/jeffm/a.out
7f0ad3de2000-7f1213de2000 ---p 00000000 00:00 0 # Here's the test mmap
7f1213de2000-7f1213f67000 r-xp 00000000 fd:01 262980 /lib64/libc-2.14.1.so
7f1213f67000-7f1214167000 ---p 00185000 fd:01 262980 /lib64/libc-2.14.1.so
7f1214167000-7f121416b000 r--p 00185000 fd:01 262980 /lib64/libc-2.14.1.so
7f121416b000-7f121416c000 rw-p 00189000 fd:01 262980 /lib64/libc-2.14.1.so
7f121416c000-7f1214171000 rw-p 00000000 00:00 0
7f1214171000-7f1214191000 r-xp 00000000 fd:01 262991 /lib64/ld-2.14.1.so
7f1214372000-7f1214375000 rw-p 00000000 00:00 0
7f121438f000-7f1214391000 rw-p 00000000 00:00 0
7f1214391000-7f1214392000 r--p 00020000 fd:01 262991 /lib64/ld-2.14.1.so
7f1214392000-7f1214393000 rw-p 00021000 fd:01 262991 /lib64/ld-2.14.1.so
7f1214393000-7f1214394000 rw-p 00000000 00:00 0
7fff0588a000-7fff058ab000 rw-p 00000000 00:00 0 [stack]
7fff0593a000-7fff0593c000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

The columns are: address range, perms, offset, dev, ino, usage.

The perms are:
r - read
w - write
x - exec
p - private (its absence means shared)
So, we can see that the first three lines cover the test program itself.
ok
The next line is the mmap, which with PROT_NONE is actually backed by nothing since it's inaccessible.
Out of curiosity, what is PROT_NONE here? Thanks:
7f0ad3de2000-7f1213de2000 ---p 00000000 00:00 0 # Here's the test mmap
The next 4 lines are (obv) libc, etc.
ok
So that's a bunch of address space used when in fact only the chunks with "w" and "p" in the permissions need to actually be backed by swap. It's actually a bit more complicated than that, but for the sake of a simple example, it's enough.
yes, I appreciate the details, thanks
sysctl -w vm.overcommit_memory=0 # or 1
according to the link I posted
http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting
the cost of allocation with these parameters should be 0, and
the call
mmap((void *) NULL, someSize * (((size_t) 1) << 30), PROT_NONE, MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0); # someSize is integer with number of Gigs
should succeed no matter what virtual memory, and no matter what size I am allocating.
As far as the kernel generally is concerned, yes. You're hitting user resource limits, not a kernel out-of-memory condition. If you had overcommit disabled, then you might run into that issue.
So it seems that the user resource limit is checked first. Let me ask a speculative question (but I would appreciate a comment as it may help getting Nacl working on Linux, it seems their code is non-reserving 84Gb this way)
:)
So the question: Do you think that on any Linux, with ulimit virtual memory set to X Gb, this call will always fail:
mmap((void *) NULL, (X+1) * (((size_t) 1) << 30), PROT_NONE, MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0);
Thanks (one more comment below)
There's no reason to think it will _always_ fail but it'd be easy enough for them to check the failure case and compare it to the ulimit.
yes, this is what I hope can be done, I will point this out on the Nacl list
On my freshly installed 12.1-rc1 system, there's no limit for virtual memory set for my account.
yes, and nacl works there as well as the test code.
So far I have no other explanation than that this is a bug (does anyone agree?). On the chrome list, people indicated the same issue was fixed in Ubuntu lately (mmap of 84Gb succeeds with way less virtual memory). I only have 4 boxes with Opensuse ... will also try Opensuse 12.1 and report
I blame Windows for perpetuating this myth that "virtual memory" = "real memory + swap" since that's not at all what it means. Virtual memory is the address space. You're allocating part of the address space but you're not using it.
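The difference between address space and memory in use can be observed directly. The sketch below assumes a Linux /proc filesystem with the VmSize and VmRSS fields in /proc/self/status; the helper names status_kb and mapping_growth_kb are hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: read a kB field such as "VmSize" or "VmRSS" from
 * /proc/self/status. Returns -1 if the field cannot be read. */
long status_kb(const char *field)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long kb = -1;
    size_t n = strlen(field);

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, field, n) == 0 && line[n] == ':') {
            sscanf(line + n + 1, "%ld", &kb);
            break;
        }
    }
    fclose(f);
    return kb;
}

/* Map len bytes with PROT_NONE and MAP_NORESERVE, report how much the
 * given field grew (in kB) while the mapping existed, then unmap it.
 * Returns -1 if the mapping or the /proc read fails. */
long mapping_growth_kb(const char *field, size_t len)
{
    long before = status_kb(field);
    long after;
    void *addr = mmap(NULL, len, PROT_NONE,
                      MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0);

    if (addr == MAP_FAILED)
        return -1;
    after = status_kb(field);
    munmap(addr, len);
    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

With a 64 MB mapping, mapping_growth_kb("VmSize", ...) should report roughly the size of the mapping, while mapping_growth_kb("VmRSS", ...) should stay close to zero because no page of the mapping is ever touched.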
That is why I thought the user limit should be checked at the point I actually use it (not during MAP_NORESERVE), but certainly have little support for that :)
Well, that can get tricky. It's certainly possible but you won't like how it will enforce that limit. Rather than return an error code, your process will get a SIGBUS and be killed.
Yes, I would accept that part; the consequence would be that most Nacl programs run and only those that actually go over the limit fail. Although if it would kill the browser, that is not a good solution either; anyway, I am getting speculative here. [I think this is related to the vm.overcommit_memory=0 described here http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting ? (no need to comment, I need to stop at some point).] Thanks again, Milan
-Jeff
Thanks for your help and comments,
milan
-Jeff
-- Jeff Mahoney SUSE Labs
-- To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-kernel+owner@opensuse.org
Hi, On Fri 11-11-11 13:03:17, milan zimmermann wrote:
Hi Jeff:
[...]
I realize both Nacl and the test code now work on Opensuse 12.1 RC2 (as you pointed out and I tried as well yesterday), but it hinges on the distro setting no "virtual memory" ulimit.
The question is why Nacl maps such a huge memory mapping with PROT_NONE.
So, if I understand correctly, if the Nacl code mmaps huge space without checking "virtual memory" ulimit, Nacl will always be failing on systems with ulimit set to a finite value;
true
On Thu, Nov 10, 2011 at 3:10 PM, Jeff Mahoney <jeffm@suse.de> wrote:
On 11/10/2011 02:43 PM, milan zimmermann wrote:
On Thu, Nov 10, 2011 at 10:15 AM, Jeff Mahoney <jeffm@suse.de> wrote: On 11/10/2011 11:06 AM, milan zimmermann wrote:
On Wed, Nov 9, 2011 at 3:41 PM, Jeff Mahoney <jeffm@suse.de> [...] The next line is the mmap, which with PROT_NONE is actually backed by nothing since it's inaccessible.
Out of curiosity, what is PROT_NONE here? Thanks:
The third column describes protections rwx if none is set it is PROT_NONE. `p' stands for a private mapping.
7f0ad3de2000-7f1213de2000 ---p 00000000 00:00 0 # Here's the test mmap
[...]
That is why I thought the user limit should be checked at the point I actually use it (not during MAP_NORESERVE), but certainly have little support for that :)
Please note that we are talking about virtual memory, which is created at mmap time. If you want to check something at runtime (when the mapping is used), then you are looking for RSS limit checking. Moreover, MAP_NORESERVE is not considered for the user limit at all, because ulimit simply enforces the size of the mapping and doesn't care about usage of the mapping. [...] -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic
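The behaviour described here can be tried without touching the shell's ulimit. This is a sketch under the assumption that "ulimit -v" maps to RLIMIT_AS; try_reserve_under_limit is a hypothetical name. It lowers only the soft limit of the calling process, attempts the PROT_NONE / MAP_NORESERVE mapping, and then restores the previous soft limit.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Hypothetical demo: with the soft RLIMIT_AS set to limit bytes, try to
 * map len bytes with PROT_NONE and MAP_NORESERVE.
 * Returns 1 if mmap succeeded, 0 if it failed with ENOMEM,
 * and -1 on any other error (for example limit above the hard limit). */
int try_reserve_under_limit(rlim_t limit, size_t len)
{
    struct rlimit old, lowered;
    void *addr;
    int result;

    if (getrlimit(RLIMIT_AS, &old) != 0)
        return -1;
    lowered = old;
    lowered.rlim_cur = limit;            /* soft limit only */
    if (setrlimit(RLIMIT_AS, &lowered) != 0)
        return -1;

    addr = mmap(NULL, len, PROT_NONE,
                MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0);
    if (addr != MAP_FAILED) {
        munmap(addr, len);
        result = 1;
    } else {
        result = (errno == ENOMEM) ? 0 : -1;
    }

    setrlimit(RLIMIT_AS, &old);          /* put the previous soft limit back */
    return result;
}
```

With a 1 GB limit, a 2 GB request is expected to return 0 even though the mapping is never touched, while a small request well under the limit is expected to return 1.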
Thanks Jeff / Michal: It looks like based on your help in this thread, the Nacl/Chromium guys filed a bug: http://code.google.com/p/nativeclient/issues/detail?id=2438 Thanks again for your help. A few notes inline: On Mon, Nov 14, 2011 at 4:15 AM, Michal Hocko <mhocko@suse.cz> wrote:
Hi,
On Fri 11-11-11 13:03:17, milan zimmermann wrote:
Hi Jeff:
[...]
I realize both Nacl and the test code now work on Opensuse 12.1 RC2 (as you pointed out and I tried as well yesterday), but it hinges on the distro setting no "virtual memory" ulimit.
The question is why Nacl maps such a huge memory mapping with PROT_NONE.
I am not quite sure. One of the comments on the Nacl list is "Nacl security depends on the ability to allocate 40GB of address space". Rather than me interpreting, you may want to check the original post https://groups.google.com/forum/#!msg/native-client-discuss/7DUFfi_BxqM/36Gw... and the bug Nacl filed based on this http://code.google.com/p/nativeclient/issues/detail?id=2438
So, if I understand correctly, if the Nacl code mmaps huge space without checking "virtual memory" ulimit, Nacl will always be failing on systems with ulimit set to a finite value;
true
On Thu, Nov 10, 2011 at 3:10 PM, Jeff Mahoney <jeffm@suse.de> wrote:
On 11/10/2011 02:43 PM, milan zimmermann wrote:
On Thu, Nov 10, 2011 at 10:15 AM, Jeff Mahoney <jeffm@suse.de> wrote: On 11/10/2011 11:06 AM, milan zimmermann wrote:
> On Wed, Nov 9, 2011 at 3:41 PM, Jeff Mahoney <jeffm@suse.de> [...] The next line is the mmap, which with PROT_NONE is actually backed by nothing since it's inaccessible.
Out of curiosity, what is PROT_NONE here? Thanks:
The third column describes protections rwx if none is set it is PROT_NONE. `p' stands for a private mapping.
ok, thanks
7f0ad3de2000-7f1213de2000 ---p 00000000 00:00 0 # Here's the test mmap
[...]
That is why I thought the user limit should be checked at the point I actually use it (not during MAP_NORESERVE), but certainly have little support for that :)
Please note that we are talking about virtual memory, which is created at mmap time.
yes
If you want to check something at runtime (when the mapping is used), then you are looking for RSS limit checking. Moreover, MAP_NORESERVE is not considered for the user limit at all, because ulimit simply enforces the size of the mapping and doesn't care about usage of the mapping.
Michal, do you mean that ulimit should not be looked at when mmap has MAP_NORESERVE? Ah, maybe not, sounds like you mean ulimit will always be used and limit the mmap size no matter what mmap parameters? Thanks milan
[...] -- Michal Hocko SUSE Labs SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic
On 09/11/11 17:10, Jeff Mahoney wrote:
On 11/09/2011 01:50 PM, milan zimmermann wrote:
Hi (long post)
I am not sure this is the right list; my post last week on Opensuse 64 bit forum did not yield comments so far.
I used Chrome on openSUSE 11.4 64-bit successfully for a few weeks. Its quirks pushed me back to Firefox, though.
That said, what does ulimit -a tell you?
On my 11.4 64-bit system, I can't reproduce your failure. I see:
jeffm@sled2:~> ./a.out
Hello world
Allocating 29Gb...
Success: 0x7f7f79e57000
jeffm@sled2:~> uname -a
Linux sled2 2.6.37.6-0.9-desktop #1 SMP PREEMPT 2011-10-19 22:33:27 +0200 x86_64 x86_64 x86_64 GNU/Linux
-Jeff
12.1 x86_64 here.
./a.out cristian@linux-us4g
Hello world
Allocating 29Gb...
FAILED
Linux linux-us4g 3.1.0-3-desktop #1 SMP PREEMPT Thu Nov 3 16:13:19 UTC 2011 (0bcf578) x86_64 x86_64 x86_64 GNU/Linux
participants (4)
- Cristian Rodríguez
- Jeff Mahoney
- Michal Hocko
- milan zimmermann