[opensuse] Lower memory issue
We are running openSUSE 10.3 x86 with the 2.6.22.5-31-bigsmp #1 SMP kernel on a Dell PowerEdge with 8 GB of memory and 2x quad-core Intel Xeon 3.0 GHz CPUs. Swap is set to 2 GB. The application running is a Java application.

The problem we are currently having is that the oom-killer keeps sniping processes on this server after it runs for a couple of days. We tracked it down to a low-memory issue. When the server is freshly restarted, we have about 800 MB of free low memory; as time goes on, the amount used keeps going up, and eventually the OOM-killer starts sniping processes, anything from sendmail to the bash shell. We noted that the oom-killer snipes processes even though the server is not using any swap at all and there is plenty of high memory left.

Is there anything we can do to solve this issue? I believe the bigsmp kernel is supposed to handle up to 64 GB of memory. Unfortunately, we can't really switch the application (even though it is Java-based) to a 64-bit OS. So our options are limited to either finding a fix for this or lowering the memory in this box to 4 GB, which is not something we plan to do unless absolutely necessary.

We also have another box with the same OS, kernel, and hardware spec that runs Tomcat and has the same issue, but on that box, due to the application's load, it takes many weeks before low memory is exhausted.
Thanks.

Output of free -lm:

             total       used       free     shared    buffers     cached
Mem:          8115        545       7570          0         93        118
Low:           821        150        670
High:         7294        394       6900
-/+ buffers/cache:        333       7782
Swap:         2047          0       2047

java invoked oom-killer: gfp_mask=0xd0, order=1, oomkilladj=0
 [<c0159b12>] out_of_memory+0x69/0x1a7
 [<c015b0a6>] __alloc_pages+0x219/0x2d6
 [<c015b18f>] __get_free_pages+0x2c/0x3a
 [<c01253c7>] copy_process+0xa4/0x10c5
 [<c013571c>] alloc_pid+0x1f1/0x24e
 [<c0126654>] do_fork+0x9a/0x1c2
 [<c01d81b3>] copy_to_user+0x25/0x3a
 [<c01031d6>] sys_clone+0x36/0x3b
 [<c0104f22>] syscall_call+0x7/0xb
 [<c02c0000>] _decode_session4+0x1c5/0x1cd
=======================
Mem-info:
DMA per-cpu:
CPU 0: Hot: hi: 0, btch: 1 usd: 0   Cold: hi: 0, btch: 1 usd: 0
CPU 1: Hot: hi: 0, btch: 1 usd: 0   Cold: hi: 0, btch: 1 usd: 0
CPU 2: Hot: hi: 0, btch: 1 usd: 0   Cold: hi: 0, btch: 1 usd: 0
CPU 3: Hot: hi: 0, btch: 1 usd: 0   Cold: hi: 0, btch: 1 usd: 0
CPU 4: Hot: hi: 0, btch: 1 usd: 0   Cold: hi: 0, btch: 1 usd: 0
CPU 5: Hot: hi: 0, btch: 1 usd: 0   Cold: hi: 0, btch: 1 usd: 0
CPU 6: Hot: hi: 0, btch: 1 usd: 0   Cold: hi: 0, btch: 1 usd: 0
CPU 7: Hot: hi: 0, btch: 1 usd: 0   Cold: hi: 0, btch: 1 usd: 0
Normal per-cpu:
CPU 0: Hot: hi: 186, btch: 31 usd: 37    Cold: hi: 62, btch: 15 usd: 47
CPU 1: Hot: hi: 186, btch: 31 usd: 26    Cold: hi: 62, btch: 15 usd: 58
CPU 2: Hot: hi: 186, btch: 31 usd: 131   Cold: hi: 62, btch: 15 usd: 47
CPU 3: Hot: hi: 186, btch: 31 usd: 144   Cold: hi: 62, btch: 15 usd: 61
CPU 4: Hot: hi: 186, btch: 31 usd: 89    Cold: hi: 62, btch: 15 usd: 49
CPU 5: Hot: hi: 186, btch: 31 usd: 29    Cold: hi: 62, btch: 15 usd: 11
CPU 6: Hot: hi: 186, btch: 31 usd: 23    Cold: hi: 62, btch: 15 usd: 55
CPU 7: Hot: hi: 186, btch: 31 usd: 23    Cold: hi: 62, btch: 15 usd: 8
HighMem per-cpu:
CPU 0: Hot: hi: 186, btch: 31 usd: 1     Cold: hi: 62, btch: 15 usd: 10
CPU 1: Hot: hi: 186, btch: 31 usd: 152   Cold: hi: 62, btch: 15 usd: 6
CPU 2: Hot: hi: 186, btch: 31 usd: 173   Cold: hi: 62, btch: 15 usd: 0
CPU 3: Hot: hi: 186, btch: 31 usd: 175   Cold: hi: 62, btch: 15 usd: 8
CPU 4: Hot: hi: 186, btch: 31 usd: 16    Cold: hi: 62, btch: 15 usd: 9
CPU 5: Hot: hi: 186, btch: 31 usd: 100   Cold: hi: 62, btch: 15 usd: 12
CPU 6: Hot: hi: 186, btch: 31 usd: 29    Cold: hi: 62, btch: 15 usd: 1
CPU 7: Hot: hi: 186, btch: 31 usd: 55    Cold: hi: 62, btch: 15 usd: 5
Active:93948 inactive:13893 dirty:1037 writeback:0 unstable:0 free:1759319 slab:3144 mapped:8179 pagetables:447 bounce:0
DMA free:3556kB min:68kB low:84kB high:100kB active:0kB inactive:0kB present:16256kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 873 8874
Normal free:3664kB min:3744kB low:4680kB high:5616kB active:1372kB inactive:952kB present:894080kB pages_scanned:3206 all_unreclaimable? yes
lowmem_reserve[]: 0 0 64008
HighMem free:7030056kB min:512kB low:9096kB high:17684kB active:374420kB inactive:54620kB present:8193024kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 0*4kB 2*8kB 1*16kB 2*32kB 0*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3552kB
Normal: 0*4kB 0*8kB 1*16kB 0*32kB 1*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3664kB
HighMem: 434*4kB 134*8kB 85*16kB 73*32kB 35*64kB 10*128kB 230*256kB 126*512kB 63*1024kB 38*2048kB 1649*4096kB = 7030056kB
Swap cache: add 0, delete 0, find 0/0, race 0+0
Free swap  = 2097144kB
Total swap = 2097144kB
Free swap:  2097144kB
2293760 pages of RAM
2064384 pages of HIGHMEM
216101 reserved pages
18750 pages shared
0 pages swap cached
1037 pages dirty
0 pages writeback
8179 pages mapped
3144 pages slab
447 pages pagetables
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
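The dump shows the squeeze directly: the Normal (low) zone is down to 3664 kB free against a min watermark of 3744 kB, which is exactly the point where an order-1 kernel allocation (here, fork via copy_process) triggers the OOM killer. A minimal watcher sketch; the values below are embedded from the figures in this dump so the script is self-contained, but on a live box you would read /proc/meminfo instead (field names as on 2.6 kernels with HIGHMEM), and the 16 MB warning threshold is an arbitrary choice, not a kernel constant:

```shell
# Watch low-memory exhaustion on a 32-bit HIGHMEM kernel.
# Sample values are taken from the Mem-info dump in this thread;
# on a live system replace `meminfo` with `cat /proc/meminfo`.
meminfo() {
cat <<'EOF'
LowTotal:       840704 kB
LowFree:          3664 kB
Slab:            12576 kB
PageTables:       1788 kB
EOF
}

lowfree=$(meminfo | awk '/^LowFree:/ {print $2}')
slab=$(meminfo | awk '/^Slab:/ {print $2}')
echo "LowFree: ${lowfree} kB, Slab: ${slab} kB"

# Warn well before the zone watermark is reached (threshold is arbitrary)
if [ "$lowfree" -lt 16384 ]; then
    echo "WARNING: low memory nearly exhausted"
fi
```

Run from cron, this would let you chart LowFree and Slab over the couple of days it takes the box to degrade, rather than finding out from the OOM killer.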
On Tue, Sep 2, 2008 at 6:10 PM, Irwan Hadi
We are running opensuse 10.3 X86 with 2.6.22.5-31-bigsmp #1 SMP kernel, on a Dell PowerEdge with 8 GB memory, and 2 X Quad Core Intel Xeon 3.0 Ghz The swap memory is set to be 2 GB.
The application running is a Java application.
Just to be sure... was swap mounted? swapon -a?

--
----------JSA---------
Someone stole my tag line, so now I have this rental.
Swap is mounted, but again, when the oom-killer is invoked by the kernel, the server is not using any swap space at all.
In fact, of the 8 GB that the box has, it only uses about 1.6 GB of RAM according to top output.
Thanks
On Tue, Sep 2, 2008 at 8:24 PM, John Andersen
On Tue, Sep 2, 2008 at 6:10 PM, Irwan Hadi
wrote: We are running opensuse 10.3 X86 with 2.6.22.5-31-bigsmp #1 SMP kernel, on a Dell PowerEdge with 8 GB memory, and 2 X Quad Core Intel Xeon 3.0 Ghz The swap memory is set to be 2 GB.
The application running is a Java application.
Just to be sure...
Swap was mounted ? swapon -a ?
-- ----------JSA--------- Someone stole my tag line, so now I have this rental.
On Tue, Sep 2, 2008 at 11:29 PM, Irwan Hadi
swap is mounted, but again when oom-kill is invoked by kernel, the server is not using any swap space at all. In fact from the 8 GB that the box has, it only uses about 1.6 GB of RAM according to top output.
Thanks
On Tue, Sep 2, 2008 at 8:24 PM, John Andersen
wrote: On Tue, Sep 2, 2008 at 6:10 PM, Irwan Hadi
wrote: We are running opensuse 10.3 X86 with 2.6.22.5-31-bigsmp #1 SMP kernel, on a Dell PowerEdge with 8 GB memory, and 2 X Quad Core Intel Xeon 3.0 Ghz
I missed the part where you said why that box wasn't running a 64-bit kernel (not that I think it's necessarily germane).

--
----------JSA---------
Someone stole my tag line, so now I have this rental.
On Tuesday 02 September 2008 18:10, Irwan Hadi wrote:
We are running opensuse 10.3 X86 with 2.6.22.5-31-bigsmp #1 SMP kernel, on a Dell PowerEdge with 8 GB memory, and 2 X Quad Core Intel Xeon 3.0 Ghz The swap memory is set to be 2 GB.
The application running is a Java application.
...
We noted that oom-killer snipes processes even though the server is not using any swap at all, and there is plenty of high memory left.
Is there anything that we can do to solve this issue? I believe bigsmp kernel is supposed to be able to handle memory up to 64 GB. We can't really switch the application (even though it is a java based application) to 64 bits O/S unfortunately. ...
I'm not sure you're analyzing the problem correctly:

1) 32-bit Java applications can use no more than 2 GB each.

2) Java applications do not intrinsically have a 32-bit or 64-bit memory model. That is determined only by the JVM on which they're executing. Thus any Java application can avail itself of a larger virtual (and physical) address space by the simple expedient of running it under a 64-bit JVM. That, of course, requires a 64-bit OS.
...
Randall Schulz
On Tue, Sep 2, 2008 at 8:59 PM, Randall R Schulz
I'm not sure you're analyzing the problem correctly:
1) 32-bit Java applications can use no more than 2 GB each.
2) Java applications do not intrinsically have a 32-bit or 64-bit memory model. That is determined only by the JVM on which they're executing. Thus any Java application can avail itself of a larger virtual (and physical) address space by the simple expedient of running it under a 64-bit JVM. That, of course, requires a 64-bit OS.
Yes, I know that on a 32-bit platform each Java process can only use up to 2 GB of memory. The application is configured with the Xms and Xmx parameters set to 1024m each, and even with this the OOM-killer is still sniping processes on this box (sendmail, bash, and the Java application itself).

I also must rephrase my original comment: the problem really is not with Java per se, but the fact that even though the box has 8 GB of RAM, the kernel invokes the OOM-killer when the box is only using 1.5 GB of memory and no swap at all. We traced it to a low-memory issue: after a couple of days, the amount of low memory decreases from 800 MB to 8 MB and lower, and this is what invokes the OOM-killer.

So the question is, what is the root cause of this issue? Is it the Java application? Is there an updated bigsmp kernel with this oom-killer behavior fixed? And if there is no fix available, how can we make sure the kernel's OOM-killer won't snipe the processes on this box (sendmail, bash, or the Java application itself)?

We tried turning off the oom-killer with the following command while debugging the issue, and got "no such file or directory":

echo "0" > /proc/sys/vm/oom-kill

Thanks
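On i386, low memory is consumed mostly by kernel-side allocations (slab caches, page tables), so a useful next step is to find out which slab cache is growing over those couple of days. A minimal sketch of ranking caches by size; the three cache lines below are an embedded sample, not data from this server, and on a live box you would read /proc/slabinfo directly or simply run slabtop:

```shell
# Rank slab caches by approximate memory use (num_objs * objsize).
# Columns mimic /proc/slabinfo: name, active_objs, num_objs, objsize.
# These rows are a made-up sample; live usage: parse /proc/slabinfo.
slabinfo() {
cat <<'EOF'
buffer_head      204800 204800     48
dentry_cache      51200  51200    132
inode_cache        4096   4096    320
EOF
}

slabinfo | awk '{ kb = $3 * $4 / 1024; printf "%-16s %8.0f kB\n", $1, kb }' | sort -k2 -rn
```

Comparing two such snapshots taken a day apart would show whether one cache (and which one) is eating the low zone, which is more actionable than the OOM killer's post-mortem dump.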
...
Randall Schulz
Hi Irwan,

On Wednesday, 3 September 2008, 08:41, Irwan Hadi wrote:
We tried turning off the oom-killer with the following command as we are debugging the issue, and we got "no such file or directory" echo "0" > /proc/sys/vm/oom-kill
Try this:

echo 2 > /proc/sys/vm/overcommit_memory

Back to normal:

echo 0 > /proc/sys/vm/overcommit_memory

HTH
Ralf
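For context on what mode 2 does: the kernel then enforces a hard commit limit of swap + RAM * overcommit_ratio/100 (the ratio defaults to 50) and refuses allocations up front instead of overcommitting and OOM-killing later. A sketch of the arithmetic using the figures from the Mem-info dump earlier in the thread (2293760 pages of RAM, i.e. 9175040 kB, and 2097144 kB of swap):

```shell
# Commit limit under vm.overcommit_memory=2:
#   CommitLimit = swap + ram * overcommit_ratio / 100
ram_kb=9175040      # 2293760 pages * 4 kB, from the Mem-info dump
swap_kb=2097144     # total swap, from the same dump
ratio=50            # default vm.overcommit_ratio
limit_kb=$(( swap_kb + ram_kb * ratio / 100 ))
echo "CommitLimit: ${limit_kb} kB"
```

Worth noting as a caveat: strict overcommit caps userspace address-space commitments only; if low memory is being eaten by kernel-side allocations, this limit will not stop that growth.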
Irwan Hadi wrote:
So the question is, what is the root cause of this issue? Is it the Java application?
We don't know, because you have not shown any evidence whatsoever about your problem, but it can be at least two things:

a) a Java bug: update your JVM to a recent version and/or bug Sun about it.
b) your code needs to be profiled to figure out what is going on; bug your programmers ;-)
echo "0" > /proc/sys/vm/oom-kill
No, not like that! ;P

echo 2 > /proc/sys/vm/overcommit_memory

PS: just use a 64-bit OS.

--
"A computer is like an Old Testament god, with a lot of rules and no mercy."
Cristian Rodríguez R.
Platform/OpenSUSE - Core Services
SUSE LINUX Products GmbH
Research & Development
http://www.opensuse.org/
participants (5):
- Cristian Rodríguez
- Irwan Hadi
- John Andersen
- Ralf Meyer
- Randall R Schulz