[opensuse-buildservice] Out of memory: Kill process 5593 (python) score 304 or sacrifice child
Hi,
Does anyone know what triggers such a kill and how to avoid it?
I was preparing a Japanese input library, libkkc, here: https://build.opensuse.org/package/live_build_log?arch=x86_64&package=libkkc...
Upstream requires some n-gram dictionary data to be processed by python, and it takes so long (50000+ strings), long enough for OBS to kill it.
I need that lib and its data... if I can't do it on OBS, I have to process it manually, but that would violate our no-binary policy...
Any workarounds?
Thanks
Marguerite
-- 
To unsubscribe, e-mail: opensuse-buildservice+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-buildservice+owner@opensuse.org
On Fri, Apr 26, 2013 at 03:20:53AM +0800, Marguerite Su wrote:
upstream requires some n-gram dictionary data to be processed by python and it takes so long (50000+ strings), enough long for OBS to kill it.
I need that lib and its data...if can't do it on OBS, I have to manually process it, but it will violate our no binary policy...
It runs out of memory. constraints?
https://build.opensuse.org/package/view_file?expand=1&file=_constraints&pack...
Regards, Martin
On Fri, Apr 26, 2013 at 4:53 AM, Martin Koegler <martin.koegler@chello.at> wrote:
It runs out of memory. constraints?
Hi, Martin
It works for i586, but x86_64 is still out of memory.
I have given it 4GB of memory. (On a local machine with 3GB of RAM I can build it, just with 98% CPU usage.)
So should I increase the memory again, or give it more CPU (if that is possible)?
Thanks
Marguerite
On Fri, Apr 26, 2013 at 12:06 PM, Marguerite Su <i@marguerite.su> wrote:
On Fri, Apr 26, 2013 at 4:53 AM, Martin Koegler <martin.koegler@chello.at> wrote:
It runs out of memory. constraints?
Hi, Martin
It works for i586, but x86_64 is still out of memory.
I have given it 4GB memory. (on a local machine with 3GB ram, I can build it, just 98% cpu usage)
so should I increase memory again, or give it some cpu(possible)?
x86_64 will require about 50% more RAM (6G), assuming regular-sized strings (word-size).
You could also ask upstream to optimize a bit. Or optimize yourself if you can.
For instance, holding unicode strings in memory takes a hefty amount of RAM. Although it's recommended practice to always work with unicode, when you're handling this much data it's not so hot. So, just encoding everything in UTF-8, in-memory, will shrink your memory usage by a lot (string data itself will be one quarter size, though pointers and PyObject headers will still be big). That's a fairly simple (albeit tricky) optimization.
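To put rough numbers on the unicode-versus-UTF-8 point, here is a minimal sketch. It assumes a "wide" Python 2 build that stores 4 bytes per code point, which was common on Linux at the time; the function name `payload_bytes` is made up for illustration, and only the raw character data is counted, not the per-object headers. Modern Python 3 stores strings more compactly, so real figures there will differ.

```python
def payload_bytes(text):
    """Return (ucs4_bytes, utf8_bytes) for the character data of `text`.

    Only the raw character payload is counted; object headers and
    pointers are ignored.
    """
    # Assumption: a wide build stores every code point in 4 bytes.
    ucs4 = 4 * len(text)
    # Actual length of the same text once encoded as UTF-8.
    utf8 = len(text.encode("utf-8"))
    return ucs4, utf8


for sample in ("ngram", "日本語"):
    ucs4, utf8 = payload_bytes(sample)
    print(sample, ucs4, utf8)
```

Under that assumption, ASCII text shrinks to a quarter of its size, while kana and kanji take 3 bytes each in UTF-8, so for Japanese data the saving on the string payload is closer to 25%.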
Hi, Claudio, On Fri, Apr 26, 2013 at 11:22 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
x86_64 will require about 50% more RAM (6G), assuming regular-sized strings (word-size).
but the 3GB machine is powered by openSUSE x86_64. (I just wonder why a 28mb data can cost so much ram...)
You could also ask upstream to optimize a bit. Or optimize yourself if you can.
For instance, holding unicode strings in memory takes a hefty amount of ram, although it's recommended practice to always work with unicode, when you're handling this much data it's not so hot. So, just encoding all in utf8, in-memory, will shrink your memory usage by a lot (string data itself will be one quarter size, though pointers and PyObject headers will still be big). That's a fairly simple (albeit tricky) optimization.
Sorry... I don't know how to code, so I can't follow your explanation, even in such plain English...
On Fri, Apr 26, 2013 at 12:54 PM, Marguerite Su <i@marguerite.su> wrote:
x86_64 will require about 50% more RAM (6G), assuming regular-sized strings (word-size).
but the 3GB machine is powered by openSUSE x86_64. (I just wonder why a 28mb data can cost so much ram...)
Um... swap usage?
It's not uncommon to need lots of RAM to *generate* 28mb worth of distilled data.
Monitor resource usage (RAM, cpu and swap) during your local build. It may prove useful.
Also, do you have a link to browsable code? i.e. upstream's git/svn repo or something? If so I could take a look and perhaps give a suggestion or two. No promises though.
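One way to monitor memory without watching top by hand is to let a small wrapper report the peak once the command has finished. This is only a sketch: the helper name `run_and_report_peak` is invented here, it relies on the Unix-only `resource` module, and it assumes Linux, where `ru_maxrss` is reported in kilobytes. It measures resident memory only, so swap usage would still have to be checked separately.

```python
import resource
import subprocess
import sys


def run_and_report_peak(cmd):
    """Run `cmd` to completion and return (exit_code, peak_rss_kb).

    RUSAGE_CHILDREN covers child processes that have already been
    waited for; its ru_maxrss is the peak resident set size of the
    largest of them (kilobytes on Linux).
    """
    exit_code = subprocess.call(cmd)
    peak_kb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return exit_code, peak_kb


# Demo: a child interpreter that allocates about 64 MiB and exits.
demo_cmd = [sys.executable, "-c", "x = bytearray(64 * 1024 * 1024)"]
code, peak = run_and_report_peak(demo_cmd)
print("exit code %d, peak RSS about %.1f MiB" % (code, peak / 1024.0))
```

For the build in question, `demo_cmd` would be replaced by the real command line, for example the sortlm.py invocation from the build log.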
On Sat, Apr 27, 2013 at 12:00 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
Um... swap usage?
Yep, I noticed the python _process_ do use swap, but the ram use wasn't even full. I mean, I have 3GB ram, but it uses about 40%. and it still use swap. Its cpu usage is always 98%~100%. But on OBS, it complains out of memory, even in a 4GB suitcase.
Monitor resource usage (RAM, cpu and swap) during your local build. It may prove useful.
I'll print my "top -d 1 -p <pid>" later.
Also, do you have a link at browsable code? ie: upstream's git/svn repo or something? If so I could take a look and perhaps give a suggestion or two. No promises though.
Sure, http://gitorious.org/libkkc
You can take a look at libkkc-data/tools/sortlm.py and libkkc-data/data/models
Actually this is a planned feature for Fedora 19. I "stole" it from their idea page. But now I suspect it may not be solid enough to act as their "default" Japanese IME.
I don't think their koji can afford to build it, and our fcitx developers have also examined the weaknesses in its code: it just crashes.
On Fri, Apr 26, 2013 at 2:58 PM, Marguerite Su <i@marguerite.su> wrote:
On Sat, Apr 27, 2013 at 12:00 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
Um... swap usage?
Yep, I noticed the python _process_ do use swap, but the ram use wasn't even full.
I mean, I have 3GB ram, but it uses about 40%. and it still use swap.
There's your problem then. I don't think OBS workers have swap.
Also, do you have a link at browsable code? ie: upstream's git/svn repo or something? If so I could take a look and perhaps give a suggestion or two. No promises though.
Sure, http://gitorious.org/libkkc
You can take a look at libkkc-data/tools/sortlm.py and libkkc-data/data/models
I don't see any input files in models. But from the code, ram usage seems to be linear on the input, except maybe for the Trie. I don't think the sorting phase can use more than 10 times the size of the input files. Does it die before building the trie?
If so... you're SOL I'd say. The trie is another project based on a C library, and tries aren't known for their compactness. You'll just have to request enough RAM to build.
On Sat, Apr 27, 2013 at 6:45 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
I mean, I have 3GB ram, but it uses about 40%. and it still use swap.
There's your problem then. I don't think OBS workers have swap.
Hi, Claudio,
I'm an absolute beginner with the shell, so I may be misleading you. Here's the `top -d 1 -p <pid>` output, please check it:
When compiling data:
      PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
    27447 marguer+  20   0 1050m 1.0g 2596 R  89.9 33.6  4:52.83  python
When finished and writing data:
      PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
    27447 marguer+  20   0 1376m 1.2g  772 D   1.0 41.5  6:10.79  python
It seems I took the percentage of memory usage for the absolute memory usage...
It might be 1.2g for a single pid. But it used 2 pids on my dual-core laptop.
So does that mean that on a 4-core VM it will eat 1.2x4 = 4.8g, or with 50% more, 7.2g of memory?
I don't see any input files in models. But from the code, ram usage seems to be linear on the input, except maybe for the Trie. I don't think the sorting phase can use more than 10 times the size of the input files. Does it die before building the trie?
It takes data/models/text3/data.arpa (150mb) as input. I don't know how to distinguish the load phase from the sort phase, but OBS kills it here:
    [   71s] /usr/bin/python -B ../../tools/sortlm.py \
    [   71s] ./text3/data.arpa sorted3/data
    [ 8016s] [ 7998.603176] Out of memory: Kill process 5630 (python) score 259 or sacrifice child
    [ 8016s] [ 7998.604144] Killed process 5630 (python) total-vm:1357352kB, anon-rss:836216kB, file-rss:0kB
The actual result should be:
    /usr/bin/python -B ../../tools/sortlm.py \
    ./text3/data.arpa sorted3/data
    /usr/bin/python -B ../../tools/sortlm.py \
    ./text3/data.arpa sorted3/data
    reading N-grams
    reading N-grams
    min cost = -5.847901
    writing 1-gram file
    writing 2-gram file
    min cost = -5.847901
    writing 1-gram file
    writing 2-gram file
    writing 3-gram file
    writing 3-gram file
    /usr/bin/python -B ../../tools/genfilter.py \
    sorted3/data.2gram \
    sorted3/data.2gram.filter \
    12 8
    /usr/bin/python -B ../../tools/genfilter.py \
    sorted3/data.3gram \
    sorted3/data.3gram.filter \
    10 8
    /usr/bin/python -B ../../tools/genfilter.py \
    sorted3/data.2gram \
    sorted3/data.2gram.filter \
    12 8
    /usr/bin/python -B ../../tools/sortlm.py \
    ./text3/data.arpa sorted3/data
    reading N-grams
    /usr/bin/python -B ../../tools/genfilter.py \
    sorted3/data.3gram \
    sorted3/data.3gram.filter \
    10 8
    min cost = -5.847901
    writing 1-gram file
    writing 2-gram file
    writing 3-gram file
    /usr/bin/python -B ../../tools/genfilter.py \
    sorted3/data.2gram \
    sorted3/data.2gram.filter \
    12 8
    /usr/bin/python -B ../../tools/genfilter.py \
    sorted3/data.3gram \
    sorted3/data.3gram.filter \
    10 8
If so... you're SOL I'd say. The trie is another project based on a C library, and tries aren't known for their compactness. You'll just have to request enough RAM to build.
How much memory can a single build request at most without harming the others?
Greetings
Marguerite
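The kernel's "Killed process" line in the build log above already states how much memory the victim held when it was killed. As a rough sketch, it can be picked apart with a regular expression; the pattern and the name `parse_oom_kill` are assumptions that match the message format shown in this thread, not a general parser for every kernel version.

```python
import re

# Matches the "Killed process ..." line as it appears in the log above.
OOM_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, "
    r"anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)


def parse_oom_kill(line):
    """Return a dict describing the killed process, or None if no match."""
    match = OOM_RE.search(line)
    if match is None:
        return None
    return {
        "pid": int(match.group("pid")),
        "name": match.group("name"),
        "total_vm_kb": int(match.group("total_vm")),
        "anon_rss_kb": int(match.group("anon_rss")),
        "file_rss_kb": int(match.group("file_rss")),
    }


log_line = ("[ 8016s] [ 7998.604144] Killed process 5630 (python) "
            "total-vm:1357352kB, anon-rss:836216kB, file-rss:0kB")
info = parse_oom_kill(log_line)
print("%s (pid %d): about %.0f MiB resident, %.0f MiB virtual" % (
    info["name"], info["pid"],
    info["anon_rss_kb"] / 1024.0, info["total_vm_kb"] / 1024.0))
```

For this log the killed python process held roughly 817 MiB of resident memory and about 1.3 GiB of virtual memory, which suggests that this one process did not exhaust the worker's memory on its own.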
Marguerite Su <i@marguerite.su> writes:
When compiling data:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 27447 marguer+ 20 0 1050m 1.0g 2596 R 89.9 33.6 4:52.83 python
When finished and writing data:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 27447 marguer+ 20 0 1376m 1.2g 772 D 1.0 41.5 6:10.79 python
I seemed to take percent of memory usage as absolute memory usage...
That doesn't show the peak usage. Try disabling parallel build. The currently running worker has 3GB of RAM and 2GB of swap, which can be tight if each process is using, say, max 2.5GB of virtual memory.
Andreas.
-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
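The arithmetic behind "tight" can be written down as a small back-of-the-envelope helper. It is only a sketch: the function name `max_parallel_jobs` is made up, the 2.5GB per process is Andreas's "say" figure rather than a measurement, and it ignores whatever the rest of the system needs.

```python
def max_parallel_jobs(ram_gb, swap_gb, per_job_gb):
    """Largest number of jobs whose combined peak fits in RAM plus swap."""
    if per_job_gb <= 0:
        raise ValueError("per_job_gb must be positive")
    return int((ram_gb + swap_gb) // per_job_gb)


# Worker described above: 3GB of RAM and 2GB of swap.
# Assumed peak per process: 2.5GB (illustrative figure only).
jobs = max_parallel_jobs(3, 2, 2.5)
print("jobs that fit:", jobs)
print("memory needed for -j4:", 4 * 2.5, "GB")
```

With these figures two jobs fill the worker completely with no headroom, and four parallel jobs would need twice what is available.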
On Sat, Apr 27, 2013 at 7:55 AM, Marguerite Su <i@marguerite.su> wrote:
There might be 1.2g for a single pid.
But it used 2 pids on my dual core laptop.
So is that mean on a 4 core VM it will eat 1.2x4 = 4.8g or 50% more = 7.2g memories?
Ah, this is classic. You have to stop using %{?_smp_mflags} in your make, and force just one worker (use make -j1).
On Sun, Apr 28, 2013 at 2:05 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
Ah, this is classic. You have to stop using %{?_smp_mflags} in your make, and force just one worker (use make -j1).
Yep, so classic. I just use plain `make` and it doesn't even need _constraints.
The problem is solved. Thank you to all the participants.
Marguerite
Claudio Freire <klaussfreire@gmail.com> writes:
I don't think OBS workers have swap.
Yes, they do.
Andreas.
Hi, guys,
I found a workaround:
1. Use _constraints as Martin suggests. (It seems the most memory you can request is 4GB; I increased it to 8GB and the build always stays in the scheduled state.)
2. Limit the threads: use make -j2 instead of make %{?_smp_mflags}, which will be -j4.
Greetings
Marguerite
On Saturday 2013-04-27 17:25, Marguerite Su wrote:
Hi, guys,
I found a workaround:
1. use _constraints as Martin suggests. (seems the most memory you can request is 4GB, I increased it to 8GB, and it'll always hold back at scheduled status.)
2. limit the thread.
use make -j2 instead of make %{?_smp_mflags} which will be -j4.
If you are going to override it, at least use
    make %{?_smp_mflags} -j2
That way, additional options to limit parallelism (like -l) can get used.
On Friday 2013-04-26 19:58, Marguerite Su wrote:
Sure, http://gitorious.org/libkkc
Actually this is a planned feature for Fedora 19. I "stole" it from their idea page. But now I suspect it may be not solid enough to act as their "default" Japanese IME.
I don't think their koji can afford its building. and our fcitx developers have also examined its weakness in code. it just crash.
Why would I, as a user, want kkc over mozc?
On Tue, Apr 30, 2013 at 7:54 PM, Jan Engelhardt <jengelh@inai.de> wrote:
Actually this is a planned feature for Fedora 19. I "stole" it from their idea page. But now I suspect it may be not solid enough to act as their "default" Japanese IME.
I don't think their koji can afford its building. and our fcitx developers have also examined its weakness in code. it just crash.
Why would I, as a user, want kkc over mozc?
Relax, it's not for openSUSE. I was just broadcasting news.
Changing a default IME needs at least the permission of the coordinator of the primary user group, and a hearing on #opensuse-factory. I'm not Japanese, and as far as I know, they have no such plans. I packaged it just to provide diversity, because IMEs for openSUSE Japanese are in short supply.
Below are my personal opinions, not related to the openSUSE community, but as an IM packager and an IM developer's friend (just for sharing):
Technically, I can tell kkc is using an N-gram model and larger statistical data provided by marisa-trie. I don't know the details (I just do minor dev and artwork in the Fcitx project) and didn't test it at all, but it looks like N is better than one or two.
In terms of license, Mozc is open source but developed behind closed doors by Google, so the community can't shape its direction. You can consider it "half open", because the "upstream first" rule doesn't apply. It's therefore not suitable to compare it with a fully open source project.
Consequently Japanese has only one open source IME project still running, Anthy, which is old and buggy; even the Japanese don't use it. (There's another one called SKK, but it seems to have been frozen for a long time.)
So politically, and with maintenance in mind, Red Hat needs a tool that they can control and fix without increasing manpower. If I remember correctly, the developer of KKC is a hired employee/Fedora i18n team member, so it's easier to maintain something written entirely by yourself.
But I can say on my own behalf:
1. Just like the IBus integration for GNOME, outsiders can never finish insiders' work. Someone may be good at coding but not good at using. So the things they develop may just look weird in users' eyes. But, well, the Linux world is open source, you can do anything you want... even treat users as your lab rats. It's ethically unpleasant but in theory doable.
2. They always think of the "benefit to Fedora" and don't take the whole environment into consideration. In short, it's IBus-oriented. The KKC library is buggy (until 0.1.10) if you don't call it the IBus way. That's okay, we can emulate it. But considering that IBus doesn't work well in KDE or even GNOME 3.6+ (it's not considered stable until 3.9+), things might get worse. It may be a very good lib, but its performance on a buggy integration might not be too good. So Fedora people are still fixing it, and the ABI changes frequently, which makes it harder for other platforms to follow and port that good lib. Anyway, changes are good if they are made carefully.
Greetings
Marguerite
participants (5)
- Andreas Schwab
- Claudio Freire
- Jan Engelhardt
- Marguerite Su
- Martin Koegler