[opensuse-packaging] Random "Job seems to be stuck here" build failures
Hi,
I am working on the llvm4 package, which usually takes a very long time to build:
https://build.opensuse.org/package/show/devel:tools:compiler/llvm4
https://build.opensuse.org/package/show/home:michalsrb:branches:devel:tools:compiler/llvm4
I am seeing random but frequent build failures where the build log simply ends (sometimes even in the middle of a line) and, after a long wait, the build is terminated. For example:
...
[14840s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/llvm4-4.0.1-38.5.i386/usr/include/llvm/Transforms/Utils/CtorUtils.h
[14840s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/llvm4-4.0.1-38.5.i386/usr/include/llvm/Transforms/Utils/EscapeEnumerator.h
[14840s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/llvm4-4.0.1-38.5.i386/usr/include/llvm/Transf
Job seems to be stuck here, killed. (after 28800 seconds of inactivity)
It has never happened in a local build. In the build service it happens randomly and in different parts of the build: for example, in the middle of "make install" in the %install section, in the middle of running tests in the %check section, or in the middle of debuginfo extraction. That makes me think that the problem is not in the package but in the build service.
Could there be something broken in the package? If it is a problem in the build service, can I do something to reduce the chance of it happening?
Thanks,
Michal Srb
--
To unsubscribe, e-mail: opensuse-packaging+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-packaging+owner@opensuse.org
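To see at a glance where a killed build stopped, the tail of such a log can be summarized with a small script. The following is only a sketch in Python: it assumes the log uses the "[NNNNs]" timestamp prefix and the "Job seems to be stuck here, killed." message exactly as in the excerpt above, and the function name summarize_stuck_log is hypothetical (it is not part of osc or the build service).

```python
import re

# Both patterns are modelled on the log excerpt quoted in this thread.
TIMESTAMP = re.compile(r"^\[\s*(\d+)s\]")
KILLED = re.compile(
    r"Job seems to be stuck here, killed\. \(after (\d+) seconds of inactivity\)"
)


def summarize_stuck_log(log_text):
    """Return (last_timestamp, last_line, inactivity_seconds).

    Returns None when the log does not contain the "stuck, killed" marker.
    """
    last_ts = None
    last_line = None
    for line in log_text.splitlines():
        killed = KILLED.search(line)
        if killed:
            return last_ts, last_line, int(killed.group(1))
        stamp = TIMESTAMP.match(line)
        if stamp:
            # Remember the most recent timestamped line seen so far.
            last_ts = int(stamp.group(1))
            last_line = line
    return None


# Tail of the log from this thread; the truncated last path is kept as-is.
SAMPLE = """\
[14840s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/llvm4-4.0.1-38.5.i386/usr/include/llvm/Transforms/Utils/EscapeEnumerator.h
[14840s] -- Installing: /home/abuild/rpmbuild/BUILDROOT/llvm4-4.0.1-38.5.i386/usr/include/llvm/Transf
Job seems to be stuck here, killed. (after 28800 seconds of inactivity)
"""

result = summarize_stuck_log(SAMPLE)
print(result)
```

Comparing the last timestamped line across several failed logs shows which phase (%build, %install, %check, debuginfo extraction) each build was in when the output stopped.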
On Thursday, 7 September 2017, 13:23:29 CEST, Michal Srb wrote:
It has never happened in a local build. In the build service it happens randomly and in different parts of the build: for example, in the middle of "make install" in the %install section, in the middle of running tests in the %check section, or in the middle of debuginfo extraction. That makes me think that the problem is not in the package but in the build service.
But not during %build? That would point to an I/O or disk space problem, maybe. Can you detect any pattern in the job history? E.g., does it only fail on lamb7x systems or the like? You may need to require more disk space then ...
Could there be something broken in the package? If it is a problem in the build service, can I do something to reduce the chance of it happening?
Hard to say. You could try a local build using "--vm-type=kvm" to build the way our workers do. Can you reproduce it then? It could also be a kernel bug in the distro used. Does it happen only on distro X, maybe? In the worst case, you need to ping me when you see the build hanging on some worker, and I will trigger a kernel trace to get an idea of why it is hanging ....
--
Adrian Schroeter
email: adrian@suse.de
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
Maxfeldstraße 5, 90409 Nürnberg, Germany
On Thursday, 7 September 2017, 13:37:16 CEST, Adrian Schröter wrote:
It has never happened in a local build. In the build service it happens randomly and in different parts of the build: for example, in the middle of "make install" in the %install section, in the middle of running tests in the %check section, or in the middle of debuginfo extraction. That makes me think that the problem is not in the package but in the build service.
But not during %build?
Sometimes it happens during %build too.
That would point to an I/O or disk space problem, maybe.
The package has a _constraints file that asks for 30 GB of disk space, which should be enough. But yes, maybe it is some I/O problem.
Can you detect any pattern in the job history? E.g., does it only fail on lamb7x systems or the like?
I have only 7 samples right now; I don't know how to get the logs from older builds, if that is even possible. In those 7 samples it failed on SLE_12_SP2, openSUSE_Factory, openSUSE_Leap_42.2 and openSUSE_Leap_42.3. The architectures were x86_64, i586 and armv6l. The build hosts were lamb77, lamb76, lamb74, lamb78 and armbuild15.
Hard to say. You could try a local build using "--vm-type=kvm" to build the way our workers do. Can you reproduce it then?
I tried that and it built correctly. But I tried only once or twice, so I can't tell whether I was just lucky.
In the worst case, you need to ping me when you see the build hanging on some worker, and I will trigger a kernel trace to get an idea of why it is hanging ....
I'll ping you if I catch it happening.
Thanks!
Michal Srb
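The question of whether the failures cluster on one family of workers can be checked mechanically once host names are collected. Below is a rough Python sketch under stated assumptions: the list contains only the five distinct build hosts named in the message above (not one entry per failed build, since the thread does not give that breakdown), and worker_family is a hypothetical helper that simply replaces the last digit of a host name with "x", in the spirit of the "lamb7x" shorthand.

```python
import re
from collections import Counter


def worker_family(host):
    """Replace the final digit of a worker name with 'x' (lamb77 -> lamb7x)."""
    return re.sub(r"\d$", "x", host)


# Distinct build hosts named in the thread; NOT per-failure counts.
hosts = ["lamb77", "lamb76", "lamb74", "lamb78", "armbuild15"]

families = Counter(worker_family(h) for h in hosts)
for family, count in families.most_common():
    print(family, count)
```

With per-failure data from the job history, the same tally would show how many failures each family accounts for, rather than just how many distinct hosts were involved.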
On Sep 07 2017, Michal Srb wrote:
It has never happened in a local build. In the build service it happens randomly and in different parts of the build: for example, in the middle of "make install" in the %install section, in the middle of running tests in the %check section, or in the middle of debuginfo extraction. That makes me think that the problem is not in the package but in the build service.
Could be an OOM situation. Try looking at the resource usage of a successful build. For example, https://build.opensuse.org/package/statistics/devel:tools:compiler/llvm4?arch=x86_64&repository=openSUSE_Factory says it needs 6 GB of memory, but in _constraints only 4 GB are requested.
Andreas.
--
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
On Thursday, 7 September 2017, 13:41:53 CEST, Andreas Schwab wrote:
Could be an OOM situation. Try looking at the resource usage of a successful build.
For example, https://build.opensuse.org/package/statistics/devel:tools:compiler/llvm4?arch=x86_64&repository=openSUSE_Factory says it needs 6 GB of memory, but in _constraints only 4 GB are requested.
Cool, I didn't know about this statistics page. Peak memory usage occurs when linking the main libraries, but the build gets stuck in other, random places. I'll try to increase the limit anyway.
Michal
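For reference, raising the memory limit as discussed would mean changing the memory entry in _constraints. The following Python sketch generates such a file with the standard library. It assumes the usual _constraints layout (a hardware element containing disk and memory, each with a size element carrying a unit attribute); the 30 GB disk and 6 GB memory figures are the ones mentioned in this thread, and build_constraints is a hypothetical helper, not an osc feature.

```python
import xml.etree.ElementTree as ET


def build_constraints(disk_gb, memory_gb):
    """Build a _constraints document requesting disk and memory sizes in GB."""
    root = ET.Element("constraints")
    hardware = ET.SubElement(root, "hardware")
    for name, value in (("disk", disk_gb), ("memory", memory_gb)):
        resource = ET.SubElement(hardware, name)
        size = ET.SubElement(resource, "size", unit="G")
        size.text = str(value)
    return ET.tostring(root, encoding="unicode")


# 30 GB disk as already requested; 6 GB memory per the statistics page.
constraints_xml = build_constraints(disk_gb=30, memory_gb=6)
print(constraints_xml)
```

Whether 6 GB is sufficient on every architecture would still need to be checked against the statistics of successful builds for each repository.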
participants (3)
- Adrian Schröter
- Andreas Schwab
- Michal Srb