[opensuse-buildservice] flapping build
Hi.

I've got a weird issue with one of the OBS jobs which I don't know how to debug. From time to time the build job fails with an "Illegal instruction" error. The code uses tricky SSE optimizations which are heavily dependent on processor features.

The problem is that I do not see any pattern:
- any of the Ubuntu or Debian builds might fail (several or single)
- I'm unable to reproduce this locally
- the build failure is gone the next day without any code changes

Overall this seems like some sort of spooky race condition. If I "trigger rebuild" then the build is back to normal. I guess that it depends on what kind of VM/CPU the build has been scheduled on.

Is there a way to get additional info about the VM on which the build is running?

An example build log is attached but I'm unable to make sense of it.

--
Max Suraev <msuraev@sysmocom.de> http://www.sysmocom.de/
=======================================================================
* sysmocom - systems for mobile communications GmbH
* Alt-Moabit 93
* 10559 Berlin, Germany
* Sitz / Registered office: Berlin, HRB 134158 B
* Geschaeftsfuehrer / Managing Director: Harald Welte
Hey,

On 20.07.2017 11:47, Max wrote:
It fails with "Illegal instruction" error - the code uses tricky SSE optimizations which are heavily dependent on processor features.
Find out which & create a build constraint? :-)

http://openbuildservice.org/help/manuals/obs-reference-guide/cha.obs.build_j...

Henne

--
Henne Vogelsang
http://www.opensuse.org
Everybody has a plan, until they get hit.
- Mike Tyson
--
To unsubscribe, e-mail: opensuse-buildservice+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-buildservice+owner@opensuse.org
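As a rough illustration of the build constraint suggestion, the sketch below uses Python's standard library to generate a `_constraints` document that asks for workers offering certain CPU flags. The constraints/hardware/cpu/flag nesting is my reading of the build job constraints chapter of the OBS reference guide; the helper name and the chosen flags (ssse3, sse4_1) are assumptions for illustration only and would need to be replaced by whatever the SSE code really depends on.

```python
import xml.etree.ElementTree as ET


def make_cpu_flag_constraints(flags):
    """Return a _constraints XML string requiring the given CPU flags.

    Hypothetical helper: the element layout is assumed from the OBS
    reference guide and should be checked against it before use.
    """
    root = ET.Element("constraints")
    hardware = ET.SubElement(root, "hardware")
    cpu = ET.SubElement(hardware, "cpu")
    for name in flags:
        ET.SubElement(cpu, "flag").text = name
    return ET.tostring(root, encoding="unicode")


# Placeholder flag names, not taken from the failing package.
constraints_xml = make_cpu_flag_constraints(["ssse3", "sse4_1"])
print(constraints_xml)
```

The resulting text would be stored as a `_constraints` file next to the package sources, assuming the package-level constraints mechanism described in the guide.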
On Thursday, 20 July 2017, 12:18:44 CEST, Henne Vogelsang wrote:
Hey,
On 20.07.2017 11:47, Max wrote:
It fails with "Illegal instruction" error - the code uses tricky SSE optimizations which are heavily dependent on processor features.
Find out which & create a build constraint? :-)
http://openbuildservice.org/help/manuals/obs-reference-guide/cha.obs.build_job_constraints.html#idm140109221460960
Henne
An additional hint: You can use "osc workerinfo" to find out a bit more about the worker on which the job was running:

## BUILD FAILED:
# osc workerinfo x86_64:build36:1 |grep sse
<flag>sse</flag>
<flag>sse2</flag>
<flag>sse4a</flag>
<flag>misalignsse</flag>

## BUILD WORKED
# osc workerinfo x86_64:cloud117:1 |grep sse
<flag>sse</flag>
<flag>sse2</flag>
<flag>ssse3</flag>
<flag>sse4_1</flag>
<flag>sse4_2</flag>
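To make the difference between the two workers easier to see, here is a minimal Python sketch that parses the two "osc workerinfo ... | grep sse" outputs quoted in this hint and prints which flags each worker lacks. The function and variable names are made up for illustration; the flag lists are copied from the outputs above. Note that "grep sse" only shows flags containing "sse", so other differences (if any) would not appear here.

```python
import re

# Flag lines copied from the two workerinfo outputs quoted in the thread.
FAILED = "<flag>sse</flag> <flag>sse2</flag> <flag>sse4a</flag> <flag>misalignsse</flag>"
WORKED = ("<flag>sse</flag> <flag>sse2</flag> <flag>ssse3</flag> "
          "<flag>sse4_1</flag> <flag>sse4_2</flag>")


def parse_flags(text):
    """Collect the contents of all <flag>...</flag> elements into a set."""
    return set(re.findall(r"<flag>([^<]+)</flag>", text))


failed_flags = parse_flags(FAILED)   # build36, where the build failed
worked_flags = parse_flags(WORKED)   # cloud117, where the build worked

# Flags the working host has but the failing host lacks, and vice versa.
missing_on_failed = sorted(worked_flags - failed_flags)
only_on_failed = sorted(failed_flags - worked_flags)

print("missing on failing worker:", missing_on_failed)
print("only on failing worker:", only_on_failed)
```

The flags reported as missing on the failing worker are the first candidates for a build constraint, or for a runtime feature check in the code itself.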
Looks very useful, thank you. Where do I get this "build36" or "cloud117" from? Also, what does the ":1" at the end refer to?

Basically, how do I gather the parameters for "osc workerinfo" from the failed build log?

On 20.07.2017 12:24, Frank Schreiner wrote:
An additional hint: You can use "osc workerinfo" to find out a bit more about the worker where the job was running on:
## BUILD FAILED:
# osc workerinfo x86_64:build36:1 |grep sse
<flag>sse</flag>
<flag>sse2</flag>
<flag>sse4a</flag>
<flag>misalignsse</flag>

## BUILD WORKED
# osc workerinfo x86_64:cloud117:1 |grep sse
<flag>sse</flag>
<flag>sse2</flag>
<flag>ssse3</flag>
<flag>sse4_1</flag>
<flag>sse4_2</flag>
On Thursday, 20 July 2017, 12:31:32 CEST, Max wrote:
Looks very useful, thank you. Where do I get this "build36" or "cloud117" from? Also, what does the ":1" at the end refer to?
It is:

<arch>:<host>:<vm_num>

In the log you sent, you can find a line like

[ 0s] build36 started "build libosmocore_0.9.6.20170719.dsc" at Wed Jul 19 19:49:50 UTC 2017.

build36 is the host. Here you can find the arch (maybe there is a better solution with osc - I don't know):

https://build.opensuse.org/monitor

Using vm_num=1 is always a good choice. Normally the other VMs on the same host should be the same.
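The lookup described here can be sketched in a few lines of Python: find the "started" line in the build log, take the host name from it, and assemble the `<arch>:<host>:<vm_num>` argument for "osc workerinfo". The regular expression is an assumption based only on the single log line quoted in this message, and the function names are hypothetical; the host architecture still has to be supplied by hand (for example from the monitor page).

```python
import re

# Pattern assumed from the one log line quoted in the thread:
# [ 0s] build36 started "build libosmocore_0.9.6.20170719.dsc" at ...
STARTED_RE = re.compile(r'^\[\s*\d+s\]\s+(\S+) started "build ')


def worker_host_from_log(log_text):
    """Return the host name from the first 'started' line, or None."""
    for line in log_text.splitlines():
        match = STARTED_RE.match(line)
        if match:
            return match.group(1)
    return None


def workerinfo_arg(hostarch, host, vm_num=1):
    """Assemble the <arch>:<host>:<vm_num> argument for 'osc workerinfo'."""
    return f"{hostarch}:{host}:{vm_num}"


log = ('[ 0s] build36 started "build libosmocore_0.9.6.20170719.dsc" '
       'at Wed Jul 19 19:49:50 UTC 2017.')
host = worker_host_from_log(log)
print(workerinfo_arg("x86_64", host))
```

With the log line from this thread, the printed value is the same argument used in the earlier "osc workerinfo" example for the failed build.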
Exactly what I was looking for, thank you.
Is the <arch> the "hostarch=" parameter of the worker obtained from "osc api /worker/_status", or is it the arch of the failed build?

For example, I got a failed i586 build while the x86_64 build was OK (they ran on different workers).

Getting data on the worker of the failed i586 build:

osc api /worker/_status|grep build34
<idle workerid="build34:1" hostarch="x86_64"/>
<idle workerid="build34:2" hostarch="x86_64"/>
<idle workerid="build34:3" hostarch="x86_64"/>
<idle workerid="build34:5" hostarch="x86_64"/>
<idle workerid="build34:6" hostarch="x86_64"/>
<building workerid="build34:4" hostarch="x86_64" project="Kernel:linux-next" repository="standard" package="kernel-vanilla" arch="i586" starttime="1500631045"/>

Getting sse flags:

osc workerinfo x86_64:build34:1 |grep sse
<flag>sse</flag>
<flag>sse2</flag>
<flag>sse4a</flag>
<flag>misalignsse</flag>

but

osc workerinfo x586:build34:1 |grep sse
Server returned an error: HTTP Error 404: remote error unknown worker
remote error: unknown worker

On 20.07.2017 12:43, Frank Schreiner wrote:
It is:
<arch>:<host>:<vm_num>
In the log you sent, you can find a line like
[ 0s] build36 started "build libosmocore_0.9.6.20170719.dsc" at Wed Jul 19 19:49:50 UTC 2017.
build36 is the host. Here you can find the arch (maybe there is a better solution with osc - I don't know)
https://build.opensuse.org/monitor
and vm_num=1 is always a good idea. Normally the other VMs should be the same.
On Friday, 21 July 2017, 13:41:55 CEST, Max wrote:
The <arch> is the "hostarch=" parameter of the worker obtained from "osc api /worker/_status" or it's arch of the failed build?
The hostarch is the architecture of the host ;) It is independent of the build target. i586, but also emulated architectures (e.g. armv6), can be built on an x86_64 host.
For example I got build for i586 failed while x86_64 build is ok (were on different workers).
Getting data on the worker of failed i586 build:
osc api /worker/_status|grep build34
<idle workerid="build34:1" hostarch="x86_64"/>
<idle workerid="build34:2" hostarch="x86_64"/>
<idle workerid="build34:3" hostarch="x86_64"/>
<idle workerid="build34:5" hostarch="x86_64"/>
<idle workerid="build34:6" hostarch="x86_64"/>
<building workerid="build34:4" hostarch="x86_64" project="Kernel:linux-next" repository="standard" package="kernel-vanilla" arch="i586" starttime="1500631045"/>
Getting sse flags:
osc workerinfo x86_64:build34:1 |grep sse
<flag>sse</flag>
<flag>sse2</flag>
<flag>sse4a</flag>
<flag>misalignsse</flag>
but
osc workerinfo x586:build34:1 |grep sse
Server returned an error: HTTP Error 404: remote error unknown worker
remote error: unknown worker
Right, there is no i586 worker host, only an x86_64 one, which can also run i586 builds either by booting a 32-bit kernel or via a personality switch. However, it is then up to the kernel and CPU to offer the optimizations in 32-bit legacy mode. I suppose that at least sse4a won't be available there. But from the POV of OBS this is all content; we only offer the VM here.
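Since the worker is always addressed by its host architecture and not by the build target, the argument for "osc workerinfo" can be derived from the "hostarch" attribute in the "osc api /worker/_status" output. The Python sketch below does that for the entries quoted in this thread. The wrapping root element name (`workerstatus`) and the function name are assumptions; only the `idle` and `building` entries and their attributes are taken from the quoted output.

```python
import xml.etree.ElementTree as ET

# Entries copied from the thread; the <workerstatus> root is an assumption.
SAMPLE = """<workerstatus>
  <idle workerid="build34:1" hostarch="x86_64"/>
  <idle workerid="build34:2" hostarch="x86_64"/>
  <building workerid="build34:4" hostarch="x86_64"
            project="Kernel:linux-next" repository="standard"
            package="kernel-vanilla" arch="i586" starttime="1500631045"/>
</workerstatus>"""


def workerinfo_ids(status_xml, host):
    """Return '<hostarch>:<workerid>' strings for all workers on one host."""
    root = ET.fromstring(status_xml)
    ids = []
    for elem in root.iter():
        workerid = elem.get("workerid", "")
        hostarch = elem.get("hostarch")
        # workerid has the form <host>:<vm_num>; match on the host part only.
        if hostarch and workerid.split(":")[0] == host:
            ids.append(f"{hostarch}:{workerid}")
    return sorted(ids)


ids = workerinfo_ids(SAMPLE, "build34")
print(ids)
```

Note that the `building` entry has arch="i586" but is still listed under x86_64, which matches the explanation above: the build target does not change the worker's identifier.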
--
Adrian Schroeter
email: adrian@suse.de
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
Maxfeldstraße 5
90409 Nürnberg
Germany
Excellent advice, thank you. Is there some sort of negation operator for those constraints?

I mean, setting a bunch of flags would most likely fix our build, but it would be nice to fix our code instead, that is, to find which particular combination of present/absent flags causes the build failure. Is there a way to constrain a build job to workers _without_ the cpu flag sse2, for example?

On 20.07.2017 12:18, Henne Vogelsang wrote:
Find out which & create a build constraint? :-)
http://openbuildservice.org/help/manuals/obs-reference-guide/cha.obs.build_j...
participants (4)
- Adrian Schröter
- Frank Schreiner
- Henne Vogelsang
- Max