[Bug 974419] New: gdb transient build hang in gdb-orphanripper on Ring:1 for ppc64le
http://bugzilla.suse.com/show_bug.cgi?id=974419

Bug ID: 974419
Summary: gdb transient build hang in gdb-orphanripper on Ring:1 for ppc64le
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: PowerPC-64
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Basesystem
Assignee: bnc-team-screening@forge.provo.novell.com
Reporter: normand@linux.vnet.ibm.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

gdb transient build hang in gdb-orphanripper on Ring:1 for ppc64le

For a while now, the gdb package build has been hanging in openSUSE:Factory:Rings:1-MinimalX, with a log like the attached gdb_obs_ring1_ppc64le_hung.log.gz.

I was able to recreate the hang condition in a ppc64le KVM guest and capture the process tree, as detailed in the attached gdb_twppc64le_hang_while_loop.log.gz.

At the time of the hang, gdb-orphanripper seems to be hung with a defunct child:
===
 3517 ? Ss 0:00 | \_ rpmbuild -ba --define _srcdefattr (-,root,root) --nosignature --define _build_create_debug 1 /home/abuild/rpmbuild/SOURCES/gdb.spec
31941 ? S 0:00 | \_ /bin/sh -e /var/tmp/rpm-tmp.NTNSMQ
31947 ? S 0:00 | \_ /bin/sh -e /var/tmp/rpm-tmp.NTNSMQ
31972 ? S 0:00 | \_ ./orphanripper make -j8 -k check//unix/-m64 check//unix/-m64/-fPIC/-pie
31973 ? Zs 0:00 | \_ [make] <defunct>
===

--
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.suse.com/show_bug.cgi?id=974419#c1
--- Comment #1 from Michel Normand
http://bugzilla.suse.com/show_bug.cgi?id=974419#c2
--- Comment #2 from Michel Normand
http://bugzilla.suse.com/show_bug.cgi?id=974419#c3
--- Comment #3 from Michel Normand
http://bugzilla.suse.com/show_bug.cgi?id=974419#c4
--- Comment #4 from Michel Normand
Dinar Valeev
Chenzi Cao
http://bugzilla.suse.com/show_bug.cgi?id=974419#c5
Michel Normand
Tests are pending with the updated version of gdb-orphanripper.c from http://pkgs.fedoraproject.org/cgit/rpms/gdb.git/log/gdb-orphanripper.c

This code change is not sufficient, but the hang condition now seems to differ: an expect process hangs.
===
...
22391 pts/2 S+ 0:00 | \_ make check-parallel
22458 pts/2 S+ 0:00 | \_ /bin/sh -c make -k do-check-parallel; \
    /bin/sh /home/abuild/rpmbuild/BUILD/gdb-7.10.1/gdb/testsuite/dg-extract-results.sh \
    `find outputs -name gdb.sum -print` > gdb.sum; \
    /bin/sh /home/abuild/rpmbuild/BUILD/gdb-7.10.1/gdb/testsuite/dg-extract-results.sh -L \
    `find outputs -name gdb.log -print` > gdb.log
22476 pts/2 S+ 0:00 | \_ make -k do-check-parallel
23273 pts/2 S+ 0:00 | \_ /bin/sh -c rootme=`pwd`; export rootme; srcdir=/home/abuild/rpmbuild/BUILD/gdb-7.10.1/gdb/testsuite ; export srcdir ; EXPECT=`if [ "${READ1}" != "" ] ; then echo ${rootme}/expect-read1; elif [ -f ${rootme}/../../expect/expect ] ; then echo ${rootme}/../../expect/expect ; else echo expect ; fi` ; export EXPECT ; EXEEXT= ; export EXEEXT ; LD_LIBRARY_PATH=$rootme/../../expect:$rootme/../../libstdc++:$rootme/../../tk/unix:$rootme/../../tcl/unix:$rootme/../../bfd:$rootme/../../opcodes:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; if [ -f ${rootme}/../../expect/expect ] ; then TCL_LIBRARY=${srcdir}/../../tcl/library ; export TCL_LIBRARY ; fi ; runtest GDB_PARALLEL=yes --outdir=outputs/gdb.reverse/step-reverse gdb.reverse/step-reverse.exp --target_board=unix/-m32
23276 pts/2 S+ 0:00 | \_ expect -- /usr/share/dejagnu/runtest.exp GDB_PARALLEL=yes --outdir=outputs/gdb.reverse/step-reverse gdb.reverse/step-reverse.exp --target_board=unix/-m32
===

Note that the problem, initially identified on the ppc64le architecture, can also be recreated on an x86_64 guest, as this trace shows.
http://bugzilla.suse.com/show_bug.cgi?id=974419#c6
--- Comment #6 from Michel Normand
http://bugzilla.suse.com/show_bug.cgi?id=974419#c8
--- Comment #8 from Michel Normand
> I suppose this does not reproduce locally?

Locally? Do you mean without osc?

> The log tells you the forked make process is still running, and as orphanripper is invoked without a timeout it waits until that process terminates.
> I'm quite sure this is a bug with the VM environment. What happens if you remove the use of orphanripper? (I am not sure whether the reason RH uses it also applies to us; we simply copied Fedora's .spec file.)
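
To illustrate the point about orphanripper being invoked without a timeout, here is a minimal sketch of what a bounded wait on a child could look like. It is not a description of gdb-orphanripper.c; the function name `run_with_timeout` and the choice of SIGKILL are assumptions made for this example only.

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd in its own session and wait at most timeout_s seconds.

    Returns the exit status, or None if the command had to be killed.
    (Hypothetical helper, not taken from gdb-orphanripper.c.)
    """
    # start_new_session=True makes the child a session and process-group
    # leader, so its pid can be used as the process-group id below.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Kill the whole group so that descendants (make, runtest, expect)
        # that stayed in the group are not left behind.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the child so it does not stay <defunct>
        return None
```

With a wrapper of this kind a hung test run would end the build with an error after `timeout_s` seconds instead of blocking it indefinitely. It would only hide the hang, not explain it.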
Yesterday I tried osc build in a ppc64le guest without orphanripper in the spec file, and the build hung on the second trial, with the same process tree as in comment #5 and comment #6 on the x86_64 guest (with orphanripper):
===
[michel@twppc64le:~/work/openSUSE:Factory:Rings:1-MinimalX/gdb]
$ idx=1; while test 1; do echo "=== trial $idx"; osc build >/tmp/x || break; ((idx++)); done
=== trial 1
cat: /proc/device-tree/ibm,partition-name: No such file or directory
=== trial 2
cat: /proc/device-tree/ibm,partition-name: No such file or directory
===
ps axf
...
\_ make check-parallel
\_ /bin/sh -c make -k do-check-parallel; \
    /bin/sh /home/abuild/rpmbuild/BUILD/gdb-7.10.1/gdb/testsuite/dg-extract-results.sh \
    `find outputs -name gdb.sum -print` > gdb.sum; \
    /bin/sh /home/abuild/rpmbuild/BUILD/gdb-7.10.1/gdb/testsuite/dg-extract-results.sh -L \
    `find outputs -name gdb.log -print` > gdb.log
\_ make -k do-check-parallel
\_ /bin/sh -c rootme=`pwd`; export rootme; srcdir=/home/abuild/rpmbuild/BUILD/gdb-7.10.1/gdb/testsuite ; export srcdir ; EXPECT=`if [ "${READ1}" != "" ] ; then echo ${rootme}/expect-read1; elif [ -f ${rootme}/../../expect/expect ] ; then echo ${rootme}/../../expect/expect ; else echo expect ; fi` ; export EXPECT ; EXEEXT= ; export EXEEXT ; LD_LIBRARY_PATH=$rootme/../../expect:$rootme/../../libstdc++:$rootme/../../tk/unix:$rootme/../../tcl/unix:$rootme/../../bfd:$rootme/../../opcodes:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; if [ -f ${rootme}/../../expect/expect ] ; then TCL_LIBRARY=${srcdir}/../../tcl/library ; export TCL_LIBRARY ; fi ; runtest GDB_PARALLEL=yes --outdir=outputs/gdb.multi/multi-arch-exec gdb.multi/multi-arch-exec.exp --target_board=unix/-m64
\_ expect -- /usr/share/dejagnu/runtest.exp GDB_PARALLEL=yes --outdir=outputs/gdb.multi/multi-arch-exec gdb.multi/multi-arch-exec.exp --target_board=unix/-m64
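
The reproduction loop above blocks forever once a trial hangs, so the process tree has to be captured by hand. A possible variant, sketched here under stated assumptions, runs each trial with a time limit and records a snapshot when the limit is exceeded. The function name `repeat_until_hang`, the time limit, and the use of `ps axf` as the default snapshot command are choices made for this example, not part of the original test.

```python
import subprocess


def repeat_until_hang(cmd, timeout_s, max_trials, snapshot_cmd=("ps", "axf")):
    """Run cmd repeatedly until one trial exceeds timeout_s.

    Returns (trial_number, snapshot_output) for the first hung trial,
    or (None, "") if no trial hung. (Hypothetical helper.)
    """
    for trial in range(1, max_trials + 1):
        print(f"=== trial {trial}")
        proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL)
        try:
            status = proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # Take the snapshot while the hung processes are still alive.
            snapshot = subprocess.run(
                list(snapshot_cmd), capture_output=True, text=True
            ).stdout
            # Only the direct child is killed here; its descendants may
            # survive and would need separate cleanup.
            proc.kill()
            proc.wait()
            return trial, snapshot
        if status != 0:
            break  # same effect as `|| break` in the shell loop
    return None, ""
```

For the case described here the call might look like `repeat_until_hang(["osc", "build"], timeout_s=4 * 3600, max_trials=20)`, where the four-hour limit is only a guess at a value safely above a normal build time.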