[Bug 1040589] New: bash/gcc/gzip/python differ between builds because of profiling
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589

Bug ID: 1040589
Summary: bash/gcc/gzip/python differ between builds because of profiling
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: openSUSE 13.2
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Basesystem
Assignee: rguenther@suse.com
Reporter: bwiedemann@suse.com
QA Contact: qa-bugs@suse.de
CC: matz@suse.com
Found By: Development
Blocker: ---

In https://build.opensuse.org/project/prjconf/openSUSE:Factory we have
%do_profiling 1
and because of that in our bash.spec we enable gcc's 'profile feedback directed optimizations'
but that causes the jobs.o and resulting bash binary to differ between builds, even when running on the same build host.
And because of that, build-compare always thinks there is a change and triggers a re-publish and rebuild of depending packages

We also have such binary diffs in gcc6 and gcc6.spec calls a
make profiledbootstrap

The do_profiling macro is used in
bash gzip hello python3-base python-base sed xz

and in
http://rb.zq1.de/compare.factory-20170523/bash-compare.out
http://rb.zq1.de/compare.factory-20170523/gcc6-compare.out
http://rb.zq1.de/compare.factory-20170523/gzip-compare.out
http://rb.zq1.de/compare.factory-20170523/python-base-compare.out
http://rb.zq1.de/compare.factory-20170523/python3-base-compare.out
we have strange diffs in assembler that I could not trace down to other sources of non-determinism until today. Diffs did go away when building without profiling (it was harder to disable for gcc6 and bash though)

Do the profiles just count invocations of functions or do they depend on the type and speed of the system?
In the first case, it should be possible to fix the profiling runs to be deterministic, but for that it would be useful to be able to see the differences between runs. How could I diff gcc's .gcda files?

--
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589#c1
--- Comment #1 from Dr. Werner Fink
> In https://build.opensuse.org/project/prjconf/openSUSE:Factory we have %do_profiling 1
> and because of that in our bash.spec we enable gcc's 'profile feedback directed optimizations'
> but that causes the jobs.o and resulting bash binary to differ between builds, even when running on the same build host.
> And because of that, build-compare always thinks there is a change and triggers a re-publish and rebuild of depending packages
> We also have such binary diffs in gcc6 and gcc6.spec calls a make profiledbootstrap
> The do_profiling macro is used in bash gzip hello python3-base python-base sed xz
> and in
> http://rb.zq1.de/compare.factory-20170523/bash-compare.out
> http://rb.zq1.de/compare.factory-20170523/gcc6-compare.out
> http://rb.zq1.de/compare.factory-20170523/gzip-compare.out
> http://rb.zq1.de/compare.factory-20170523/python-base-compare.out
> http://rb.zq1.de/compare.factory-20170523/python3-base-compare.out
> we have strange diffs in assembler that I could not trace down to other sources of non-determinism until today. Diffs did go away when building without profiling (it was harder to disable for gcc6 and bash though)
> Do the profiles just count invocations of functions or do they depend on the type and speed of the system?
> In the first case, it should be possible to fix the profiling runs to be deterministic, but for that it would be useful to be able to see the differences between runs. How could I diff gcc's .gcda files?
That is your problem, not mine ... do not touch the bash test suite!
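On the question in the report of how to diff gcc's .gcda files: one possible approach is to render each file as text with GCC's gcov-dump tool and compare the text. The following is only a minimal sketch under assumptions. It assumes gcov-dump is installed and that its -l (--long) option prints the record contents; the file paths in the usage note are hypothetical, and the path is masked in the output on the assumption that the dump may embed it.

```python
import difflib
import subprocess


def dump_gcda(path):
    """Return gcov-dump's text rendering of one .gcda file as a list of lines."""
    # Assumption: GCC's gcov-dump is on PATH; -l (--long) also dumps record contents.
    result = subprocess.run(
        ["gcov-dump", "-l", str(path)],
        check=True, capture_output=True, text=True,
    )
    # Mask the input path (assumption: the dump may embed it) so that dumps
    # taken from two different build directories can line up.
    return [line.replace(str(path), "<gcda>") for line in result.stdout.splitlines()]


def diff_dumps(lines_a, lines_b, name_a="run1", name_b="run2"):
    """Unified diff of two text dumps; an empty list means they are identical."""
    return list(difflib.unified_diff(
        lines_a, lines_b, fromfile=name_a, tofile=name_b, lineterm=""))


def diff_gcda(path_a, path_b):
    """Dump two .gcda files and return the unified diff of the dumps."""
    return diff_dumps(dump_gcda(path_a), dump_gcda(path_b), str(path_a), str(path_b))
```

With two profiling runs kept in separate directories, something like print("\n".join(diff_gcda("run1/jobs.gcda", "run2/jobs.gcda"))) would show which records changed between the runs. Whether the differing records point at call counts or at something system-dependent still has to be judged by hand.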
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589
Bernhard Wiedemann
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589#c6
--- Comment #6 from Bernhard Wiedemann
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589#c9
Martin Liška
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589#c10
Martin Liška
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589#c11
Martin Liška
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589#c43
Bernhard Wiedemann
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589#c44
--- Comment #44 from Andreas Stieger
> bison-3.7.3 started to vary from PGO.
Can you help me understand if this is a (new) problem in 3.7.3, and give the steps to compare the binaries? Last I found was:
http://rb.zq1.de/compare.factory-20200430/bison-compare.out
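The thread does not spell out the comparison steps asked for here. As a hedged illustration only, a first step could be to locate the byte ranges where two builds of the same binary differ, before disassembling those regions. The helper name below is hypothetical and this is not the method used to produce the compare.out files linked above.

```python
def differing_ranges(data_a, data_b):
    """Return half-open (start, end) byte ranges where two buffers differ."""
    ranges = []
    start = None
    common = min(len(data_a), len(data_b))
    for i in range(common):
        if data_a[i] != data_b[i]:
            if start is None:
                start = i  # a differing run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended just before i
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(data_a) != len(data_b):
        # Everything past the shorter buffer counts as different.
        ranges.append((common, max(len(data_a), len(data_b))))
    return ranges
```

Applied to the contents of two builds, for example differing_ranges(Path("build1/bison").read_bytes(), Path("build2/bison").read_bytes()) with hypothetical paths and pathlib.Path imported, the offsets give a rough idea of how widespread the variation is.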
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589#c45
--- Comment #45 from Bernhard Wiedemann
I found my debugging note from/before 2020-06-29:
PGO again via ASLR, filesys, date
https://build.opensuse.org/request/show/676711 made it reproducible back then and looking closer at old results, 3.5.2 was the first one marked to differ
https://rb.zq1.de/compare.factory-20200303/bison-compare.out
Here is the general description of the problem of PGO with reproducibility:
https://github.com/bmwiedemann/theunreproduciblepackage/tree/master/pgo
Would disabling PGO for bison be an option?
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589#c46
--- Comment #46 from Martin Pluskal
> I found my debugging note from/before 2020-06-29:
> PGO again via ASLR, filesys, date
> https://build.opensuse.org/request/show/676711 made it reproducible back then and looking closer at old results, 3.5.2 was the first one marked to differ
> https://rb.zq1.de/compare.factory-20200303/bison-compare.out
> Here is the general description of the problem of PGO with reproducibility:
> https://github.com/bmwiedemann/theunreproduciblepackage/tree/master/pgo
> Would disabling PGO for bison be an option?
No! I do not wish to damage speed and/or optimized size of binaries because of this.
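The debugging note quoted above names ASLR, the filesystem and the date as inputs that leak into the profile. If PGO is to stay enabled, one direction is to pin some of those inputs during the training run. The sketch below is an assumption-laden illustration, not a fix confirmed in this bug: it wraps a hypothetical training command in setarch -R (util-linux, disables address space randomization for the child) and fixes a few environment variables. It does not address filesystem ordering, and SOURCE_DATE_EPOCH only affects tools that read it.

```python
import os
import platform


def pinned_training_command(workload, epoch=1):
    """Build (argv, env) for a training run with ASLR off and a fixed date/locale."""
    # setarch <arch> -R runs the workload with address space randomization disabled.
    argv = ["setarch", platform.machine(), "-R", *workload]
    env = dict(os.environ)
    env["SOURCE_DATE_EPOCH"] = str(epoch)  # honoured only by tools that read it
    env["TZ"] = "UTC"                      # avoid timezone-dependent output
    env["LC_ALL"] = "C"                    # avoid locale-dependent sorting/output
    return argv, env
```

A spec file's profiling step could then be run as subprocess.run(argv, env=env, check=True), where the workload (for example ["make", "check"]) is a placeholder for whatever the package uses to generate its profile.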
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589#c47
--- Comment #47 from Martin Liška
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589#c48
--- Comment #48 from Bernhard Wiedemann
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589#c57
--- Comment #57 from OBSbugzilla Bot
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589
Bernhard Wiedemann
http://bugzilla.opensuse.org/show_bug.cgi?id=1040589

Bug 1040589 depends on bug 1197575, which changed state.

Bug 1197575 Summary: hello varies from PGO and date
http://bugzilla.opensuse.org/show_bug.cgi?id=1197575

What       |Removed     |Added
----------------------------------------------------------------------------
Status     |IN_PROGRESS |RESOLVED
Resolution |---         |FIXED