Michael Matz wrote:
On Fri, 11 Dec 2015, Adam Spiers wrote:
Is anyone working on (or thinking of working on) making our build process reproducible?
We have that since about forever as far as easily possible. The hard part is changing packages to not depend on things like build time (e.g. encoding build date/time into strings into executables). That's not something you can do generally in a build system, but must be changed in each and every individual package.
It's probably worth to explain the difference between our reproducible builds and this new interpretation. SUSE and openSUSE distributions have always had reproducible builds, for something like 20 years now. Reproducible in the sense that a packager never builds binaries on his own system in some magic way and then uploads binaries. […]
Thanks Ludwig for your explanations, as we unfortunately use some common words to convey different ideas. I want to try to elaborate on them so we can better collaborate between projects, as we (in Tor, Debian and many other projects) strongly believe that these issues concern all free software projects.
I believe we have trouble understanding one another because, from my understanding, we have been working on solving different problems:
The problem openSUSE has solved with OBS is “Can a piece of software be rebuilt from source?”
The problem Bitcoin and Tor Browser have solved is “Can users verify that the binary has been built using the source distributed with it?” That's the problem which prompted efforts in Debian and other projects, and which led to the creation of reproducible-builds.org.
From the point of view of an openSUSE developer, it might look the same. As only source code can be submitted to OBS, as long as one trusts the infrastructure, one can be confident that the binary produced by OBS comes from the source code that lands in the archive.
But end users might want to verify by themselves that binaries match source code they can audit. When a build process doesn't produce bit-for-bit identical results, this becomes much harder.
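To make that verification step concrete, here is a minimal sketch in Python of what a user could do once builds are bit-for-bit identical: rebuild from the audited source, then compare a cryptographic hash of the result with the distributed binary. The function names and file paths are hypothetical and only serve as illustration; this is not how any particular project implements its verification.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large packages do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_bit_identical(distributed_path, rebuilt_path):
    """True if both files have the same content, byte for byte."""
    return sha256_of(distributed_path) == sha256_of(rebuilt_path)
```

If a single embedded build date differs between the two files, the digests differ and the comparison tells the user nothing about whether the source matches, which is why non-deterministic builds make independent verification so much harder.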
Having the ability to have third parties assess binary packages also relieves the infrastructure maintainers of potential pressure. It's easier to resist by saying “if we change that binary, someone is going to notice”.
The mechanism we have in place to detect unchanged binary rpms (build-compare) only partially gets us there as it solves a different problem. build-compare is basically a heuristic to avoid triggering rebuilds of other packages and publishing of packages with only trivial changes (like time stamps). It does that by "normalizing" some content and throwing away results with trivial changes. It does not necessarily prevent trivial changes. Some of the measurements to please build-compare are also useful for bit identical results though, like eliminating time stamps in build results in the first place.
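The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration of the idea of "normalizing" before comparing, not build-compare's actual logic (which is a shell script with many more rules); the timestamp pattern and function names below are assumptions made up for the example.

```python
import re

# Hypothetical pattern: timestamps shaped like "2015-12-11 10:42:07".
TIMESTAMP = re.compile(rb"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}")


def normalize(data):
    """Replace every timestamp-like byte sequence with a fixed placeholder."""
    return TIMESTAMP.sub(b"<TIMESTAMP>", data)


def same_after_normalizing(old, new):
    """Heuristic check: equal once trivial differences are masked out."""
    return normalize(old) == normalize(new)


def bit_identical(old, new):
    """Strict check: equal byte for byte, with nothing masked out."""
    return old == new
```

Two build results that differ only in an embedded build time pass the first check but fail the second. The heuristic is enough to decide that a rebuild need not be published, but only the strict check lets a third party confirm that a binary matches its source.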
Speaking of build-compare, the tool that we built to debug differences between two builds, diffoscope, is getting wider exposure and becoming quite versatile. It's in Python and designed to be as modular as possible. The hope was to have something more maintainable and extensible than a single shell script.
One feature diffoscope is missing to become a proper alternative to build-compare is the ability to filter differences. It's on my list of things to implement, and I'll eventually get to it, but contributions would be great if anyone is interested.
In any case, don't hesitate to get in touch. We, the people behind the Debian reproducible builds effort and reproducible-builds.org, want to help! :)