Welcome to the March 2022 report from the Reproducible Builds project! In our monthly reports we outline the most important things that we have been up to over the past month.
The in-toto project was accepted as an “incubating project” within the Cloud Native Computing Foundation (CNCF). in-toto is a framework that protects the software supply chain by collecting and verifying relevant data. It does so by enabling libraries to collect information about software supply chain actions and then allowing software users and/or project managers to publish policies about software supply chain practices that can be verified before deploying or installing software. The CNCF hosts a number of critical components of the global technology infrastructure under the auspices of the Linux Foundation. (View full announcement.)
Hervé Boutemy posted to our mailing list with an announcement that the Java Reproducible Central has hit the milestone of “500 fully reproduced builds of upstream projects”. Indeed, at the time of writing, according to the nightly rebuild results, 530 releases were found to be fully reproducible, with 100% reproducible artifacts.
GitBOM is a relatively new project to enable build tools to trace every source file that is incorporated into build artifacts. As an experiment and/or proof-of-concept, the GitBOM developers are rebuilding Debian to generate side-channel build metadata for versions of Debian that have already been released. This only works because Debian is (partially) reproducible, so one can be sure that, in the cases where build artifacts are identical, any metadata generated during these instrumented builds applies to the binaries that were built and released in the past. More information on their approach is available in the README file in the bomsh repository.
Ludovic Courtes has published an academic paper discussing how the performance requirements of high-performance computing are not (as usually assumed) at odds with reproducible builds. The received wisdom is that vendor-specific libraries and platform-specific CPU extensions have resulted in a culture of local recompilation to ensure the best performance, rendering the property of reproducibility unobtainable or even meaningless. In his paper, Ludovic explains how Guix has:
[…] implemented what we call “package multi-versioning” for C/C++ software that lacks function multi-versioning and run-time dispatch […]. It is another way to ensure that users do not have to trade reproducibility for performance. (full PDF)
Kit Martin published a post on the FOSSA blog titled The Three Pillars of Reproducible Builds. Inspired by the “shock of infiltrated or intentionally broken NPM packages, supply chain attacks, long-unnoticed backdoors”, the post goes on to outline the high-level steps that lead to a reproducible build:
It is one thing to talk about reproducible builds and how they strengthen software supply chain security, but it’s quite another to effectively configure a reproducible build. Concrete steps for specific languages are a far larger topic than can be covered in a single blog post, but today we’ll be talking about some guiding principles when designing reproducible builds. […]
The article was discussed on Hacker News.
Finally, Bernhard M. Wiedemann noticed that the GNU Helloworld project varies depending on whether it is being built during a full moon! (Reddit announcement, openSUSE bug report)
Events
There will be an in-person “Debian Reunion” in Hamburg, Germany later this year, taking place from 23 to 30 May. Although this is a “Debian” event, there will be some folks from the broader Reproducible Builds community and, of course, everyone is welcome. Please see the event page on the Debian wiki for more information.
Bernhard M. Wiedemann posted to our mailing list about a meetup for Reproducible Builds folks at the openSUSE conference in Nuremberg, Germany.
It was also recently announced that DebConf22 will take place this year as an in-person conference in Prizren, Kosovo. The pre-conference meeting (or “Debcamp”) will take place from 10 to 16 July, and the main talks, workshops, etc. will take place from 17 to 24 July.
Misc news
Holger Levsen updated the Reproducible Builds website to improve the documentation for the SOURCE_DATE_EPOCH environment variable, both by expanding parts of the existing text […][…] and by clarifying meaning by removing text in other places […]. In addition, Chris Lamb added a Twitter Card to our website’s metadata […][…][…].
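For readers unfamiliar with the variable, the following is a minimal sketch of how a build tool might honour SOURCE_DATE_EPOCH when it needs to embed a timestamp. The function name `build_timestamp` is hypothetical and chosen purely for illustration; the only thing taken from the specification is that the variable holds an integer number of seconds since the Unix epoch.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ):
    """Return the UTC timestamp a build should embed in its output.

    `build_timestamp` is a hypothetical helper, not part of any real tool.
    """
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is not None:
        # SOURCE_DATE_EPOCH is an integer count of seconds since the Unix epoch.
        seconds = int(raw)
    else:
        # Fallback to the current time; output built this way will vary
        # from one build to the next.
        seconds = int(time.time())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


if __name__ == "__main__":
    print(build_timestamp().strftime("%Y-%m-%d %H:%M:%S UTC"))
```

Always converting in UTC, rather than the local timezone, avoids introducing a second source of variation between build machines.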
On our mailing list this month:
- Early in the month, Mattia Rizzolo posted to our mailing list asking for early thoughts about running an in-person Reproducible Builds event later in 2022.
- Chris Lamb then posted to our mailing list with a call for “real-world instances” where reproducibility practices have flagged something legitimately “bad”. Although no public, concrete examples were cited, the resulting discussion was interesting and wide-ranging.
- Marc Haber also posted to our mailing list with an interesting problem where building Debian packages with libfaketime yields different results when run outside of libfaketime.
Distribution work
In Debian this month:
- Johannes Schauer Marin Rodrigues posted to the debian-devel list mentioning that he exploited the property of reproducibility within Debian to demonstrate that automatically converting a large number of packages to a new internal “source version” did not change the resulting packages. The proposed change could therefore be applied without causing breakage:
So now we have 364 source packages for which we have a patch and for which we can show that this patch does not change the build output. Do you agree that with those two properties, the advantages of the 3.0 (quilt) format are sufficient such that the change shall be implemented at least for those 364? […]
- 144 reviews of Debian packages were added, 241 were updated and 20 were removed this month, significantly adding to our knowledge about identified issues. A number of issue types were updated too, including buildpath_in_postgres_opcodes, captures_kernel_version_via_CMAKE_SYSTEM, build_id_differences_only, etc.
- Lukas Puehringer updated both the python-securesystemslib package to version 0.22.0-1 and the in-toto package to 1.2.0-1.
In openSUSE, Bernhard M. Wiedemann posted his usual monthly reproducible builds status report.
Tooling
diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb prepared and uploaded versions 207, 208 and 209 to Debian unstable, and also made the following changes to the code itself:
- Updated the minimum version of Black to prevent a test failure on Ubuntu jammy. […]
- Updated the R test fixture for the 4.2.x series of the R programming language. […]
Brent Spillner also worked on adding graceful handling for UNIX sockets and named pipes to diffoscope. […][…][…]. Vagrant Cascadian also updated the diffoscope package in GNU Guix. […][…]
reprotest is the Reproducible Builds project’s end-user tool to build the same source code twice in widely different environments and then check whether the binaries produced by the builds have any differences. This month, Santiago Ruano Rincón added a new --append-build-command option […], which was subsequently uploaded to Debian unstable by Holger Levsen.
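The “build twice, then compare” idea can be illustrated with a short sketch. This is not how reprotest itself is implemented; it only assumes that two builds have already written their outputs to two separate directories, and the function names `digest_tree` and `differing_artifacts` are hypothetical.

```python
import hashlib
from pathlib import Path


def digest_tree(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def differing_artifacts(first_build, second_build):
    """Return the sorted relative paths that differ between two build outputs.

    A path is reported if its contents differ or if it exists on one side only.
    """
    first, second = digest_tree(first_build), digest_tree(second_build)
    return sorted(
        path for path in first.keys() | second.keys()
        if first.get(path) != second.get(path)
    )
```

An empty result means the two outputs are bit-for-bit identical; any reported path is a candidate for closer inspection with a tool such as diffoscope.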
Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
- Bernhard M. Wiedemann:
- Huong Nguyenthi:
- Chris Lamb:
  - #1007232 filed against python-ara.
  - #1007757 filed against nbformat.
  - #1007760 filed against chemical-structures.
  - #1007908 filed against fiat.
- Vagrant Cascadian:
  - #1006844 filed against intel-mediasdk.
  - #1006858 filed against libao.
  - #1006860 & #1006861 filed against pcp.
  - #1006863 filed against tevent.
  - #1006864 filed against pcp.
  - #1006865 filed against apr-util.
  - #1006979 filed against liggghts.
  - #1007094 filed against kristall.
  - #1007095 filed against lmod.
  - #1007137 filed against libranlip.
  - #1007184 filed against xrt.
  - #1007185 filed against btrfsmaintenance.
Testing framework
The Reproducible Builds project runs a significant testing framework at tests.reproducible-builds.org, to check packages and other artifacts for reproducibility. This month, the following changes were made:
- Holger Levsen:
  - Replace a local copy of the dsa-check-running-kernel script with a packaged version. […]
  - Don’t hide the status of offline hosts in the Jenkins shell monitor. […]
  - Detect undefined service problems in the node health check. […]
  - Update the sources.lst file for our mail server as it’s still running Debian buster. […]
  - Add our mail server to our node inventory so it is included in the Jenkins maintenance processes. […]
  - Remove the debsecan package everywhere; it got installed accidentally via the Recommends relation. […]
  - Document the usage of the osuosl174 host. […]
Regular node maintenance was also performed by Holger Levsen […], Vagrant Cascadian […][…][…] and Mattia Rizzolo.
If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. You can also get in touch with us via:
- IRC: #reproducible-builds on irc.oftc.net.
- Twitter: @ReproBuilds
- Mailing list: [email protected]