Reproducible Builds in January 2023

View all our monthly reports


Welcome to the first report for 2023 from the Reproducible Builds project!

In these reports we try to outline the most important things that we have been up to over the past month, as well as the most important things in and around the community. As a quick recap, the motivation behind the reproducible builds effort is to ensure no malicious flaws can be deliberately introduced during compilation and distribution of the software that we run on our devices. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.


News

In a curious turn of events, GitHub first announced this month that the checksums of various Git archives may be subject to change, specifically because:

… the default compression for Git archives has recently changed. As result, archives downloaded from GitHub may have different checksums even though the contents are completely unchanged.

This change (which was brought up on our mailing list last October) would have had quite wide-ranging implications for anyone wishing to validate and verify downloaded archives using cryptographic signatures. However, GitHub reversed this decision, updating their original announcement with a message that “We are reverting this change for now. More details to follow.” It appears that this was informed in part by an in-depth discussion in the GitHub Community issue tracker.
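To see why a change in compression settings alone can alter a checksum, consider the following minimal Python sketch. It is not GitHub's or Git's actual archive code: the payload and the two compression levels are assumptions chosen purely for illustration, and the `mtime` argument requires Python 3.8 or later. It compresses identical bytes with two gzip settings and compares SHA-256 digests of the compressed and decompressed data.

```python
import gzip
import hashlib

# Hypothetical stand-in for the contents of a source archive.
payload = b"".join(b"line %d of some source file\n" % i for i in range(5000))


def sha256(data: bytes) -> str:
    """Return the hex SHA-256 digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


# Same input, two different compression settings. mtime=0 pins the timestamp
# in the gzip header, so only the compression settings vary between the two.
fast = gzip.compress(payload, compresslevel=1, mtime=0)
best = gzip.compress(payload, compresslevel=9, mtime=0)

compressed_digests_match = sha256(fast) == sha256(best)
contents_match = gzip.decompress(fast) == gzip.decompress(best) == payload

print("compressed archives identical:", compressed_digests_match)
print("decompressed contents identical:", contents_match)
```

A checksum recorded against the compressed file only stays valid for as long as the compressor and its settings stay the same, even when the files inside the archive have not changed.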


The Bundesamt für Sicherheit in der Informationstechnik (BSI) (trans: ‘The Federal Office for Information Security’) is the agency in charge of managing computer and communication security for the German federal government. They recently produced a report that touches on attacks on software supply-chains (Supply-Chain-Angriff). (German PDF)


Contributor Seb35 updated our website to fix broken links to Tails’ Git repository, and Holger updated a large number of pages around our recent summit in Venice.


Noak Jönsson has written an interesting paper entitled The State of Software Diversity in the Software Supply Chain of Ethereum Clients. As the paper outlines:

In this report, the software supply chains of the most popular Ethereum clients are cataloged and analyzed. The dependency graphs of Ethereum clients developed in Go, Rust, and Java, are studied. These client are Geth, Prysm, OpenEthereum, Lighthouse, Besu, and Teku. To do so, their dependency graphs are transformed into a unified format. Quantitative metrics are used to depict the software supply chain of the blockchain. The results show a clear difference in the size of the software supply chain required for the execution layer and consensus layer of Ethereum.


Yongkui Han posted to our mailing list discussing making reproducible builds & GitBOM work together without gitBOM-ID embedding. GitBOM (now renamed to OmniBOR) is a project to “enable automatic, verifiable artifact resolution across today’s diverse software supply-chains”. In addition, Fabian Keil wrote to us asking whether anyone in the community would be at Chemnitz Linux Days 2023, which is due to take place on 11th and 12th March (event info).

Separately, Akihiro Suda posted to our mailing list just after the end of the month with a status report of bit-for-bit reproducible Docker/OCI images. As Akihiro mentions in their post, they will be giving a talk at FOSDEM in the ‘Containers’ devroom titled Bit-for-bit reproducible builds with Dockerfile and that “my talk will also mention how to pin the apt/dnf/apk/pacman packages with my repro-get tool.”


The extremely popular Signal messenger app added upstream support for the SOURCE_DATE_EPOCH environment variable this month. This means that release tarballs of the Signal desktop client do not embed nondeterministic release information.
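As a rough illustration of what supporting SOURCE_DATE_EPOCH involves, the sketch below shows a build script choosing its embedded timestamp. It is not Signal's actual implementation; the function name build_timestamp is hypothetical. The only part taken from the SOURCE_DATE_EPOCH convention itself is that the variable holds a Unix timestamp in seconds, which a build should use in place of the current time.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ) -> datetime:
    """Return the timestamp to embed in a build (hypothetical helper).

    If SOURCE_DATE_EPOCH is set, it is used instead of the wall clock,
    so that rebuilding the same source embeds the same date.
    """
    epoch = environ.get("SOURCE_DATE_EPOCH")
    if epoch is not None:
        seconds = int(epoch)
    else:
        # Fallback: non-reproducible, depends on when the build runs.
        seconds = int(time.time())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


if __name__ == "__main__":
    print(build_timestamp().isoformat())
```

Running the script twice with the same value exported, for example SOURCE_DATE_EPOCH=1672531200, prints the same date both times, whereas leaving the variable unset yields a different value on each run.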



Distribution work

F-Droid & Android

There was a very large number of changes in the F-Droid and wider Android ecosystem this month:

On January 15th, a blog post entitled Towards a reproducible F-Droid was published on the F-Droid website, outlining the reasons why “F-Droid signs published APKs with its own keys” and how reproducible builds allow using upstream developers’ keys instead. In particular:

In response to […] criticisms, we started encouraging new apps to enable reproducible builds. It turns out that reproducible builds are not so difficult to achieve for many apps. In the past few months we’ve gotten many more reproducible apps in F-Droid than before. Currently we can’t highlight which apps are reproducible in the client, so maybe you haven’t noticed that there are many new apps signed with upstream developers’ keys.

(There was a discussion about this post on Hacker News.)

In addition:

Debian

As mentioned in last month’s report, Vagrant Cascadian has been organising a series of online sprints in order to ‘clear the huge backlog of reproducible builds patches submitted’ by performing NMUs (Non-Maintainer Uploads). During January, a sprint took place on the 10th, resulting in the following uploads:

During this sprint, Holger Levsen filed Debian bug #1028615 to request that the tracker.debian.org service display results of reproducible rebuilds, not just reproducible ‘CI’ results.

Elsewhere in Debian, strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. This month, version 1.13.1-1 was uploaded to Debian unstable by Holger Levsen, including a fix by FC Stegerman (obfusk) to update a regular expression for the latest version of file(1). (#1028892)
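The general idea of post-processing a build result to remove non-determinism can be sketched as follows. This is an illustration only and not how strip-nondeterminism itself is implemented: the function normalize_zip, the fixed date and the fixed permissions are all assumptions made for this example. It rewrites a ZIP archive so that member order, timestamps and file modes no longer depend on when or how the archive was created.

```python
import io
import zipfile

# Assumed fixed timestamp; 1980-01-01 is the earliest date ZIP can store.
FIXED_DATE = (1980, 1, 1, 0, 0, 0)


def normalize_zip(data: bytes) -> bytes:
    """Return a copy of a ZIP archive with sorted members and fixed metadata."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as source, \
            zipfile.ZipFile(out, "w") as target:
        # Sort names so the member order does not depend on the input order.
        for name in sorted(source.namelist()):
            info = zipfile.ZipInfo(name, date_time=FIXED_DATE)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3            # always record "Unix"
            info.external_attr = 0o644 << 16  # fixed file permissions
            target.writestr(info, source.read(name))
    return out.getvalue()
```

Two archives that contain the same files but were built at different times, or with members added in a different order, come out byte-for-byte identical after normalization on the same system.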

Lastly, 65 reviews of Debian packages were added, 21 were updated and 35 were removed this month, adding to our knowledge about identified issues.

Other distributions

In other distributions:


diffoscope

diffoscope is our in-depth and content-aware diff utility. Not only can it locate and diagnose reproducibility issues, it can provide human-readable diffs from many kinds of binary formats. This month, Chris Lamb made the following changes to diffoscope, including preparing and uploading versions 231, 232, 233 and 234 to Debian:

  • No need for from __future__ import print_function import anymore.
  • Comment and tidy the extras_require.json handling.
  • Split inline Python code to generate test Recommends into a separate Python script.
  • Update debian/tests/control after merging PyPDF support.
  • Correctly catch segfaulting cd-iccdump binary.
  • Drop some old debugging code.
  • Allow ICC tests to (temporarily) fail.

In addition, FC Stegerman (obfusk) made a number of changes, including:

  • Updating the test_text_proper_indentation test to support the latest version(s) of file(1).
  • Using an extras_require.json file to store some build/release metadata, instead of accessing the internet.
  • Updating an APK-related file(1) regular expression.
  • De-duplicating contributors by e-mail on the diffoscope.org website.

Lastly, Sam James added support for PyPDF version 3 and Vagrant Cascadian updated a handful of tool references for GNU Guix.
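To give a flavour of what “content-aware” means in the description of diffoscope at the top of this section, here is a much-simplified Python sketch. It is not diffoscope's code or API: the functions read_members and describe_differences are hypothetical, and only tar archives are handled. Instead of reporting that two archives differ as opaque bytes, it looks inside them and names the member and the field that differ.

```python
import io
import tarfile


def read_members(data: bytes) -> dict:
    """Map each member name to its metadata and (for regular files) content."""
    members = {}
    with tarfile.open(fileobj=io.BytesIO(data)) as archive:
        for member in archive.getmembers():
            content = archive.extractfile(member).read() if member.isfile() else b""
            members[member.name] = {
                "mtime": member.mtime,
                "mode": oct(member.mode),
                "content": content,
            }
    return members


def describe_differences(first: bytes, second: bytes) -> list:
    """Return human-readable lines describing how two tar archives differ."""
    a, b = read_members(first), read_members(second)
    report = []
    for name in sorted(a.keys() | b.keys()):
        if name not in a:
            report.append(f"{name}: only in second archive")
        elif name not in b:
            report.append(f"{name}: only in first archive")
        else:
            for field in ("mtime", "mode", "content"):
                if a[name][field] == b[name][field]:
                    continue
                if field == "content":
                    report.append(f"{name}: content differs")
                else:
                    report.append(
                        f"{name}: {field} {a[name][field]} != {b[name][field]}"
                    )
    return report
```

For two archives whose files are identical but were packed at different times, such a report points directly at the differing mtime fields, which is usually the first clue needed to diagnose a reproducibility issue.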


Upstream patches

The Reproducible Builds project attempts to fix as many currently-unreproducible packages as possible. This month, we wrote a large number of such patches, including:


Testing framework

The Reproducible Builds project operates a comprehensive testing framework at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In January, the following changes were made by Holger Levsen:

  • Node changes:

  • Debian-related changes:

    • Only keep diffoscope’s HTML output (i.e. no .json or .txt) for LTS suites and older in order to save disk space on the Jenkins host.
    • Re-create pbuilder base less frequently for the stretch, bookworm and experimental suites.
  • OpenWrt-related changes:

    • Add gcc-multilib to OPENWRT_HOST_PACKAGES and install it on the nodes that need it.
    • Detect more problems in the health check when failing to build OpenWrt.
  • Misc changes:

    • Update the chroot-run script to correctly manage /dev and /dev/pts.
    • Update the Jenkins ‘shell monitor’ script to collect disk stats less frequently and to include various directory stats.
    • Update the ‘real’ year in the configuration in order to be able to detect whether a node is running in the future or not.
    • Bump copyright years in the default page footer.

In addition, Christian Marangi submitted a patch to build OpenWrt packages with the V=s flag to enable debugging.


If you are interested in contributing to the Reproducible Builds project, please visit the Contribute page on our website. You can get in touch with us via:



