
Debian Bug report logs - #763822
ftp.debian.org: please include .buildinfo file in the archive

Package: ftp.debian.org; Maintainer for ftp.debian.org is Debian FTP Master <ftpmaster@ftp-master.debian.org>;

Reported by: Jérémy Bobbio <lunar@debian.org>

Date: Thu, 2 Oct 2014 21:42:01 UTC

Severity: normal


Report forwarded to debian-bugs-dist@lists.debian.org, reproducible-builds@lists.alioth.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Thu, 02 Oct 2014 21:42:06 GMT) (full text, mbox, link).


Acknowledgement sent to Jérémy Bobbio <lunar@debian.org>:
New Bug report received and forwarded. Copy sent to reproducible-builds@lists.alioth.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Thu, 02 Oct 2014 21:42:06 GMT) (full text, mbox, link).


Message #5 received at submit@bugs.debian.org (full text, mbox, reply):

From: Jérémy Bobbio <lunar@debian.org>
To: submit@bugs.debian.org
Subject: ftp.debian.org: please include .buildinfo file in the archive
Date: Thu, 2 Oct 2014 23:39:10 +0200
[Message part 1 (text/plain, inline)]
Package: ftp.debian.org
Severity: wishlist
User: ftp.debian.org@packages.debian.org
Usertags: archive

Hi!

As part of the “reproducible builds” effort [1], we came up with the
idea of a new control file, currently named “.buildinfo”.

.buildinfo files would capture from the build environment as much
information as needed to reproduce the build. The file format and how it
could be included in the archive is described on the wiki [2].

An exercise in summarizing the key points:

 * A .buildinfo file is generated for each build, and is
   considered unique for a source package, version, and architecture.
   A rebuild should always generate the same .buildinfo as
   the original build.
 * A .buildinfo contains many fields similar to .changes, and the list
   of all packages forming the build environment (build-deps, their
   deps, Essential:yes, build-essential, etc.)
 * .buildinfo would be distributed in the archive together with source
   and binary packages.
 * They would be accompanied by detached GnuPG signatures, so multiple
   parties (e.g. DD and buildd) could assert the production of similar
   binary packages from the same source and same environment.
 * The latter information can then be shown in the Packages index
   for each binary package.
 * A tool will allow independent parties to rebuild binary packages
   from .buildinfo files in the archive.

We wish .buildinfo files could become part of the Debian archive. For
that to happen, we highly welcome your comments on the current
specification, and advice regarding the next steps we could take.

During our experiments, adding .buildinfo files to .changes had one
unforeseen consequence. Packages that used to be “Architecture: all” are
now “Architecture: all amd64” as .buildinfo are tied to a given build
architecture. Apart from breaking the lintian test suite, it is unclear if
that's a problem at all, or if some changes should be made. Again, your
input would be most welcome.

 [1]: https://wiki.debian.org/ReproducibleBuilds
 [2]: https://wiki.debian.org/ReproducibleBuilds/BuildinfoSpecification

-- 
Lunar                                .''`. 
lunar@debian.org                    : :Ⓐ  :  # apt-get install anarchism
                                    `. `'` 
                                      `-   
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Fri, 04 Dec 2015 01:03:07 GMT) (full text, mbox, link).


Acknowledgement sent to Daniel Kahn Gillmor <dkg@fifthhorseman.net>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Fri, 04 Dec 2015 01:03:07 GMT) (full text, mbox, link).


Message #10 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
To: Jérémy Bobbio <lunar@debian.org>, 763822@bugs.debian.org, Reproducible Builds discussion list <reproducible-builds@lists.alioth.debian.org>, jelmer@debian.org, Niels Thykier <niels@thykier.net>
Subject: Re: [Reproducible-builds] Bug#763822: ftp.debian.org: please include .buildinfo file in the archive
Date: Fri, 04 Dec 2015 02:59:15 +0200
[Message part 1 (text/plain, inline)]
Hi there!

In https://bugs.debian.org/763822, lunar asked ftp.debian.org to accept
.buildinfo files when they are uploaded with a .changes file.

This is a followup to make the request concrete by specifying how we
hope the archive will sanity-check the included .buildinfo files, and
with a suggestion of how they could be distributed across the mirrors in
a way that will be reasonably convenient for users and downstreams
without making mirror operators crazy.

I'm writing this after discussions with Jelmer, Niels, Lunar, and others
involved in the Reproducible Builds project.

Constraints guiding the suggestions below
-----------------------------------------

We want an archive user to be able to find and fetch all .buildinfo
files that produced a given binary package.

We want the eventual possibility of multiple .buildinfo files per
<srcpkg,version,arch>.

We understand that mirror operators don't like small files because
rsync gets fussy with them.

We want both buildds and debian developers to be able to upload
.buildinfo files.


Asks of ftp-master
------------------

We hope that the archive will verify .buildinfo files uploaded by
buildds and DDs or DMs.  We don't expect to require buildds or DDs or
DMs to upload .buildinfo files at this time, though we hope they'll
start to do so once the archive can accept them.

Here's how we think the archive might sanity-check them:

   * There may be 0 or more .buildinfo files included in a .changes
     file.  Each .buildinfo file describes an environment that was used
     to produce some of the binary artifacts (e.g. .deb, .udeb, etc) in
     this upload.

   * To validate each .buildinfo file:

     * ensure that the filename is of the form
       <srcpkgname>_<version>_<arbstring>.buildinfo where:
         * <srcpkgname> matches the source name in the Source: field
         * <version> equals the Version: field
         * <arbstring> is /[-a-z0-9]+/

     * ensure that this filename is not already in the archive.

     * the file should be clearsigned OpenPGP in UTF-8, with nothing
       outside the OpenPGP framing.  It should have a valid signature
       from the same OpenPGP key that signed the .changes file.

     * The signed part of the file must be a valid control file.

     * ensure that the Source: and Version: fields in the .buildinfo
       match the Source: and Version: fields in the .changes file.

     * the .buildinfo must include the same .dsc as the .changes file,
       with the same checksum.

     * in addition, the .buildinfo file should list at least one binary
       artifact.

     * for every binary artifact listed in the .buildinfo file:

        * ensure that it is listed in the .changes file with the same
          checksum(s).  (fwiw, if anyone is concerned about minimizing
          the size of the .buildinfo file, there is no need to include
          the md5 or sha1 checksums of the artifacts. The
          Checksums-Sha256: sub-stanza is sufficient for our purposes)

    If an included .buildinfo file doesn't validate, please reject the
    entire upload.
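
As an illustration only, the filename rule and the checksum rule above could be sketched in Python as follows. This is not dak's code: the function names and the dict-of-checksums inputs are hypothetical, and the sketch assumes the Version: value appears in the filename verbatim (it ignores epochs).

```python
import re

# <arbstring> is /[-a-z0-9]+/ as stated in the proposal.
ARBSTRING_RE = re.compile(r"[-a-z0-9]+")


def buildinfo_filename_ok(filename, source, version):
    """Check <srcpkgname>_<version>_<arbstring>.buildinfo against the
    Source: and Version: values (hypothetical helper, epochs ignored)."""
    suffix = ".buildinfo"
    if not filename.endswith(suffix):
        return False
    stem = filename[: -len(suffix)]
    prefix = f"{source}_{version}_"
    if not stem.startswith(prefix):
        return False
    arbstring = stem[len(prefix):]
    return ARBSTRING_RE.fullmatch(arbstring) is not None


def artifacts_match(buildinfo_sha256, changes_sha256):
    """Both arguments map artifact filename -> SHA-256 hex digest.

    The .buildinfo must list at least one artifact, and each one must be
    listed in the .changes file with the same checksum.
    """
    if not buildinfo_sha256:
        return False
    return all(changes_sha256.get(name) == digest
               for name, digest in buildinfo_sha256.items())
```

Under this sketch an upload would be rejected as soon as either function returns False for any included .buildinfo file; signature and control-file checks are left out.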


Once an upload that includes some .buildinfo files is accepted, we want
users to be able to find the .buildinfo(s) for a binary package if they
want to try to reproduce it.

Here's a concrete suggestion for how to do that in a way that might not
make mirror operators sad (if you have a different or preferred
suggestion, that'd be great too):

* collect all .buildinfo files in the archive that produced binary
  artifacts for a given architecture in a tarball named Buildinfos.tgz
  which is distributed alongside Packages.  (for example,
  binary-amd64/Buildinfos.tgz and binary-all/Buildinfos.tgz).

* the structure of the tarball could be
  <srcpkgname>/<version>/<srcpkgname>_<version>_<arbstring>.buildinfo
  (with the same expansions as above)

* the gzip layer of the tarball should be --rsyncable

* When re-creating the Buildinfos.tgz file after some binary artifacts
  have been removed from a suite, any .buildinfo file that only
  references artifacts no longer in the suite can be removed.
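
To make the suggested layout concrete, here is a minimal Python sketch of how such a tarball could be assembled. The function names and the tuple-based input are hypothetical; the sorted member order and zeroed mtime are extra assumptions (not part of the suggestion above) made so that the output is stable between runs. The sketch writes only the tar layer.

```python
import io
import tarfile


def member_path(source, version, arbstring):
    """Path inside the tarball: <srcpkgname>/<version>/<filename>."""
    return f"{source}/{version}/{source}_{version}_{arbstring}.buildinfo"


def build_tarball(entries):
    """entries: iterable of (source, version, arbstring, content_bytes).

    Returns the bytes of an uncompressed tar archive; compression is
    left to a separate step.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # Sort for a stable member order and zero the mtime so the
        # result does not depend on input order or generation time.
        for source, version, arbstring, content in sorted(entries):
            info = tarfile.TarInfo(member_path(source, version, arbstring))
            info.size = len(content)
            info.mtime = 0
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()
```

The gzip layer would then be applied externally by a gzip that supports `--rsyncable`, since Python's standard library does not offer that option.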

Does this seem like a reasonable way to distribute this information?


Clarifications from original bug report
---------------------------------------

In the time since this bug was originally posted, we have a clearer
understanding of what a .buildinfo file represents, and how it can be
used.  For clarity and future documentation, i'll amend/nitpick a few
comments from the original text of the bug report below:

> .buildinfo files would capture from the build environment as much
> information as needed to reproduce the build.

The .buildinfo file *may* contain more information than is needed to
reproduce the build.  The goal is to have it provide enough of a record
of the build environment to be able to reproduce the build, but it's
also possible to include additional, unneeded information, and that's
OK.  (e.g. we are likely to include the exact version number of some
essential packages, even though going from coreutils 8.17-1 to 8.17-2 is
unlikely to affect the build).

>  * A .buildinfo file is generated for each build, and is
>    considered unique for a source package, version, and architecture.
>    A rebuild should always generate the same .buildinfo as
>    the original build.

We no longer think it will need to be unique for this tuple, since it's
possible that multiple different build environments could produce the
same binary artifacts.  A rebuild might therefore produce a different
.buildinfo file, depending on the state of the archive, even if the
produced binary artifacts are identical.

>  * They would be accompanied by detached GnuPG signatures, so multiple
>    parties (e.g. DD and buildd) could assert the production of similar
>    binary packages from the same source and same environment.

We now think that any signatures should be specific to a single
.buildinfo file.

>  * The latter information can then be shown in the Packages index
>    for each binary packages.

We don't think we need to include any reference to the .buildinfo files
in the Packages index.

> During our experiments, adding .buildinfo files to .changes had one
> unforeseen consequence. Packages that used to be “Architecture: all” are
> now “Architecture: all amd64” as .buildinfo are tied to a given build
> architecture. Except that it breaks lintian test suite, it is unclear if
> that's a problem at all, or if some changes should be made. Again, your
> input would be most welcome.

We don't think this is relevant any longer, since the buildinfo file
isn't necessarily tied to the architecture of the build host.  (it may
record the build architecture, but the .buildinfo only really describes
the binary artifacts).


Regards,

        --dkg
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Mon, 14 Dec 2015 10:36:03 GMT) (full text, mbox, link).


Acknowledgement sent to Holger Levsen <holger@layer-acht.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Mon, 14 Dec 2015 10:36:03 GMT) (full text, mbox, link).


Message #15 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Holger Levsen <holger@layer-acht.org>
To: 763822@bugs.debian.org
Cc: reproducible-builds@lists.alioth.debian.org, debian-dpkg@lists.debian.org
Subject: simple next step for getting .buildinfo files into Debian
Date: Mon, 14 Dec 2015 11:32:39 +0100
[Message part 1 (text/plain, inline)]
Hi ftp folks,

while we would still appreciate your comments on the proposal described last
week in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=763822, I'd like
to make a very simple intermediate proposal, so that reproducible builds in
Debian get one step forward:

- modify dak so that it no longer rejects uploads that include a .buildinfo
  file.
- still, for now, throw the .buildinfo file away immediately.
- only do this for experimental at the beginning. (Maybe this restriction is
  not even needed/useful.)

That's it.

This would allow the dpkg maintainers to enable .buildinfo file creation, at 
least for builds for experimental.

What do you think?

As I see it, this should be a rather trivial code change for dak and yet bring 
us forward quite enormously. Also it should be rather uncontroversial as we 
all agreed in Heidelberg at DebConf15 that we want .buildinfo files in Debian… 


cheers,
	Holger

[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Mon, 14 Dec 2015 21:33:03 GMT) (full text, mbox, link).


Acknowledgement sent to Niels Thykier <niels@thykier.net>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Mon, 14 Dec 2015 21:33:03 GMT) (full text, mbox, link).


Message #20 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Niels Thykier <niels@thykier.net>
To: Holger Levsen <holger@layer-acht.org>, 763822@bugs.debian.org
Cc: reproducible-builds@lists.alioth.debian.org, debian-dpkg@lists.debian.org
Subject: Re: simple next step for getting .buildinfo files into Debian
Date: Mon, 14 Dec 2015 21:14:55 +0000
[Message part 1 (text/plain, inline)]
Holger Levsen:
> Hi ftp folks,
> 

Hi,

> while we still appreciate your comments on this proposal as last week 
> described in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=763822 I'd like 
> to make a intermediate very simple proposal, so that reproducible builds in 
> Debian get one step forward:
> 

I have started a git branch, build-info-support, available from:

* ssh://release.debian.org/~nthykier/dak

I realise that not everyone has access to that machine, so the patches
are also attached (@FTP: The branch has signed commits, so you may
prefer merging from that).

> - modify dak, so that it will not rejects uploads with a .buildinfo file 
> included. 

I have patches to have dak accept these and do some trivial validation
(but not every validation proposed).  I will extend my branch as time
permits with additional checks.

> - still, for now, throw the .buildinfo file immediately away.

I have assumed this happens if you do nothing explicitly with the
file after it has been accepted.

 * @FTP: If not, please let me know how I can have dak discard the file.

> - only do this for experimental at the beginning. (maybe this restriction is 
> not even needed/useful.)
> 

 * Given the file is discarded, I have not added any such restrictions
   in my patch series.

> That's it.
> 
> This would allow the dpkg maintainers to enable .buildinfo file creation, at 
> least for builds for experimental.
> 
> What do you think?
> 

FWIW, I agree. :)

> As I see it, this should be a rather trivial code change for dak and yet bring 
> us forward quite enourmously. Also it should be rather uncontroversial as we 
> all agreed in Heidelberg at DebConf15 that we want .buildinfo files in Debian… 
> 
> 
> cheers,
> 	Holger
> 


Thanks,
~Niels


[0001-daklib-upload.py-Silently-accept-and-discard-.buildi.patch (text/x-patch, attachment)]
[0002-Do-very-basic-validation-of-.buildinfo-files.patch (text/x-patch, attachment)]
[signature.asc (application/pgp-signature, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Tue, 15 Dec 2015 07:12:08 GMT) (full text, mbox, link).


Acknowledgement sent to Niels Thykier <niels@thykier.net>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Tue, 15 Dec 2015 07:12:08 GMT) (full text, mbox, link).


Message #25 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Niels Thykier <niels@thykier.net>
To: Holger Levsen <holger@layer-acht.org>, 763822@bugs.debian.org
Cc: reproducible-builds@lists.alioth.debian.org, debian-dpkg@lists.debian.org
Subject: Re: simple next step for getting .buildinfo files into Debian
Date: Tue, 15 Dec 2015 07:09:25 +0000
[Message part 1 (text/plain, inline)]
Niels Thykier:
> Holger Levsen:
>> Hi ftp folks,
>>
> 
> Hi,
> 
>> while we still appreciate your comments on this proposal as last week 
>> described in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=763822 I'd like 
>> to make a intermediate very simple proposal, so that reproducible builds in 
>> Debian get one step forward:
>>
> 
> I have started a git branch, build-info-support, available from:
> 
> * ssh://release.debian.org/~nthykier/dak
> 
> [...]
> 
> Thanks,
> ~Niels
> 
> 

Hi,

Thanks to h01ger and jwilk, who spotted a ".changes" that should have
been a ".buildinfo" in regexes.py.  I have rebased the patches (and
re-signed them) onto master.  The changed patch is also attached.

Thanks,
~Niels


[0001-daklib-upload.py-Silently-accept-and-discard-.buildi.patch (text/x-patch, attachment)]
[signature.asc (application/pgp-signature, attachment)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Mon, 25 Jul 2016 20:30:20 GMT) (full text, mbox, link).


Acknowledgement sent to Jonathan McDowell <noodles@earth.li>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Mon, 25 Jul 2016 20:30:20 GMT) (full text, mbox, link).


Message #30 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Jonathan McDowell <noodles@earth.li>
To: 763822@bugs.debian.org
Cc: reproducible-builds@lists.alioth.debian.org
Subject: Moving towards buildinfo on the archive network
Date: Mon, 25 Jul 2016 21:29:39 +0100
Having been impressed by the current status of reproducible builds and
the fact it looks like we're close to having the important pieces in
Debian proper, I have started to have a look at how I could help out
with this bug. I've done some poking around in the dak code, and think I
have a vague idea of how to achieve what I think is wanted.

First, it is helpful to describe what I think is wanted. What I think we
need is the archive network to have, alongside the binary packages it
contains, details of exactly how to build those binaries. This is, I
believe, the information contained in the .buildinfo files.

This bug has previously talked about a tarball of .buildinfo files,
presented as Buildinfos.tgz alongside the Packages file. From looking at
the current architecture of dak I do not believe that this is an easy
option.

I propose instead a Buildinfo.xz (or gz or whatever) file, which is a
single text file containing all of the buildinfo information that
corresponds to the Packages list. What is lost by this approach are the
OpenPGP signatures that .buildinfo files can have on them. I appreciate
this is an important part of the reproducible builds aim, but I believe
one of its strengths is the ability for multiple separate package builds
to attest that they have used that buildinfo information to build the
exact same set of binary artefacts. This is not something that easily
scales on the archive network and I think it is better served by a
separate service; it would be possible to take the package snippet from
the buildinfo file and sign that alone, uploading the signature to the
attestation service. For "normal" Debian operation the usual archive
signatures would provide a basic level of attestation of chain of build
information.

The rest of this mail continues on the above assumptions. If you do not
agree with the above the below is probably null and void, so ignore it
and instead educate me about what the requirements are and I'll try and
adjust my ideas based on that.

So. If a single Buildinfo.xz file is acceptable, with the attestation
being elsewhere, I think this is doable without too much hackery in dak.
There are some trade-offs to make though, and I need to check which are
acceptable and which are viewed as too much.

Firstly, there is currently no concept of "build ids" that I can see;
essentially the primary key for a build is (source-package,
architecture, version). This assumes we never have the same version of a
package with different binaries produced; I understand there is
sometimes skew between security + the main archive but it's not clear to
me if this will continue to be the case when we're doing things
reproducibly. Even if it's not, adding a simple build id doesn't actually
help AFAICT.

Secondly, buildinfo files that I've seen so far include arch all .debs
with the architecture .debs. I believe on the archive these should be
separate; so a build + upload that includes arch all + arch amd64 (for
example) debs will actually end up with an entry (for just the all debs)
in the all Buildinfo.xz and an entry (for just the amd64 debs) in the
amd64 Buildinfo.xz. Why? Binary NMUs, which don't rebuild the all .debs.
Otherwise you end up changing the buildinfo information (to drop the
rebuilt amd64 debs) or keeping around old buildinfo information (+ you
have to track the fact you need it and know when to clean it up).
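
As a rough illustration of that split, the following Python sketch groups the binary artifacts of one upload by architecture, so that the "all" entries and the "amd64" entries can go to different Buildinfo files. The helper name is hypothetical, and the sketch assumes every filename follows the usual <name>_<version>_<arch>.deb pattern.

```python
from collections import defaultdict


def split_by_architecture(deb_filenames):
    """Group filenames of the form <name>_<version>_<arch>.deb by <arch>."""
    groups = defaultdict(list)
    for filename in deb_filenames:
        stem = filename.rsplit(".", 1)[0]   # strip the .deb/.udeb extension
        arch = stem.rsplit("_", 1)[1]       # last underscore-separated field
        groups[arch].append(filename)
    return dict(groups)
```

With this grouping, a later binary NMU for amd64 would only replace the amd64 group and leave the "all" entry untouched.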

Thirdly, as the information is generated from a database, there needs to
be a defined order in which the fields are generated. This is purely to
ensure that the buildinfo information for each package is generated in a
reproducible fashion so any external signatures remain valid over time.
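
A minimal Python sketch of such a defined order follows. The contents of FIELD_ORDER and the rule for unlisted fields are assumptions chosen for illustration, not an agreed specification, and the sketch only handles single-line values.

```python
# Assumed canonical order; the real list would have to be agreed on.
FIELD_ORDER = ["Source", "Version", "Architecture", "Installed-Build-Depends"]


def render_stanza(fields):
    """Render a dict of fields in FIELD_ORDER; unlisted fields follow,
    sorted by name, so the output never depends on dict or row order."""
    known = [name for name in FIELD_ORDER if name in fields]
    extra = sorted(name for name in fields if name not in FIELD_ORDER)
    return "".join(f"{name}: {fields[name]}\n" for name in known + extra)
```

Because the same fields always produce the same bytes, a detached signature made over one rendering would still verify against a later one.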

If these are acceptable I think that projectb needs 2 additional tables,
buildinfo_keys, similar to metadata_keys, and binaries_buildinfo, which
would have a 3 column primary key of (source-package, architecture,
version), and then key_id/value fields (similar to binaries_metadata) to
hold the buildinfo information that is not already present elsewhere in
the database. At present the main information these will hold is the
Installed-Build-Depends field - the rest that I've actively seen are
available already.
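
For readers who prefer to see the shape of the two tables, here is a hedged sketch using Python's sqlite3 module as a stand-in for projectb. The column names and types are assumptions, and key_id is added to the primary key in this sketch so that one build can hold several key/value rows.

```python
import sqlite3

SCHEMA = """
CREATE TABLE buildinfo_keys (
    key_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL UNIQUE
);
CREATE TABLE binaries_buildinfo (
    source       TEXT NOT NULL,
    architecture TEXT NOT NULL,
    version      TEXT NOT NULL,
    key_id       INTEGER NOT NULL REFERENCES buildinfo_keys(key_id),
    value        TEXT NOT NULL,
    -- assumption: key_id is part of the key, one row per field per build
    PRIMARY KEY (source, architecture, version, key_id)
);
"""


def open_sketch_db():
    """Create an in-memory database holding the two sketched tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

A field such as Installed-Build-Depends would get one row in buildinfo_keys, and each build would reference it from binaries_buildinfo with its own value.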

Have I missed anything? I don't think the code to implement the above
ends up particularly complex in dak, and the resulting Buildinfo.xz
files should not add a particularly large amount of new data to the
mirror network. The main loss is that of the attestation information as
part of the mirror network (and actually, I can see a way we could add
that as a buildinfo field that wasn't part of the signature at some
point in the future).

(Additionally it is not clear to me where the dpkg status for
buildinfo creation is; I have heard that it's close to happening, but I
can't find anything on recent list archives about it - pointers
appreciated!)

J.

-- 
/-\                             |  I get the feeling that I've been
|@/  Debian GNU/Linux Developer |              cheated.
\-                              |



Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Wed, 03 Aug 2016 04:09:04 GMT) (full text, mbox, link).


Acknowledgement sent to Vagrant Cascadian <vagrant@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Wed, 03 Aug 2016 04:09:04 GMT) (full text, mbox, link).


Message #35 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Vagrant Cascadian <vagrant@debian.org>
To: Jonathan McDowell <noodles@earth.li>, 763822@bugs.debian.org
Cc: reproducible-builds@lists.alioth.debian.org
Subject: Re: Moving towards buildinfo on the archive network
Date: Tue, 02 Aug 2016 21:01:33 -0700
[Message part 1 (text/plain, inline)]
On 2016-07-25, Jonathan McDowell wrote:
> I propose instead a Buildinfo.xz (or gz or whatever) file, which is
> single text file with containing all of the buildinfo information that
> corresponds to the Packages list. What is lost by this approach are the
> OpenPGP signatures that .buildinfo files can have on them. I appreciate
> this is an important part of the reproducible builds aim, but I believe
> one of its strengths is the ability for multiple separate package builds
> to attest that they have used that buildinfo information to build the
> exact same set of binary artefacts. This is not something that easily
> scales on the archive network and I think it is better served by a
> separate service; it would be possible to take the package snippet from
> the buildinfo file and sign that alone, uploading the signature to the
> attestation service. For "normal" Debian operation the usual archive
> signatures would provide a basic level of attestation of chain of build
> information.
>
> The rest of this mail continues on the above assumptions. If you do not
> agree with the above the below is probably null and void, so ignore it
> and instead educate me about what the requirements are and I'll try and
> adjust my ideas based on that.
>
> So. If a single Buildinfo.xz file is acceptable, with the attestation
> being elsewhere, I think this is doable without too much hackery in dak.
> There are some trade-offs to make though, and I need to check which are
> acceptable and which are viewed as too much.

I just wanted to give a huge thanks for taking a good look at this, even
if it isn't exactly what has been specced out by earlier
reproducible-builds discussions. Evaluating a somewhat different
approach, especially if it turns out to be more feasible (at least from
some angles), is really valuable in my eyes.

FWIW, I wasn't involved in the discussions spelling out what the
reproducible builds project wanted in the archive, so I don't have much
concrete to say, but you've clearly given some serious thought and
effort to this, so I didn't want it to slip through the cracks!

I tried to read through some of the documentation I could find:

  https://wiki.debian.org/ReproducibleBuilds/BuildinfoSpecification
  https://reproducible-builds.org/events/athens2015/debian-buildinfo-review/
  https://reproducible-builds.org/events/athens2015/buildinfo-content/

Having reviewed the above, there doesn't seem to be a huge conflict that
you haven't at least considered already.

Hopefully, someone with more history and context with the .buildinfo
file discussions can chime in soonish...


live well,
  vagrant
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Wed, 03 Aug 2016 09:27:08 GMT) (full text, mbox, link).


Acknowledgement sent to Johannes Schauer <josch@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Wed, 03 Aug 2016 09:27:08 GMT) (full text, mbox, link).


Message #40 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Johannes Schauer <josch@debian.org>
To: 763822@bugs.debian.org
Cc: reproducible-builds@lists.alioth.debian.org
Subject: Re: [Reproducible-builds] Moving towards buildinfo on the archive network
Date: Wed, 03 Aug 2016 11:24:49 +0200
[Message part 1 (text/plain, inline)]
Hi Jonathan,

Quoting Jonathan McDowell (2016-07-25 22:29:39)
> Having been impressed by the current status of reproducible builds and
> the fact it looks like we're close to having the important pieces in
> Debian proper, I have started to have a look at how I could help out
> with this bug. I've done some poking around in the dak code, and think I
> have a vague idea of how to achieve what I think is wanted.

Having tried hacking dak myself, I want to especially thank you for looking
into that!

> (Additionally it is not clear to me where the dpkg status for buildinfo
> creation is; I have heard that it's close to happening, but I can't find
> anything on recent list archives about it - pointers appreciated!)

You are probably aware of #138409?

It scrolled out of my IRC history already but I think guillem said in
#debian-dpkg that releasing a dpkg version with buildinfo support was blocked
by coordination with dak because he wants to make sure that dpkg support aligns
with what dak ends up supporting.

Thanks!

cheers, josch
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Sat, 20 Aug 2016 15:15:07 GMT) (full text, mbox, link).


Acknowledgement sent to Ximin Luo <infinity0@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Sat, 20 Aug 2016 15:15:07 GMT) (full text, mbox, link).


Message #45 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Ximin Luo <infinity0@debian.org>
To: Reproducible Builds discussion list <reproducible-builds@lists.alioth.debian.org>, 763822@bugs.debian.org
Cc: Jonathan McDowell <noodles@earth.li>
Subject: Re: [Reproducible-builds] Moving towards buildinfo on the archive network
Date: Sat, 20 Aug 2016 15:13:00 +0000
Hey, Lunar has stopped doing reproducible builds as a regular thing, and I'm
taking over his previous responsibilities. I was also the main other person in
formulating the ideas behind the "next iteration" of buildinfo, that dkg
described in message #10 earlier in this thread, with Message-ID
<87vb8f58rg.fsf@alice.fifthhorseman.net>.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=763822#10

Jonathan McDowell:
> Having been impressed by the current status of reproducible builds and
> the fact it looks like we're close to having the important pieces in
> Debian proper, I have started to have a look at how I could help out
> with this bug. I've done some poking around in the dak code, and think I
> have a vague idea of how to achieve what I think is wanted.
> 
> First, it is helpful to describe what I think is wanted. What I think we
> need is the archive network to have, alongside the binary packages it
> contains, details of exactly how to build those binaries. This is, I
> believe, the information contained in the .buildinfo files.
> 

In our newest discussions, this purpose is secondary. The primary purpose of
buildinfo files is to record what *one particular builder actually did in order
to produce some output*. Or, equivalently:

  | A buildinfo file, abstractly, is a *claim* C by some builder entity B that
  | "I executed process P with env/input I to produce output results R".

This latter form is slightly easier to reason about, in terms of security
properties. We securely bind the claim C (the contents of the buildinfo file)
to the entity B using a cryptographic signature.

Note that the builder is a *distinct entity* from the distribution. It's
important to keep the *original* signature by B on C. It breaks our security
logic, to strip the signature and re-sign C using (e.g.) the Debian archive
release keys - because the entity in charge of this release key is not the one
that actually performed the build. Doing this, would allow malicious builders
to re-attribute their misdeeds to look like it's the fault of Debian.

(Of course there is the special case where the builder *is* Debian, but even in
this case it's good practise to have separate keys for every buildd, plus a
separate release signing key. We can discuss these details separately though.)

Anyway, that's our "next iteration" definition of buildinfo files, along with a
simplified discussion of the rationale. I wrote down more elsewhere, but I'll
keep this short for now, to avoid overwhelming readers.

Now back to the "secondary" purpose:

Using this information ("B claims C"), other reproduction programs (that we're
also developing) can attempt to actually reproduce the binaries described. Such
a program would (1) read the buildinfo file, (2) recreate _some_ of the
environment stored in C, and (3) execute the process and see if it gives R.
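
A minimal sketch of the final comparison in step (3), assuming the claimed
outputs have already been extracted from the buildinfo file as a mapping of
file name to SHA-256 digest, and that the rebuild has already been run. The
names check_reproduction, claimed and rebuilt are hypothetical and not part
of any existing tool:

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def check_reproduction(claimed, rebuilt):
    """Compare claimed output digests (R) against locally rebuilt artefacts.

    claimed: dict of file name -> SHA-256 hex digest taken from the claim C.
    rebuilt: dict of file name -> bytes produced by the local rebuild.
    Returns a dict of file name -> True if the rebuilt artefact matches.
    """
    results = {}
    for name, expected in claimed.items():
        data = rebuilt.get(name)
        # A missing artefact counts as a failed reproduction.
        results[name] = data is not None and sha256_hex(data) == expected
    return results
```

How much of the environment is recreated before the rebuild (the "_some_" in
clause (2)) stays outside this function, which is why it can change later
without touching existing buildinfo files.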

The "_some_" in clause (2) is currently up-for-debate, but the important thing
is that this can be changed in the future *without affecting already-produced
buildinfo files*. It may even well be the case that in the future we'd want to
support different values for "_some_" for a given reproduction tool.

The main point is that, this is not a concern of the producer nor distributor
of the buildinfo files. I.e.: you guys (the FTP team) only have to care about
making these signed-claims available to be downloaded by users, and it is up to
the users to run a tool that "interprets" these claims for purposes such as
actually attempting reproduction of a binary.

In this way, we achieve full end-to-end security properties (verifiability of
build) between the producers (builders) and consumers (users). Distributors
only need to care about availability, they take no part in the security
(except for the case where they are also a builder, as noted already).

> This bug has previously talked about a tarball of .buildinfo files,
> presented as Buildinfos.tgz alongside the Packages file. From looking at
> the current architecture of dak I do not believe that this is an easy
> option.
> 
> I propose instead a Buildinfo.xz (or gz or whatever) file, which is
> single text file with containing all of the buildinfo information that
> corresponds to the Packages list. What is lost by this approach are the
> OpenPGP signatures that .buildinfo files can have on them. I appreciate
> this is an important part of the reproducible builds aim, but I believe
> one of its strengths is the ability for multiple separate package builds
> to attest that they have used that buildinfo information to build the
> exact same set of binary artefacts. This is not something that easily
> scales on the archive network and I think it is better served by a
> separate service; it would be possible to take the package snippet from
> the buildinfo file and sign that alone, uploading the signature to the
> attestation service. For "normal" Debian operation the usual archive
> signatures would provide a basic level of attestation of chain of build
> information.
> 

I have trouble imagining what could make Buildinfo.tgz hard, but make
Buildinfo.xz easy - could you explain this in more detail, please?

Regarding the OpenPGP signatures, they are vital - but I also see no need to
strip them in your model. From the point-of-view of the FTP archive, there is
no immediate need to read or understand the contents of the buildinfo file. [*]
It's just a dumb data blob, it shouldn't matter to Debian whether it's
clearsigned or not.

Separately, it's OK for the Debian release key to sign this dumb data blob, so
that users can check it is part of a real Debian release - but understand that
the *reproducible* security property is checked against the *builders* and not
the release infrastructure.

[*] You might read it later for "more advanced" behaviours, but we'll leave
these out of the current discussion, we haven't designed those yet.

> The rest of this mail continues on the above assumptions. If you do not
> agree with the above the below is probably null and void, so ignore it
> and instead educate me about what the requirements are and I'll try and
> adjust my ideas based on that.
> 

In the below, you refer to a "database" but there is no mention of this above.
Did you neglect to edit something? I now feel like what you meant by "single
text file" is not at all how I imagined it - e.g. a concatenation of all
buildinfo files, which is not much different from a tar archive.

Also I think my explanation above differs significantly from your existing
understanding of the concept, so I'll wait for you to review that as well.

I'll stop detailed comments here, in case I am missing something. Just a few
minor extra points though:

> [..]
> 
> Firstly, there is currently no concept of "build ids" that I can see [..]

I don't imagine a significant use-case where people will want a *specific*
buildinfo file, but if this is needed I guess we could just use the hash of the
whole file (including signature). The majority use-case would be:

  | Given a single binary package b with hash H, give me all buildinfo files C
  | that claim H as an output.
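
As a rough illustration of that lookup, the sketch below indexes buildinfo
files by the output hashes they claim. It assumes each file carries a
.changes-style Checksums-Sha256 field (one "hash size filename" entry per
continuation line), and it identifies each file by the hash of its whole
text including any signature, as suggested above. The function names are
hypothetical:

```python
import hashlib
from collections import defaultdict


def claimed_outputs(buildinfo_text):
    """Return the SHA-256 digests listed under Checksums-Sha256."""
    hashes = []
    in_field = False
    for line in buildinfo_text.splitlines():
        if line.startswith("Checksums-Sha256:"):
            in_field = True
        elif in_field and line.startswith(" "):
            parts = line.split()
            if len(parts) == 3:  # digest, size, file name
                hashes.append(parts[0])
        else:
            in_field = False
    return hashes


def build_index(buildinfo_texts):
    """Map each claimed output digest H to the IDs of files claiming it."""
    index = defaultdict(list)
    for text in buildinfo_texts:
        # ID of a specific buildinfo file: hash of the whole file.
        file_id = hashlib.sha256(text.encode("utf-8")).hexdigest()
        for digest in claimed_outputs(text):
            index[digest].append(file_id)
    return index
```

Looking up index[H] then returns every buildinfo file that claims H as an
output, regardless of which builder signed it.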

> [..] This assumes we never have the same version of a
> package with different binaries produced [..]

I believe my explanation of the "next iteration" concept addresses this issue, and this is one of the reasons why we chose to alter Lunar's original ideas from the first post.

> Secondly, buildinfo files that I've seen so far include arch all .debs
> with the architecture .debs. [..]
> 
> Thirdly, as the information is generated from a database, [..]

As mentioned, I'll stop commenting here to let you get synced with the latest ideas.

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git



Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Sun, 21 Aug 2016 14:18:03 GMT) (full text, mbox, link).


Acknowledgement sent to Jonathan McDowell <noodles@earth.li>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Sun, 21 Aug 2016 14:18:04 GMT) (full text, mbox, link).


Message #50 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Jonathan McDowell <noodles@earth.li>
To: Ximin Luo <infinity0@debian.org>
Cc: Reproducible Builds discussion list <reproducible-builds@lists.alioth.debian.org>, 763822@bugs.debian.org
Subject: Re: [Reproducible-builds] Moving towards buildinfo on the archive network
Date: Sun, 21 Aug 2016 15:14:44 +0100
[Message part 1 (text/plain, inline)]
On Sat, Aug 20, 2016 at 03:13:00PM +0000, Ximin Luo wrote:
> Jonathan McDowell:
> > Having been impressed by the current status of reproducible builds
> > and the fact it looks like we're close to having the important
> > pieces in Debian proper, I have started to have a look at how I
> > could help out with this bug. I've done some poking around in the
> > dak code, and think I have a vague idea of how to achieve what I
> > think is wanted.
> > 
> > First, it is helpful to describe what I think is wanted. What I
> > think we need is the archive network to have, alongside the binary
> > packages it contains, details of exactly how to build those
> > binaries. This is, I believe, the information contained in the
> > .buildinfo files.
> > 
> In our newest discussions, this purpose is secondary. The primary
> purpose of buildinfo files is to record what *one particular builder
> actually did in order to produce some output*. Or, equivalently:
>
>   | A buildinfo file, abstractly, is a *claim* C by some builder entity B that
>   | "I executed process P with env/input I to produce output results R".
>
> This latter form is slightly easier to reason about, in terms of
> security properties. We securely bind the claim C (the contents of the
> buildinfo file) to the entity B using a cryptographic signature.

I think the problem here is it's not clear (on either side) who "we" or
"our" means. Different people want different things from reproducible
builds, or have different opinions about relative priorities.

As a *minimum* I think distributions should be providing the information
of how a particular binary was produced. I suppose what it sort of maps
to is "I executed process P with env/input I to produce output results
R" (though, of course, distros already provide R; that's the binaries
shipped). You've used all the letters I might want to refer to it by, so
let's call it Z.

The claim, C, is a signature over Z by B. It's useful extra information,
but it's not required for me to ensure that the source I have builds the
binaries I have.

> Note that the builder is a *distinct entity* from the distribution.
> It's important to keep the *original* signature by B on C. It breaks
> our security logic, to strip the signature and re-sign C using (e.g.)
> the Debian archive release keys - because the entity in charge of this
> release key is not the one that actually performed the build. Doing
> this, would allow malicious builders to re-attribute their misdeeds to
> look like it's the fault of Debian.

Debian already does this in the context of the fact that Package files
etc are signed by the archive key. It's possible to go and grab the .dsc
file to see who did the file build, but day-to-day no one is using these
to verify the binaries they receive. I care more that Debian stands
behind the packages I download than being able to verify individually
who build each of the packages I'm running - there's no meaningful way I
can attribute trust to *all* of the people who packaged something I have
installed.

> Now back to the "secondary" purpose:
> 
> Using these information "B claims C", other reproduction programs
> (that we're also developing) can attempt to actually reproduce the
> binaries described. It would do this, by (1) reading the buildinfo
> file (2) recreating _some_ of the environment stored in C, and (3)
> executing the process, and see if it gives R.

You don't need the signature to validate the reproducibility.

> The "_some_" in clause (2) is currently up-for-debate, but the
> important thing is that this can be changed in the future *without
> affecting already-produced buildinfo files*. It may even well be the
> case that in the future we'd want to support different values for
> "_some_" for a given reproduction tool.
> 
> The main point is that, this is not a concern of the producer nor
> distributor of the buildinfo files. I.e.: you guys (the FTP team) only
> have to care about making these signed-claims available to be
> downloaded by users, and it is up to the users to run a tool that
> "interprets" these claims for purposes such as actually attempting
> reproduction of a binary.

To clarify: I am not a member of the FTP team and do not claim to
represent them. I am a DD who was present at the DebConf talk about
reproducible builds, was impressed by how far it's come, and asked how I
could help get what was missing and still required into Debian.

> In this way, we achieve full end-to-end security properties
> (verifiability of build) between the producers (builders) and
> consumers (users). Distributors only need to care about availability,
> they take no part in the security (except for the case where they are
> also a builder, as noted already).

I think I take a less strict view on this, which may be where some of
the disconnect comes from. I care that Debian stands behind its builds.
I'd like the builder claims to be available (and my original mail did
talk about the fact I didn't think I was preventing that, just that it's
not necessarily something that should be on the entire archive network),
but as something that's mirrored everywhere I am absolutely fine with an
attestation by Debian that it received a build appropriately signed by a
DD. Or that it was able to do a build itself within the buildd network
(either for a non-uploaded arch or if we move to source only uploads).

> > This bug has previously talked about a tarball of .buildinfo files,
> > presented as Buildinfos.tgz alongside the Packages file. From looking at
> > the current architecture of dak I do not believe that this is an easy
> > option.
> > 
> > I propose instead a Buildinfo.xz (or gz or whatever) file, which is
> > single text file with containing all of the buildinfo information that
> > corresponds to the Packages list. What is lost by this approach are the
> > OpenPGP signatures that .buildinfo files can have on them. I appreciate
> > this is an important part of the reproducible builds aim, but I believe
> > one of its strengths is the ability for multiple separate package builds
> > to attest that they have used that buildinfo information to build the
> > exact same set of binary artefacts. This is not something that easily
> > scales on the archive network and I think it is better served by a
> > separate service; it would be possible to take the package snippet from
> > the buildinfo file and sign that alone, uploading the signature to the
> > attestation service. For "normal" Debian operation the usual archive
> > signatures would provide a basic level of attestation of chain of build
> > information.
> > 
> 
> I have trouble imagining what could make Buildinfo.tgz hard, but make
> Buildinfo.xz easy - could you explain this in more detail, please?

Debian's archive information is largely stored within a database; things
like the Packages and Contents files are generated each archive run from
this database, rather than incrementally updating a file. It is easy to
generate a Buildinfo.xz file from information contained within the
database (I have some proof-of-concept code locally that does the
beginnings of this), but generating a tar file like you are describing
is either a case of storing each .buildinfo in the database and
generating the tar each run, or adding and deleting files to an existing
tarball. It seems overly intensive and doesn't really seem to scale.
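
To make the contrast concrete, here is a minimal sketch of regenerating a
single compressed text file from a database on every run. It is not dak
code: the table layout, the column names and the Build-Environment stanza
field are assumptions made for illustration, and an in-memory SQLite
connection stands in for the real archive database:

```python
import lzma
import sqlite3


def render_buildinfo_index(conn):
    """Render all rows as Packages-style stanzas and return xz-compressed bytes.

    Assumes a hypothetical table:
    buildinfo(package, version, architecture, environment)
    """
    rows = conn.execute(
        "SELECT package, version, architecture, environment "
        "FROM buildinfo ORDER BY package, version, architecture"
    )
    stanzas = []
    for package, version, arch, env in rows:
        stanzas.append(
            f"Package: {package}\n"
            f"Version: {version}\n"
            f"Architecture: {arch}\n"
            f"Build-Environment: {env}\n"
        )
    # Stanzas are separated by one blank line, as in a Packages file.
    return lzma.compress("\n".join(stanzas).encode("utf-8"))
```

The whole file is rebuilt from the query each time, so nothing has to be
added to or deleted from an existing archive.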

> Regarding the OpenPGP signatures, they are vital - but I also see no
> need to strip them in your model. From the point-of-view of the FTP
> archive, there is no immediate need to read or understand the contents
> of the buildinfo file. [*] It's just a dumb data blob, it shouldn't
> matter to Debian whether it's clearsigned or not.

What I was trying to do with my proposal was turn it from being a dumb
data blob which wasn't easily mapping to the Debian infrastructure, to
something where almost all the information (everything except the actual
signature from the original builder) could be provided alongside the
binaries themselves, enabling people to have what they required to
confirm they could reproduce the builds themselves. *I* think this is
incredibly useful, even if it doesn't achieve everything possible with
reproducible-builds, and I also think that it would provide a sound
basis for another Debian service (perhaps under debian.net to start
with) where multiple builders (starting with the original builder) would
be able to upload their claims, based directly off the buildinfo
information from the archive network. Yes, that's probably an extra
step for the original builder, but it also (to me) seems to be more
flexible and a stronger statement as multiple independent builders can
all confirm things in a single place.

It sounds like this isn't compatible with where reproducible-builds is
heading though, so apologies for the noise.

J.

-- 
"Reality Or Nothing!" -- Cold Lazarus
This .sig brought to you by the letter D and the number  4
Product of the Republic of HuggieTag
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Sun, 21 Aug 2016 16:03:14 GMT) (full text, mbox, link).


Acknowledgement sent to Ximin Luo <infinity0@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Sun, 21 Aug 2016 16:03:14 GMT) (full text, mbox, link).


Message #55 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Ximin Luo <infinity0@debian.org>
To: Reproducible Builds discussion list <reproducible-builds@lists.alioth.debian.org>
Cc: 763822@bugs.debian.org
Subject: Re: [Reproducible-builds] Moving towards buildinfo on the archive network
Date: Sun, 21 Aug 2016 16:01:00 +0000
Jonathan McDowell:
> On Sat, Aug 20, 2016 at 03:13:00PM +0000, Ximin Luo wrote:
>> Note that the builder is a *distinct entity* from the distribution.
>> It's important to keep the *original* signature by B on C. It breaks
>> our security logic, to strip the signature and re-sign C using (e.g.)
>> the Debian archive release keys - because the entity in charge of this
>> release key is not the one that actually performed the build. Doing
>> this, would allow malicious builders to re-attribute their misdeeds to
>> look like it's the fault of Debian.
> 
> Debian already does this in the context of the fact that Package files
> etc are signed by the archive key. It's possible to go and grab the .dsc
> file to see who did the file build, but day-to-day no one is using these
> to verify the binaries they receive. I care more that Debian stands
> behind the packages I download than being able to verify individually
> who build each of the packages I'm running - there's no meaningful way I
> can attribute trust to *all* of the people who packaged something I have
> installed.
> 

You have this backwards.

"Being able to verify individually who build each of the packages I'm running"

is *exactly* what is required to *not* have to 

"attribute trust of *all* of the people who packaged something I have installed."

and that is one major (probably the main) goal of R-B.

Now that I point this out - do you agree, and does it change your mind on anything you previously said?

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git



Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Sun, 21 Aug 2016 18:06:03 GMT) (full text, mbox, link).


Acknowledgement sent to Jonathan McDowell <noodles@earth.li>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Sun, 21 Aug 2016 18:06:03 GMT) (full text, mbox, link).


Message #60 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Jonathan McDowell <noodles@earth.li>
To: reproducible-builds@lists.alioth.debian.org
Cc: 763822@bugs.debian.org
Subject: Re: Bug#763822: [Reproducible-builds] Moving towards buildinfo on the archive network
Date: Sun, 21 Aug 2016 19:03:19 +0100
On Sun, Aug 21, 2016 at 04:01:00PM +0000, Ximin Luo wrote:
> Jonathan McDowell:
> > On Sat, Aug 20, 2016 at 03:13:00PM +0000, Ximin Luo wrote:
> >> Note that the builder is a *distinct entity* from the distribution.
> >> It's important to keep the *original* signature by B on C. It breaks
> >> our security logic, to strip the signature and re-sign C using (e.g.)
> >> the Debian archive release keys - because the entity in charge of this
> >> release key is not the one that actually performed the build. Doing
> >> this, would allow malicious builders to re-attribute their misdeeds to
> >> look like it's the fault of Debian.
> > 
> > Debian already does this in the context of the fact that Package files
> > etc are signed by the archive key. It's possible to go and grab the .dsc
> > file to see who did the file build, but day-to-day no one is using these
> > to verify the binaries they receive. I care more that Debian stands
> > behind the packages I download than being able to verify individually
> > who build each of the packages I'm running - there's no meaningful way I
> > can attribute trust to *all* of the people who packaged something I have
> > installed.
> > 
>
> You have this backwards.
> 
> "Being able to verify individually who build each of the packages I'm
> running"
> 
> is *exactly* what is required to *not* have to 
> 
> "attribute trust of *all* of the people who packaged something I have
> installed."
> 
> and that is one major (probably the main) goal of R-B.
> 
> Now that I point this out - do you agree,

No. What lets me not care about who actually built the packages and have
to attribute trust to them is that I have the build information, which
allows me to verify I get exactly the same output from the provided
source. The signatures over these do not allow me to trust the binaries
I receive in any additional fashion. If I trust the statement "I built
package <x> using source <y> and build information <z>" from an
individual, without doing any verification that this is true, it doesn't
give me much over "I built package <x> using source <y>". I have to do
the build myself to ensure what I have been told is true.
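
As a minimal sketch of the check described above (rebuild, then compare against what the build information claims), assuming the .buildinfo carries a Checksums-Sha256 field with one " <sha256> <size> <filename>" line per artifact. The function names and the in-memory dict of rebuilt files are hypothetical, chosen only for illustration; signature handling is deliberately left out.

```python
import hashlib


def parse_checksums(buildinfo_text):
    """Collect {filename: (sha256, size)} from a Checksums-Sha256 field."""
    sums = {}
    in_field = False
    for line in buildinfo_text.splitlines():
        if line.startswith("Checksums-Sha256:"):
            in_field = True
        elif in_field and line.startswith(" "):
            # Continuation line of the field: "<sha256> <size> <filename>"
            digest, size, name = line.split()
            sums[name] = (digest, int(size))
        else:
            # Any other line ends the field.
            in_field = False
    return sums


def verify_rebuild(buildinfo_text, rebuilt_files):
    """Return the names whose rebuilt bytes do not match the claimed values.

    rebuilt_files is a hypothetical {filename: bytes} mapping holding the
    output of a local rebuild.
    """
    mismatches = []
    for name, (digest, size) in parse_checksums(buildinfo_text).items():
        data = rebuilt_files.get(name)
        if (data is None or len(data) != size
                or hashlib.sha256(data).hexdigest() != digest):
            mismatches.append(name)
    return mismatches
```

An empty result means every listed artifact was reproduced bit for bit; nothing in this check depends on who signed the file.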

Where, to me, signatures become more interesting is when it is possible
for multiple different people to attest they built a set of source using
the same information and got exactly the same output - but only if I
actually trust all the entities who are doing that signing.

> and does it change your mind on anything you previously said?

Fundamentally I still think build information without the signature of
the builder is information that it would be useful to have accompanying
the Debian archive. It seems you do not believe this is worth anything
as it loses the signature which provides a chain back to the origin. I
do not, at present, have a good solution for the extra information and
conditions you want within the context of the Debian archive.

J.

-- 
] http://www.earth.li/~noodles/ [] 101 things you can't have too much  [
]  PGP/GPG Key @ the.earth.li   []        of : 49 - Bandwidth.         [
] via keyserver, web or email.  []                                     [
] RSA: 4096/0x94FA372B2DA8B985  []                                     [



Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Sun, 21 Aug 2016 18:24:03 GMT) (full text, mbox, link).


Acknowledgement sent to Ximin Luo <infinity0@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Sun, 21 Aug 2016 18:24:04 GMT) (full text, mbox, link).


Message #65 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Ximin Luo <infinity0@debian.org>
To: Reproducible Builds discussion list <reproducible-builds@lists.alioth.debian.org>
Cc: 763822@bugs.debian.org
Subject: Re: [Reproducible-builds] Moving towards buildinfo on the archive network
Date: Sun, 21 Aug 2016 18:22:00 +0000
Jonathan McDowell:
> On Sat, Aug 20, 2016 at 03:13:00PM +0000, Ximin Luo wrote:
>> I have trouble imagining what could make Buildinfo.tgz hard, but make
>> Buildinfo.xz easy - could you explain this in more detail, please?
> 
> Debian's archive information is largely stored within a database; things
> like the Packages and Contents files are generated each archive run from
> this database, rather than incrementally updating a file. It is easy to
> generate a Buildinfo.xz file from information contained within the
> database (I have some proof-of-concept code locally that does the
> beginnings of this), but generating a tar file like you are describing
> is either a case of storing each .buildinfo in the database and
> generating the tar each run, or adding and deleting files to an existing
> tarball. It seems overly intensive and doesn't really seem to scale.
> 
>> Regarding the OpenPGP signatures, they are vital - but I also see no
>> need to strip them in your model. From the point-of-view of the FTP
>> archive, there is no immediate need to read or understand the contents
>> of the buildinfo file. [*] It's just a dumb data blob, it shouldn't
>> matter to Debian whether it's clearsigned or not.
> 
> What I was trying to do with my proposal was turn it from being a dumb
> data blob which wasn't easily mapping to the Debian infrastructure, to
> something where almost all the information (everything except the actual
> signature from the original builder) could be provided alongside the
> binaries themselves, enabling people to have what they required to
> confirm they could reproduce the builds themselves. *I* think this is
> incredibly useful, even if it doesn't achieve everything possible with
> reproducible-builds, and I also think that it would provide a sound
> basis for another Debian service (perhaps under debian.net to start
> with) where multiple builders (starting with the original builder) would
> be able to upload their claims, based directly off the buildinfo
> information from the archive network. Yes, that's probably an extra
> step for the original builder, but it also (to me) seems to be more
> flexible and a stronger statement as multiple independent builders can
> all confirm things in a single place.
> 
> It sounds like this isn't compatible with where reproducible-builds is
> heading though, so apologies for the noise.
> 

I don't mean to suggest a database is not useful. I thought I was talking to
ftp-masters through you, so I wanted to be very clear about the security
properties we're aiming for, and get common understanding about that first.

But I'm not sure why you say it's incompatible - could you not also store the
detached signatures within the database, and generate the original file
(including signature) from this and the other information? The signatures are
much smaller than the rest of the file.

In fact, we do indeed have longer-term plans for Debian infrastructure to look
into this data and not turn it into a data blob - for example, buildds
themselves could try to reproduce a given buildinfo uploaded by a DD, and send
alerts about packages that can't be reproduced. (I hinted at this by the "more
advanced" behaviours I mentioned in my previous email.) But I wanted to start
off with a simple yet strongly-secure model first.

What I described is not supposed to contradict the ability for users to
"confirm they could reproduce the builds themselves". As I mentioned, a
majority use-case is to allow others to download "all the buildinfo files for a
given binary package", then they check this locally.

Perhaps the confusion is in the suggestion of a single Buildinfo.tgz. Let me
disclaim this for now - I wasn't present for the discussions around why all of
this information needs to be in one file, it actually does *not* make sense to
me. An obvious alternative is to cat all the buildinfo files for a given source
package, into one $source-$version.buildinfos.gz file and store this in pool/.
This would also make it easy to lookup buildinfo files for a given binary
later. Could someone tell me why this approach isn't suitable?
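
A rough sketch of the "cat everything into one .gz" idea, assuming each buildinfo is written as its own gzip member so that the combined file is still a valid gzip stream. The function names are hypothetical, and the Binary: lookup only handles a single-line field; real files may be clearsigned and use folded fields.

```python
import gzip
import zlib


def pack_buildinfos(texts):
    """Concatenate buildinfo texts, one gzip member per file."""
    return b"".join(gzip.compress(t.encode("utf-8")) for t in texts)


def unpack_buildinfos(blob):
    """Split the blob back into the original texts, member by member."""
    texts = []
    while blob:
        member = zlib.decompressobj(wbits=31)  # 31: expect gzip framing
        data = member.decompress(blob) + member.flush()
        texts.append(data.decode("utf-8"))
        blob = member.unused_data  # bytes following this member
    return texts


def buildinfos_for_binary(blob, binary):
    """Return every buildinfo whose Binary: field names the package."""
    hits = []
    for text in unpack_buildinfos(blob):
        for line in text.splitlines():
            if line.startswith("Binary:") and binary in line.split()[1:]:
                hits.append(text)
                break
    return hits
```

Because member boundaries are preserved, any signature inside an individual buildinfo survives unchanged, and adding another builder's file is an append rather than a rewrite.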

Now going back to "users confirming rebuilds":

The reason why I started off with this high-security dumb-data-blob approach is
to make the security arguments and reasoning very simple and obvious, so it's
harder to accidentally weaken or subvert it in the future. Debian isn't even
involved in the security logic - it's purely the end-user verifier program.

Another benefit of signatures, is that it gives you more information, in the
cases where you might not want to build it yourself (e.g. very large programs).
If you strip this information, then only Debian is "attesting" to a particular
hash (which it didn't even build). If you keep this information, then you can
aggregate multiple people's attempts to build a given binary.

Eventually we could have buildinfo-only uploads, just like we have binary-only
or source-only uploads. Then for important binaries like gcc, perhaps 20 people
will want to upload their .buildinfo files to Debian with their signatures
attached, to make us all feel better about that.

Note also in general that you don't actually *want* all of the buildinfo fields
to be the same for everyone. Only the output *has* to be the same, and it is
actually a stronger security property if we get two buildinfo files that started
off with *different* inputs (such as buildpath/time/etc) and got the *same*
binary hashes out.
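
To make the aggregation idea concrete, here is a small sketch that groups claims by output hash and records who signed them and from which build path. The record layout (signer, sha256, build_path) is a hypothetical simplification of what a verified buildinfo would provide, and signature verification is assumed to have happened already.

```python
from collections import defaultdict


def aggregate_attestations(attestations):
    """Group already-verified claims by the binary hash they attest to."""
    by_hash = defaultdict(lambda: {"signers": set(), "build_paths": set()})
    for att in attestations:
        entry = by_hash[att["sha256"]]
        entry["signers"].add(att["signer"])
        entry["build_paths"].add(att["build_path"])
    return dict(by_hash)


# Hypothetical claims from three builders for the same binary package.
claims = [
    {"signer": "builder-a", "sha256": "aaaa", "build_path": "/build/one"},
    {"signer": "builder-b", "sha256": "aaaa", "build_path": "/tmp/two"},
    {"signer": "builder-c", "sha256": "bbbb", "build_path": "/build/one"},
]
summary = aggregate_attestations(claims)
```

In this toy input the hash "aaaa" has two distinct signers who used two different build paths, which is the stronger case described above; "bbbb" has a single signer and would warrant a closer look.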

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git



Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Sun, 21 Aug 2016 18:33:09 GMT) (full text, mbox, link).


Acknowledgement sent to Ximin Luo <infinity0@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Sun, 21 Aug 2016 18:33:09 GMT) (full text, mbox, link).


Message #70 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Ximin Luo <infinity0@debian.org>
To: Reproducible Builds discussion list <reproducible-builds@lists.alioth.debian.org>
Cc: 763822@bugs.debian.org
Subject: Re: [Reproducible-builds] Bug#763822: Moving towards buildinfo on the archive network
Date: Sun, 21 Aug 2016 18:31:00 +0000
Jonathan McDowell:
> On Sun, Aug 21, 2016 at 04:01:00PM +0000, Ximin Luo wrote:
>> You have this backwards.
>>
>> "Being able to verify individually who build each of the packages I'm
>> running"
>>
>> is *exactly* what is required to *not* have to 
>>
>> "attribute trust of *all* of the people who packaged something I have
>> installed."
>>
>> and that is one major (probably the main) goal of R-B.
>>
>> Now that I point this out - do you agree,
> 
> No. What lets me not care about who actually built the packages and have
> to attribute trust to them is that I have the build information, which
> allows me to verify I get exactly the same output from the provided
> source. [..]
> 

OK, I explained things badly. For you to actually strictly verify everything,
yes you would have to build everything yourself. But then you should just run a
source-based distribution and forget about other people's binaries completely.

We do also want to provide quite strong security properties, even for people
that don't want to build every single binary for themselves. That is one very
key point of R-B. If we assume everyone will check reproducibility of their own
binaries, this renders the whole exercise of R-B pointless.

Thanks for pointing this out, so that I explain it better. I'm completely at
fault here.

Signatures provide a way to for us to aggregate public trust on binaries that
don't build themselves. So it's important to have multiple and *very direct*
meanings of what-is-being-signed, to avoid a transitive-trust situation.

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git



Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Sun, 21 Aug 2016 18:36:03 GMT) (full text, mbox, link).


Acknowledgement sent to Ximin Luo <infinity0@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Sun, 21 Aug 2016 18:36:03 GMT) (full text, mbox, link).


Message #75 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Ximin Luo <infinity0@debian.org>
To: Reproducible Builds discussion list <reproducible-builds@lists.alioth.debian.org>
Cc: 763822@bugs.debian.org
Subject: Re: [Reproducible-builds] Bug#763822: Moving towards buildinfo on the archive network
Date: Sun, 21 Aug 2016 18:33:00 +0000
Ximin Luo:
> Signatures provide a way to for us to aggregate public trust on binaries that
> don't build themselves. So it's important to have multiple and *very direct*
> meanings of what-is-being-signed, to avoid a transitive-trust situation.
> 

I sent this in a rush; better version:

Signatures provide a way for us to aggregate public trust on binaries that
people don't build themselves. So it's important to have multiple and *very
direct* meanings of what-is-being-signed, strongly associated to the signer,
to avoid a transitive-trust situation.

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git



Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Thu, 10 Nov 2016 19:18:02 GMT) (full text, mbox, link).


Acknowledgement sent to Holger Levsen <holger@layer-acht.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Thu, 10 Nov 2016 19:18:02 GMT) (full text, mbox, link).


Message #80 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Holger Levsen <holger@layer-acht.org>
To: 763822@bugs.debian.org
Cc: reproducible-builds@lists.alioth.debian.org
Subject: FWD: Clarification regarding FTP resource constraints for buildinfo files
Date: Thu, 10 Nov 2016 19:13:53 +0000
[Message part 1 (text/plain, inline)]
Hi,

actually forwarding this to the bug.

And adding a small note that since August we now have
buildinfo.debian.net, so maybe for a start it would be sufficient if dak
would submit these .buildinfo files via curl/https to buildinfo.d.n!?!

----- Forwarded message from Ximin Luo <infinity0@debian.org> -----

Date: Wed, 24 Aug 2016 13:16:00 +0000
From: Ximin Luo <infinity0@debian.org>
To: ftpmaster@debian.org
Cc: Reproducible Builds discussion list <reproducible-builds@lists.alioth.debian.org>
Subject: [Reproducible-builds] Clarification regarding FTP resource constraints for buildinfo files
Reply-To: Reproducible Builds discussion list <reproducible-builds@lists.alioth.debian.org>

Hi, I'm emailing to follow-up regarding #763822. I know we have not yet come up
with a concrete proposal on that, and that is largely because we were waiting
for comments regarding the resource constraints of ftp-master and mirrors.

There is broad understanding across the R-B team that you'd prefer a design
that does not involve "lots of small files", but there's a lot of breadth in
this statement, and none of us know the precise details involved. Originally
Lunar proposed a design with 1 large file, but there are issues with this as
well, such as the performance of updates.

Here are our current main requirements as stated by dkg in message #10, and I
confirm they're still accurate as of today:

1. We want an archive user to be able to find and fetch all .buildinfo files that produced a given binary package
2. We want the eventual possibility of multiple .buildinfo files per <srcpkg,version,arch>
3. We understand that mirror operators don't like small files because rsync gets fussy with them.
4. We want both buildds and debian developers to be able to upload .buildinfo files.

(4) by itself is easy; people have already written code to allow dak to accept
such files and discard them.

So we need to figure out how to reconcile (1,2,3). For this, it would be good
if you could tell me in more detail what the restriction (3) consists of.

We would never be uploading 10,000k buildinfo files at once, but Mattia tells
me that 1k might happen during medium binNMU transitions, growing up to 4k for
large transitions (but this would be over several days, i.e. split across
multiple runs of dinstall). Each buildinfo file is about 5.4k (median), with
7.7k as the 75% percentile, though the largest is 148k. [1]

There is also the distinction between uploading vs mirroring. Just because we
might upload 1k files over a short time, does not mean that we have to transfer
these to mirrors as 1k files. We could tar some of them up and compress them.

So could you clarify some details regarding upload resource limits, as well as
mirroring resource limits?

For example, is one extra file per source-package OK or "too much"? Or one
extra file per binary upload? How about one extra file-update, of the same
file, per binary upload? (I assume that rsync means we are free to update any
files that we store in pool/, if we need to?)

More clarifications to the above, regarding what we *don't* need:

N1. It's not essential to store 1 uploaded-buildinfo-file per file-in-the-archive, as long as we can still do (1).
N2. We don't care particularly about being able to get *a specific buildinfo-file*, as long as we can still do (1).
N3. It's OK to over-satisfy (1) with extra irrelevant data, then the user can just filter this out locally.

We have more ideas, but I think it's best to keep this email short for now.
Also I don't know what is feasible until I hear more details about the
constraints, and it would be pointless to skip further ahead to potentially
unfeasible things.

X

[1] (use wget, too big for browser) https://tests.reproducible-builds.org/debian/buildinfo/unstable/amd64/?C=S;O=A

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git

_______________________________________________
Reproducible-builds mailing list
Reproducible-builds@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/reproducible-builds

----- End forwarded message -----


-- 
cheers,
	Holger
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Sat, 02 Sep 2017 21:51:02 GMT) (full text, mbox, link).


Acknowledgement sent to Holger Levsen <holger@layer-acht.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Sat, 02 Sep 2017 21:51:02 GMT) (full text, mbox, link).


Message #85 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Holger Levsen <holger@layer-acht.org>
To: debian-devel@lists.debian.org, debian-admin@lists.debian.org, reproducible-builds@lists.alioth.debian.org, 763822@bugs.debian.org
Subject: distributing .buildinfo files (Re: Bad interaction between pbuilder/debhelper/dpkg-buildinfo/dpkg-genchanges and dak on security-master)
Date: Sat, 2 Sep 2017 21:48:41 +0000
[Message part 1 (text/plain, inline)]
On Mon, Jul 03, 2017 at 07:23:29PM +0200, Philipp Kern wrote:
> > Not yet.  We people from the reproducible team couldn't find a way to
> > usefully talk to ftp-masters people, whom never replied to any of the
> > questions in the thread at #763822 (they only did some quick comments on
> > IRC, and we have been left on guessing what they would like…).
> > 
> > Anyhow, .buildinfo files are stored in ftp-master, just not exported to
> > the mirrors, you can find them in
> > coccia.debian.org:/srv/ftp-master.debian.org/<something>.
> 
> So I suppose we talk about 13 GB[1] of static content in about 1.7M
> files. Is that something that could be distributed through
> static.debian.org if there are concerns around inodes for the main
> mirrors? Given that they would be accessed mostly rarely[2]?
> 
> [1] 7.7kB (75%ile as mentioned in the referenced bug) * 55000 binary
> packages * 10 architectures * 3 versions - so quite conservatively
> [2] So supposedly a CDN wouldn't bring a lot of benefit as individual
> files aren't likely to be hit frequently.
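
The quoted estimate can be re-derived with a few lines; this assumes "7.7kB" means 7700 bytes (reading it as 7.7 KiB gives a slightly larger total), and the function name is only illustrative.

```python
def estimate_buildinfo_storage(size_bytes=7700, binary_packages=55000,
                               architectures=10, versions=3):
    """Return (number of files, total bytes) for the quoted assumptions."""
    files = binary_packages * architectures * versions
    return files, files * size_bytes


files, total_bytes = estimate_buildinfo_storage()
print(f"{files / 1e6:.2f}M files, {total_bytes / 1e9:.1f} GB")
```

That gives 1.65 million files and about 12.7 GB, in line with the rounded "13 GB in about 1.7M files" above.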

using static.debian.org seems to be a good idea to me, what would be needed to make
this happen?

or, we could put them in a git repo instead, and use git.debian.org…

feedback welcome.


-- 
cheers,
	Holger
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Sun, 03 Sep 2017 00:51:03 GMT) (full text, mbox, link).


Acknowledgement sent to Paul Wise <pabs@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Sun, 03 Sep 2017 00:51:03 GMT) (full text, mbox, link).


Message #90 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Paul Wise <pabs@debian.org>
To: debian-devel@lists.debian.org, debian-admin@lists.debian.org, reproducible-builds@lists.alioth.debian.org, 763822@bugs.debian.org
Subject: Re: distributing .buildinfo files (Re: Bad interaction between pbuilder/debhelper/dpkg-buildinfo/dpkg-genchanges and dak on security-master)
Date: Sun, 03 Sep 2017 08:46:25 +0800
[Message part 1 (text/plain, inline)]
On Sat, 2017-09-02 at 21:48 +0000, Holger Levsen wrote:

> > So I suppose we talk about 13 GB[1] of static content in about 1.7M
> > files. Is that something that could be distributed through
> > static.debian.org if there are concerns around inodes for the main
> > mirrors? Given that they would be accessed mostly rarely[2]?
> > 
> > [1] 7.7kB (75%ile as mentioned in the referenced bug) * 55000 binary
> > packages * 10 architectures * 3 versions - so quite conservatively

I had a quick look at the (currently) 4 systems behind static.d.o and
it looks like they can all take the extra space and inodes. senfter
only has 48GB space left but we can expand the storage there.
mirror-csail only has 64M inodes available, but should be fine.

One concern might be the rsync time for 1.7M inodes, I'm not sure if
our static setup does sites in parallel.

There might be other factors here that I'm not aware of, hopefully
other DSA folks can fill them in.

Are these files going to only be available for the versions of packages
that exist in the archive right now, or is it going to be a historical
archive of all Debian build information forever?
What kind of growth per year are we expecting?

> using static.debian.org seems to be a good idea to me, what would be needed to make
> this happen?

Some patches to files in dsa-puppet to define the service:

modules/roles/manifests/static_mirror.pp
modules/roles/misc/static-components.yaml
modules/roles/templates/static-mirroring/vhost/static-vhosts-simple.erb
modules/sudo/files/sudoers

https://anonscm.debian.org/cgit/mirror/dsa-puppet.git/

> or, we could put them in a git repo instead, and use git.debian.org…

It strikes me as quite a lot of data for one git repo :)

-- 
bye,
pabs

https://wiki.debian.org/PaulWise
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Sun, 03 Sep 2017 10:06:02 GMT) (full text, mbox, link).


Acknowledgement sent to Philipp Kern <pkern@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Sun, 03 Sep 2017 10:06:02 GMT) (full text, mbox, link).


Message #95 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Philipp Kern <pkern@debian.org>
To: Holger Levsen <holger@layer-acht.org>
Cc: debian-devel@lists.debian.org, debian-admin@lists.debian.org, reproducible-builds@lists.alioth.debian.org, 763822@bugs.debian.org
Subject: Re: distributing .buildinfo files (Re: Bad interaction between pbuilder/debhelper/dpkg-buildinfo/dpkg-genchanges and dak on security-master)
Date: Sun, 03 Sep 2017 11:40:53 +0200
On 2017-09-02 23:48, Holger Levsen wrote:
> On Mon, Jul 03, 2017 at 07:23:29PM +0200, Philipp Kern wrote:
>> > Not yet.  We people from the reproducible team couldn't find a way to
>> > usefully talk to ftp-masters people, whom never replied to any of the
>> > questions in the thread at #763822 (they only did some quick comments on
>> > IRC, and we have been left on guessing what they would like…).
>> >
>> > Anyhow, .buildinfo files are stored in ftp-master, just not exported to
>> > the mirrors, you can find them in
>> > coccia.debian.org:/srv/ftp-master.debian.org/<something>.
>> 
>> So I suppose we talk about 13 GB[1] of static content in about 1.7M
>> files. Is that something that could be distributed through
>> static.debian.org if there are concerns around inodes for the main
>> mirrors? Given that they would be accessed mostly rarely[2]?
>> 
>> [1] 7.7kB (75%ile as mentioned in the referenced bug) * 55000 binary
>> packages * 10 architectures * 3 versions - so quite conservatively
>> [2] So supposedly a CDN wouldn't bring a lot of benefit as individual
>> files aren't likely to be hit frequently.
> 
> using static.debian.org seems to be a good idea to me, what would be
> needed to make this happen?
> 
> or, we could put them in a git repo instead, and use git.debian.org…

Git is an interesting thought for incremental mirroring. But then it 
also seems to be a poor choice for something that is an only growing 
repository of data.

What I think should be a requirement is that the data is pushed out 
before the mirror pulse. Otherwise you end up with a race where you try 
to mirror the data including the buildinfo but can't access it. (It's a 
little unfortunate that we don't simply put them onto the mirrors.)

Kind regards
Philipp Kern



Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Sun, 03 Sep 2017 11:45:03 GMT) (full text, mbox, link).


Acknowledgement sent to Holger Levsen <holger@layer-acht.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Sun, 03 Sep 2017 11:45:03 GMT) (full text, mbox, link).


Message #100 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Holger Levsen <holger@layer-acht.org>
To: Philipp Kern <pkern@debian.org>
Cc: debian-devel@lists.debian.org, debian-admin@lists.debian.org, reproducible-builds@lists.alioth.debian.org, 763822@bugs.debian.org
Subject: Re: distributing .buildinfo files (Re: Bad interaction between pbuilder/debhelper/dpkg-buildinfo/dpkg-genchanges and dak on security-master)
Date: Sun, 3 Sep 2017 11:43:50 +0000
[Message part 1 (text/plain, inline)]
On Sun, Sep 03, 2017 at 11:40:53AM +0200, Philipp Kern wrote:
> Git is an interesting thought for incremental mirroring. But then it also
> seems to be a poor choice for something that is an only growing repository
> of data.

the nice thing with git is that you get a signed tree for free (or rather, very
easily with tools almost everybody understands), even though it atm only uses
sha1 hashes. IOW: it's a very simple blockchain, which has better properties
than a simple file based mirror.
 
> What I think should be a requirement is that the data is pushed out before
> the mirror pulse. Otherwise you end up with a race where you try to mirror
> the data including the buildinfo but can't access it. (It's a little
> unfortunate that we don't simply put them onto the mirrors.)

agreed.


-- 
cheers,
	Holger
[signature.asc (application/pgp-signature, inline)]

Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Thu, 05 Apr 2018 08:45:06 GMT) (full text, mbox, link).


Acknowledgement sent to Philipp Kern <pkern@debian.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Thu, 05 Apr 2018 08:45:06 GMT) (full text, mbox, link).


Message #105 received at 763822@bugs.debian.org (full text, mbox, reply):

From: Philipp Kern <pkern@debian.org>
To: Holger Levsen <holger@layer-acht.org>
Cc: debian-devel@lists.debian.org, debian-admin@lists.debian.org, reproducible-builds@lists.alioth.debian.org, 763822@bugs.debian.org, ftpmaster@debian.org
Subject: Re: distributing .buildinfo files (Re: Bad interaction between pbuilder/debhelper/dpkg-buildinfo/dpkg-genchanges and dak on security-master)
Date: Thu, 5 Apr 2018 10:43:04 +0200
[Message part 1 (text/plain, inline)]
On 9/3/17 11:40 AM, Philipp Kern wrote:
> On 2017-09-02 23:48, Holger Levsen wrote:
>> On Mon, Jul 03, 2017 at 07:23:29PM +0200, Philipp Kern wrote:
>>> > Not yet.  We people from the reproducible team couldn't find a way to
>>> > usefully talk to ftp-masters people, whom never replied to any of the
>>> > questions in the thread at #763822 (they only did some quick comments on
>>> > IRC, and we have been left on guessing what they would like…).
>>> >
>>> > Anyhow, .buildinfo files are stored in ftp-master, just not exported to
>>> > the mirrors, you can find them in
>>> > coccia.debian.org:/srv/ftp-master.debian.org/<something>.
>>>
>>> So I suppose we are talking about 13 GB[1] of static content in about
>>> 1.7M files. Is that something that could be distributed through
>>> static.debian.org if there are concerns around inodes for the main
>>> mirrors, given that they would be accessed only rarely[2]?
>>>
>>> [1] 7.7kB (75%ile as mentioned in the referenced bug) * 55000 binary
>>> packages * 10 architectures * 3 versions - so quite conservatively
>>> [2] So supposedly a CDN wouldn't bring a lot of benefit as individual
>>> files aren't likely to be hit frequently.
>>
>> using static.debian.org seems to be a good idea to me, what would be
>> needed to make this happen?
>>
>> or, we could put them in a git repo instead, and use git.debian.org…
> 
> Git is an interesting thought for incremental mirroring. But then it
> also seems to be a poor choice for something that is an ever-growing
> repository of data.
> 
> What I think should be a requirement is that the data is pushed out
> before the mirror pulse. Otherwise you end up with a race where you try
> to mirror the data including the buildinfo but can't access it. (It's a
> little unfortunate that we don't simply put them onto the mirrors.)
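
The size estimate in the quoted footnote [1] above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the inputs are the figures given in the thread, and treating 1 kB as 1000 bytes is an assumption.

```python
# Back-of-the-envelope check of the estimate in quoted footnote [1].
# Inputs are the figures from the thread; decimal units (1 kB = 1000 bytes)
# are an assumption made for this sketch.
BYTES_PER_BUILDINFO = 7_700   # 7.7 kB, the 75th-percentile file size
BINARY_PACKAGES = 55_000
ARCHITECTURES = 10
VERSIONS = 3


def estimate():
    """Return (number of files, total size in bytes)."""
    files = BINARY_PACKAGES * ARCHITECTURES * VERSIONS
    total_bytes = files * BYTES_PER_BUILDINFO
    return files, total_bytes


if __name__ == "__main__":
    files, total_bytes = estimate()
    print(f"{files / 1e6:.2f}M files, {total_bytes / 1e9:.1f} GB")
```

This gives 1.65M files and roughly 12.7 GB, which matches the rounded "13 GB" and "1.7M files" in the quote.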

So what would be needed to make at least a simple export of the data
happen? I think the requirements I'd have are these:

* Data is sufficiently fresh and optimally accessible before the mirror
pulse happens so that you can always fetch the corresponding buildinfo
for a newly pushed package.
* Some way of actually deducing the path to the buildinfo file, either
through some sort of redirector or by naming the files in a consistent
fashion.

Right now the second point does not work with the date-based farm that
is used to archive the buildinfo files. It would work if we were to just
apply the same splitting as in the regular pool. For the former just
pushing the content through static.d.o should work and dak could push
the content before pushing the mirrors?
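
A minimal sketch of what "the same splitting as in the regular pool" could look like, so that a client can deduce the path from the package data alone. The root directory name `buildinfo-pool`, the omission of an archive component, and the `<source>_<version>_<arch>.buildinfo` file name with the epoch dropped are assumptions made for illustration; none of them is a layout agreed in this bug.

```python
def pool_prefix(source: str) -> str:
    """Pool-style subdirectory: 'libf' for libfoo, 'h' for hello."""
    if source.startswith("lib"):
        return source[:4]
    return source[:1]


def buildinfo_path(source: str, version: str, arch: str,
                   root: str = "buildinfo-pool") -> str:
    """Build a deterministic path for a .buildinfo file.

    'root' is a hypothetical directory name. The epoch ("1:" in
    "1:2.10-1") is dropped, as it is in other Debian file names;
    that this also holds for .buildinfo files is assumed here.
    """
    version_no_epoch = version.split(":", 1)[-1]
    filename = f"{source}_{version_no_epoch}_{arch}.buildinfo"
    return f"{root}/{pool_prefix(source)}/{source}/{filename}"
```

With a layout like this, the path depends only on the source name, version and architecture, so no redirector or upload date is needed to find the file.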

Intuitively I would not care about cryptographic authentication of the
data. After all it can be verified by rebuilding if the package is
reproducible.

Kind regards and thanks
Philipp Kern


Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Sat, 21 Apr 2018 19:21:09 GMT)


Acknowledgement sent to Holger Levsen <holger@layer-acht.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Sat, 21 Apr 2018 19:21:09 GMT)


Message #110 received at 763822@bugs.debian.org:

From: Holger Levsen <holger@layer-acht.org>
To: Philipp Kern <pkern@debian.org>
Cc: debian-devel@lists.debian.org, debian-admin@lists.debian.org, reproducible-builds@lists.alioth.debian.org, 763822@bugs.debian.org, ftpmaster@debian.org
Subject: Re: distributing .buildinfo files (Re: Bad interaction between pbuilder/debhelper/dpkg-buildinfo/dpkg-genchanges and dak on security-master)
Date: Sat, 21 Apr 2018 19:16:35 +0000
[Message part 1 (text/plain, inline)]
On Thu, Apr 05, 2018 at 10:43:04AM +0200, Philipp Kern wrote:
> So what would be needed to make at least a simple export of the data
> happen? I think the requirements I'd have are these:

that's a good question! :)

maybe we can sit together with some ftp-team and reproducible builds
folks in Hamburg and finalize the design and implement it? 

> * Data is sufficiently fresh and optimally accessible before the mirror
> pulse happens so that you can always fetch the corresponding buildinfo
> for a newly pushed package.
> * Some way of actually deducing the path to the buildinfo file, either
> through some sort of redirector or by naming the files in a consistent
> fashion.
> 
> Right now the second point does not work with the date-based farm that
> is used to archive the buildinfo files. It would work if we were to just
> apply the same splitting as in the regular pool. For the former just
> pushing the content through static.d.o should work and dak could push
> the content before pushing the mirrors?
> 
> Intuitively I would not care about cryptographic authentication of the
> data. After all it can be verified by rebuilding if the package is
> reproducible.

agreed with all of these points, thanks!


-- 
cheers,
	Holger

Information forwarded to debian-bugs-dist@lists.debian.org, Debian FTP Master <ftpmaster@ftp-master.debian.org>:
Bug#763822; Package ftp.debian.org. (Thu, 26 Sep 2019 11:57:02 GMT)


Acknowledgement sent to Holger Levsen <holger@layer-acht.org>:
Extra info received and forwarded to list. Copy sent to Debian FTP Master <ftpmaster@ftp-master.debian.org>. (Thu, 26 Sep 2019 11:57:02 GMT)


Message #115 received at 763822@bugs.debian.org:

From: Holger Levsen <holger@layer-acht.org>
To: 763822@bugs.debian.org
Cc: Reproducible Builds discussion list <reproducible-builds@lists.alioth.debian.org>
Subject: #763822: ftp.debian.org: please include .buildinfo files in the archive
Date: Thu, 26 Sep 2019 11:46:27 +0000
[Message part 1 (text/plain, inline)]
Dear ftp team,

it would be very cool if you could comment on this bug, even though
there is https://buildinfos.debian.net/ftp-master.debian.org/ and
https://buildinfos.debian.net/buildinfo-pool/ now.

Distributing .buildinfo files is one requirement for making reproducible
builds a reality for our users, and while buildinfos.debian.net is a
working proof of concept, having .buildinfo files in the archive and on
the mirrors is IMHO how we should distribute them 'for real'.

What do you think? I'd really appreciate some comments on how to move
forward here.


-- 
cheers,
	Holger

-------------------------------------------------------------------------------
               holger@(debian|reproducible-builds|layer-acht).org
       PGP fingerprint: B8BF 5413 7B09 D35C F026 FE9D 091A B856 069A AA1C


Severity set to 'normal' from 'wishlist'. Request was from Luca Falavigna <dktrkranz@debian.org> to control@bugs.debian.org. (Sun, 11 Sep 2022 12:45:05 GMT)




Debian bug tracking system administrator <owner@bugs.debian.org>. Last modified: Fri Jan 31 00:52:31 2025; Machine Name: bembo
