Acknowledgement sent
to Holger Levsen <holger@layer-acht.org>:
New Bug report received and forwarded. Copy sent to reproducible-builds@lists.alioth.debian.org, Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>.
(Tue, 19 Sep 2023 16:12:03 GMT) (full text, mbox, link).
package: diffoscope
version: 240
severity: important
x-debbugs-cc: reproducible-builds@lists.alioth.debian.org
hi,
On Tue, Sep 05, 2023 at 10:05:58PM +0200, FC Stegerman wrote:
> It worked (and was probably needed) before as the "--" was interpreted
> by schroot, not diffoscope. So the solution should be to remove the
> "--" and just use:
> diffoscope --version
thanks, Mattia implemented this in the meantime.
and then I made some commits so we can see on which packages diffoscope
causes the machine to go into absurd loads (and sometimes makes diffoscope
crash), and the results have been interesting: out of 87
cases where diffoscope didn't finish, this happened on:
47 trixie_i386
21 unstable_i386
5 unstable_armhf
4 trixie_arm64
4 trixie_amd64
3 unstable_arm64
2 trixie_armhf
1 experimental_amd64
so I guess my short-term measure will be to disable i386 testing...
(which I consider just wildly poking around.)
Also, this is caused mostly when running diffoscope on 3-4 distinct packages
(so far):
21 ocaml-obuild_trixie_i386
8 pgocaml_trixie_i386
8 omake_unstable_i386
8 ocaml-dune_unstable_i386
8 ocaml-dune_trixie_i386
6 ben_trixie_i386
2 wpewebkit_trixie_arm64
2 hevea_trixie_i386
2 astropy_unstable_armhf
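For illustration, a minimal Python sketch of how tallies like the two above could be derived from stamp file names. It assumes the names follow the pattern diffoscope_stamp_<package>_<suite>_<arch>_<epoch>, as in the listing further down in this report; the function names are hypothetical and this is not the script actually used on jenkins.

```python
from collections import Counter

PREFIX = "diffoscope_stamp_"


def parse_stamp(name):
    """Split a stamp file name into (package, suite, arch, epoch).

    Returns None if the name does not match the assumed pattern.
    Debian package names cannot contain underscores, so splitting
    from the right on "_" is safe.
    """
    if not name.startswith(PREFIX):
        return None
    parts = name[len(PREFIX):].rsplit("_", 3)
    if len(parts) != 4 or not parts[3].isdigit():
        return None
    package, suite, arch, epoch = parts
    return package, suite, arch, int(epoch)


def tally(names):
    """Count stamps per suite_arch and per package_suite_arch."""
    by_target = Counter()
    by_package = Counter()
    for name in names:
        parsed = parse_stamp(name)
        if parsed is None:
            continue  # ignore files that are not stamps
        package, suite, arch, _epoch = parsed
        by_target[f"{suite}_{arch}"] += 1
        by_package[f"{package}_{suite}_{arch}"] += 1
    return by_target, by_package


if __name__ == "__main__":
    # File names taken from this report.
    sample = [
        "diffoscope_stamp_ocaml-obuild_trixie_i386_1695029345",
        "diffoscope_stamp_ocaml-obuild_trixie_i386_1695029961",
        "diffoscope_stamp_gmsh_trixie_arm64_1695232788",
        "diffoscope_stamp_telegram-desktop_unstable_amd64_1695232787",
    ]
    targets, packages = tally(sample)
    for key, count in targets.most_common():
        print(count, key)
    for key, count in packages.most_common():
        print(count, key)
```

In practice the list of names would come from something like os.listdir() on the log directory.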
Please note that this is diffoscope running on amd64, testing those packages
built for any arch. (using diffoscope from bookworm currently, but I don't see
any major changes between diffoscope from bookworm and sid.)
Looking at https://tests.reproducible-builds.org/debian/history/ocaml-dune.html
it shows usual build times of very few minutes, however the last test for trixie/i386 took
over 2h. (?!?)
ocaml-obuild is the package which was tried most in the last days. What this
clearly shows is that the package is built twice, then diffoscope apparently
crashes and the package is tested again and again and again:
root@jenkins:/var/log/reproducible-builds# ls *ocaml-obuild* -lart
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 09:29 diffoscope_stamp_ocaml-obuild_trixie_i386_1695029345
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 09:39 diffoscope_stamp_ocaml-obuild_trixie_i386_1695029961
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 09:47 diffoscope_stamp_ocaml-obuild_trixie_i386_1695030452
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 09:55 diffoscope_stamp_ocaml-obuild_trixie_i386_1695030952
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 10:07 diffoscope_stamp_ocaml-obuild_trixie_i386_1695031674
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 11:20 diffoscope_stamp_ocaml-obuild_trixie_i386_1695036023
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 11:31 diffoscope_stamp_ocaml-obuild_trixie_i386_1695036663
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 11:41 diffoscope_stamp_ocaml-obuild_trixie_i386_1695037307
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 11:50 diffoscope_stamp_ocaml-obuild_trixie_i386_1695037840
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 12:04 diffoscope_stamp_ocaml-obuild_trixie_i386_1695038698
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 14:40 diffoscope_stamp_ocaml-obuild_trixie_i386_1695048004
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 15:05 diffoscope_stamp_ocaml-obuild_trixie_i386_1695049528
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 15:19 diffoscope_stamp_ocaml-obuild_trixie_i386_1695050391
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 15:35 diffoscope_stamp_ocaml-obuild_trixie_i386_1695051344
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 15:51 diffoscope_stamp_ocaml-obuild_trixie_i386_1695052268
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 16:01 diffoscope_stamp_ocaml-obuild_trixie_i386_1695052915
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 16:11 diffoscope_stamp_ocaml-obuild_trixie_i386_1695053485
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 16:28 diffoscope_stamp_ocaml-obuild_trixie_i386_1695054506
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 16:45 diffoscope_stamp_ocaml-obuild_trixie_i386_1695055539
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 17:04 diffoscope_stamp_ocaml-obuild_trixie_i386_1695056683
-rw-r--r-- 1 jenkins jenkins 0 Sep 18 17:32 diffoscope_stamp_ocaml-obuild_trixie_i386_1695058357
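The trailing number in each stamp name looks like a Unix epoch timestamp. Assuming that is what it is, the following Python sketch converts it to UTC and computes the time between consecutive retries; the helper names are hypothetical.

```python
from datetime import datetime, timezone


def stamp_epoch(name):
    """Return the trailing number of a stamp file name as an int."""
    return int(name.rsplit("_", 1)[1])


def as_utc(epoch):
    """Format an epoch timestamp as a UTC date and time."""
    moment = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def retry_gaps(names):
    """Seconds between consecutive stamps, oldest first."""
    epochs = sorted(stamp_epoch(n) for n in names)
    return [later - earlier for earlier, later in zip(epochs, epochs[1:])]


if __name__ == "__main__":
    # The first three stamps from the listing above.
    stamps = [
        "diffoscope_stamp_ocaml-obuild_trixie_i386_1695029345",
        "diffoscope_stamp_ocaml-obuild_trixie_i386_1695029961",
        "diffoscope_stamp_ocaml-obuild_trixie_i386_1695030452",
    ]
    for name in stamps:
        print(as_utc(stamp_epoch(name)), name)
    print("gaps in seconds:", retry_gaps(stamps))
```

For these three stamps the converted times agree with the modification times shown by ls, which suggests each retry cycle took roughly eight to ten minutes.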
I'm filing this as a bug in the Debian BTS now to benefit from x-debbugs-cc :)
--
cheers,
Holger
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ holger@(debian|reproducible-builds|layer-acht).org
⢿⡄⠘⠷⠚⠋⠀ OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
⠈⠳⣄
Reporter: You're the first person ever to win two Olympic tennis gold medals.
That's an extraordinary feat, isn't it?
Andy Murray: I think Venus and Serena have won about four each.
Acknowledgement sent
to Holger Levsen <holger@layer-acht.org>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>.
(Wed, 20 Sep 2023 12:27:03 GMT) (full text, mbox, link).
Acknowledgement sent
to Holger Levsen <holger@layer-acht.org>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>.
(Wed, 20 Sep 2023 19:54:02 GMT) (full text, mbox, link).
On Wed, Sep 20, 2023 at 12:25:36PM +0000, Holger Levsen wrote:
> so I've powercycled the machine and also disabled the armhf workers now.
this didn't really help:
mosh last connect some 50m ago:
13 109 19:00:30 up 6:41, 3 users, load average: 151.64, 205.31, 156.09
10 112 19:01:00 up 6:41, 4 users, load average: 105.52, 189.05, 152.25
10 115 19:01:42 up 6:42, 4 users, load average: 135.72, 183.98, 152.19
# ls /var/log/reproducible-builds
diffoscope_stamp_gmsh_trixie_arm64_1695232788
diffoscope_stamp_telegram-desktop_unstable_amd64_1695232787
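As a rough illustration, load figures like those above can be pulled out of uptime-style lines and compared against the number of CPUs. This is only a sketch: the CPU count and the threshold factor are hypothetical parameters, not values from this machine.

```python
import re

# Matches the three load figures at the end of an uptime-style line.
LOAD_RE = re.compile(r"load average: ([\d.]+), ([\d.]+), ([\d.]+)")


def parse_load(line):
    """Return (1min, 5min, 15min) load averages, or None if absent."""
    match = LOAD_RE.search(line)
    if match is None:
        return None
    return tuple(float(value) for value in match.groups())


def overloaded(line, cpus, factor=4.0):
    """True if the 1-minute load exceeds cpus * factor (both assumed)."""
    load = parse_load(line)
    return load is not None and load[0] > cpus * factor


if __name__ == "__main__":
    # Line taken from this report; cpus=16 is a made-up value.
    line = "13 109 19:00:30 up 6:41, 3 users, load average: 151.64, 205.31, 156.09"
    print(parse_load(line), overloaded(line, cpus=16))
```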
so still no idea how to deal with this... :/
--
cheers,
Holger
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ holger@(debian|reproducible-builds|layer-acht).org
⢿⡄⠘⠷⠚⠋⠀ OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
⠈⠳⣄
I'll believe in climate change when Texas freezes over. (Ted Cruz)
Acknowledgement sent
to Holger Levsen <holger@layer-acht.org>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>.
(Wed, 27 Sep 2023 14:18:03 GMT) (full text, mbox, link).
hi,
On Wed, Sep 20, 2023 at 12:25:36PM +0000, Holger Levsen wrote:
> so I've powercycled the machine and also disabled the armhf workers now.
> (under the (weak) assumption that this bug is mostly triggered when running
> diffoscope on 32bit .debs...)
so on Sep 23rd I made diffoscope run under ionice -n6, and so far the machine
has not gone down on its knees yet.
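A minimal sketch of wrapping a command with ionice at priority level 6, for illustration only. The function name and the .deb file names are hypothetical, and this is not how the jenkins job is actually wired up.

```python
def with_ionice(cmd, level=6):
    """Prefix a command list with 'ionice -n LEVEL'.

    Levels run from 0 (highest I/O priority) to 7 (lowest).
    """
    if not 0 <= level <= 7:
        raise ValueError("ionice level must be between 0 and 7")
    return ["ionice", "-n", str(level), *cmd]


if __name__ == "__main__":
    # Hypothetical file names; only prints the command, does not run it.
    command = with_ionice(["diffoscope", "first.deb", "second.deb"])
    print(" ".join(command))
```

The resulting list could be passed to subprocess.run() to actually start the process.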
and on Sep 25th I said this on #debian-reproducible:
when i made diffoscope run under ionice i was wondering if there were changes in the linux
scheduler causing the probs we saw... now not seeing the machine go down to its knees
within 60h uptime i'm saying this to "provoke" the problem coming back
however, looking at
jenkins.debian.net/munin/static/dynazoom.html?cgiurl_graph=/munin-cgi/munin-cgi-graph&plugin_name=debian.net/jenkins.debian.net/uptime&size_x=800&size_y=400&start_epoch=1661108749&stop_epoch=1695668749
I can see that the highest uptime was 15 days..
(since upgrading to bookworm which is when the probs started)
it could also be kernel related, with switching to bookworm
we switched from 5.10.0 to 6.1.0...
so my next stabs c/would be:
a.) increase swap
b.) upgrade to kernel 6.4.0 from bpo
c.) something else
Today someone also suggested using zram to compress swap...
and, fwiw, https://tests.reproducible-builds.org/debian/index_performance.html
still doesn't look good for yesterday...
--
cheers,
Holger
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ holger@(debian|reproducible-builds|layer-acht).org
⢿⡄⠘⠷⠚⠋⠀ OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
⠈⠳⣄
I don't want to see your smile.
I want to see your intelligence, compassion, integrity, and consideration.
(@1goodtern)
Acknowledgement sent
to Vagrant Cascadian <vagrant@reproducible-builds.org>:
Extra info received and forwarded to list. Copy sent to Reproducible builds folks <reproducible-builds@lists.alioth.debian.org>.
(Wed, 27 Sep 2023 16:03:04 GMT) (full text, mbox, link).
On 2023-09-27, Holger Levsen wrote:
> On Wed, Sep 20, 2023 at 12:25:36PM +0000, Holger Levsen wrote:
>> so I've powercycled the machine and also disabled the armhf workers now.
>> (under the (weak) assumption that this bug is mostly triggered when running
>> diffoscope on 32bit .debs...)
Most frustrating!
> so on Sep 23rd I made diffoscope run under ionice -n6, and so far the machine
> has not gone down on its knees yet.
>
> and on Sep 25th I said this on #debian-reproducible:
>
> when i made diffoscope run under ionice i was wondering if there were changes in the linux
> scheduler causing the probs we saw... now not seeing the machine go down to its knees
> within 60h uptime i'm saying this to "provoke" the problem coming back
> however, looking at
> jenkins.debian.net/munin/static/dynazoom.html?cgiurl_graph=/munin-cgi/munin-cgi-graph&plugin_name=debian.net/jenkins.debian.net/uptime&size_x=800&size_y=400&start_epoch=1661108749&stop_epoch=1695668749
> I can see that the highest uptime was 15 days..
> (since upgrading to bookworm which is when the probs started)
> it could also be kernel related, with switching to bookworm
> we switched from 5.10.0 to 6.1.0...
> so my next stabs c/would be:
> a.) increase swap
> b.) upgrade to kernel 6.4.0 from bpo
> c.) something else
Maybe try using a bullseye 5.10.x kernel for a while? Obviously better
if the issue is fixed in a newer kernel version ... but if 5.10.x
consistently works with bookworm userspace that ever so slightly narrows
down the issue?
live well,
vagrant
hi,
i'm closing this bug as we've solved / mitigated most of the related issues
and because the bug has been mixing many different issues, eg also
#1086379.
--
cheers,
Holger
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ holger@(debian|reproducible-builds|layer-acht).org
⢿⡄⠘⠷⠚⠋⠀ OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
⠈⠳⣄
"I became an antifascist out of a sense of common decency." – Marlene Dietrich