Dumps (snapshot*) hosts should be migrated to buster around the same time we upgrade our other mediawiki clusters.
Event Timeline
I can do the testbed host first, and then the rest. Do we have a mediawiki server on buster anywhere in the cluster yet?
Yes, mwdebug1003 is running Buster, you can select it with the latest version of the WikimediaDebug browser extension.
I've built the package and set up a test instance in deployment-prep, but there are issues with mediawiki scripts there; see T273089 for the details.
Change 659886 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] snapshot1007 (testbed host) install with buster
Tests of xml/sql dumps in buster instance in deployment-prep look good. Next step: reimage the snapshot testbed instance in production.
Change 659886 merged by ArielGlenn:
[operations/puppet@production] snapshot1007 (testbed host) install with buster
Script wmf-auto-reimage was launched by ariel on cumin1001.eqiad.wmnet for hosts:
snapshot1007.eqiad.wmnet
The log can be found in /var/log/wmf-auto-reimage/202101291442_ariel_5091_snapshot1007_eqiad_wmnet.log.
Completed auto-reimage of hosts:
['snapshot1007.eqiad.wmnet']
and were ALL successful.
Change 659957 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] dumps: add a config for xml/sql dumps that writes elsewhere than prod dirs
Change 659957 merged by ArielGlenn:
[operations/puppet@production] dumps: add a config for xml/sql dumps that writes elsewhere than prod dirs
Test run of elwikiquote on reimaged testbed server running buster looks good, but I should do a prefetch run tomorrow morning just to be extra sure. Then I'll be able to switch the testbed with an xml/sql dump runner for regular wikis, in time for the Feb 1 run, and see how it goes.
Change 660634 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] make snapshot1007 running buster a dumpsrunner and move testbed to 1005
Change 660634 merged by ArielGlenn:
[operations/puppet@production] make snapshot1007 running buster a dumpsrunner and move testbed to 1005
The prefetch runs went well. I ran a small wiki on snapshot1007 (buster) and then on snapshot1005 (stretch) on the same hardware. The times were slightly faster on buster.
Assuming that all goes well with the production run on buster, which we should know in 6 or 7 days, I'll be able to convert snapshot1010 next.
Change 660779 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] add a proper media section to the deployment-prep dumps config file
Change 660779 merged by ArielGlenn:
[operations/puppet@production] add a proper media section to the deployment-prep dumps config file
Change 660781 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] Make media lists dump easily runnable in deployment-prep
Change 660781 merged by ArielGlenn:
[operations/puppet@production] Make media lists dump easily runnable in deployment-prep
I have tested in deployment-prep all of the "other" dumps (not xml/sql) except for the wikidata and adds-changes dumps. Those are next.
Change 660819 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] make adds-changes dumps easier to test in deployment-prep
Change 660819 merged by ArielGlenn:
[operations/puppet@production] make adds-changes dumps easier to test in deployment-prep
Change 660871 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] make wikidata rdf dumps easier to test in deployment-prep
Change 661170 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] refactor script for wikidata and commons rdf dumps
Change 661642 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] prep for re-install of snapshot1009, 1010 with buster
Change 661642 merged by ArielGlenn:
[operations/puppet@production] prep for re-install of snapshot1009, 1010 with buster
Script wmf-auto-reimage was launched by ariel on cumin1001.eqiad.wmnet for hosts:
snapshot1009.eqiad.wmnet
The log can be found in /var/log/wmf-auto-reimage/202102040754_ariel_21056_snapshot1009_eqiad_wmnet.log.
Completed auto-reimage of hosts:
['snapshot1009.eqiad.wmnet']
and were ALL successful.
snapshot1009 was idle so I converted it. snapshot1010 should become idle in an hour or two, so I'll be able to do that later today. I might not do anything about snapshot1005 and 1006 since they are due to be replaced and the replacements should be here any day now. The replacements can simply be installed with buster from the start and the old servers decommissioned.
Script wmf-auto-reimage was launched by ariel on cumin1001.eqiad.wmnet for hosts:
snapshot1010.eqiad.wmnet
The log can be found in /var/log/wmf-auto-reimage/202102041248_ariel_6167_snapshot1010_eqiad_wmnet.log.
Completed auto-reimage of hosts:
['snapshot1010.eqiad.wmnet']
and were ALL successful.
snapshot1010 is done. I need to do a bunch more testing before I can reimage snapshot1008.
Change 662756 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[mediawiki/core@master] in deployment-prep some groups don't exist, permit scripts that use them to run
Change 662756 merged by jenkins-bot:
[mediawiki/core@master] in deployment-prep some groups don't exist, permit scripts that use them to run
Change 661170 merged by ArielGlenn:
[operations/puppet@production] refactor script for wikidata and commons rdf dumps
While it would be nice to keep making the wikidata entity dumps easier to run in deployment-prep, that can wait a bit while I move on to testing the wikidata json dumps, which is the next thing needed for the move to buster.
Change 663661 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] refactor wikidata json dumps to be easier to test on deployment-prep
Change 663661 merged by ArielGlenn:
[operations/puppet@production] refactor wikidata json dumps to be easier to test on deployment-prep
Change 664091 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] now that snapshot1005 is testbed host, make snapshot1007 the enwiki dumps runner
Change 664091 merged by ArielGlenn:
[operations/puppet@production] now that snapshot1005 is testbed host, make snapshot1007 the enwiki dumps runner
Change 664092 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] prep snapshot1005 and 1006 for reinstall with buster
Change 664092 merged by ArielGlenn:
[operations/puppet@production] prep snapshot1005 and 1006 for reinstall with buster
Script wmf-auto-reimage was launched by ariel on cumin1001.eqiad.wmnet for hosts:
snapshot1005.eqiad.wmnet
The log can be found in /var/log/wmf-auto-reimage/202102150817_ariel_8905_snapshot1005_eqiad_wmnet.log.
Completed auto-reimage of hosts:
['snapshot1005.eqiad.wmnet']
and were ALL successful.
Script wmf-auto-reimage was launched by ariel on cumin1001.eqiad.wmnet for hosts:
snapshot1006.eqiad.wmnet
The log can be found in /var/log/wmf-auto-reimage/202102150912_ariel_1945_snapshot1006_eqiad_wmnet.log.
Change 664225 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] misc dumps: move commons rdf to later on Sunday and media info to earlier
Completed auto-reimage of hosts:
['snapshot1006.eqiad.wmnet']
and were ALL successful.
I was not going to re-image snapshot1005 and 1006 because their replacements were due to arrive, but the boxes have not arrived yet and we still do not have an ETA. So they are done now.
The last server remaining is snapshot1008. All "misc" dumps have been tested on beta in deployment-prep, so the re-imaging of this host can happen next Sunday. I am rearranging the Sunday cron jobs a little so that we have a longer maintenance window going forward; see https://gerrit.wikimedia.org/r/c/operations/puppet/+/664225
Change 664225 merged by ArielGlenn:
[operations/puppet@production] misc dumps: move commons rdf to later on Sunday and media info to earlier
Script wmf-auto-reimage was launched by ariel on cumin1001.eqiad.wmnet for hosts:
snapshot1008.eqiad.wmnet
The log can be found in /var/log/wmf-auto-reimage/202102210918_ariel_28889_snapshot1008_eqiad_wmnet.log.
Change 665583 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] snapshot1008 to install from buster image
Change 665583 merged by ArielGlenn:
[operations/puppet@production] snapshot1008 to install from buster image
Completed auto-reimage of hosts:
['snapshot1008.eqiad.wmnet']
and were ALL successful.
So the reimage completed but the host is still on stretch. I've updated the install file and here we go again.
Script wmf-auto-reimage was launched by ariel on cumin1001.eqiad.wmnet for hosts:
snapshot1008.eqiad.wmnet
The log can be found in /var/log/wmf-auto-reimage/202102210952_ariel_3141_snapshot1008_eqiad_wmnet.log.
Completed auto-reimage of hosts:
['snapshot1008.eqiad.wmnet']
and were ALL successful.