
Mentee overview module: Run updateMenteeData.php regularly
Closed, Resolved (Public)

Description

According to the specs at T278971: Mentor dashboard: M1 mentee overview module, the table in the mentee overview module is supposed to be updated daily.

This means we have to add updateMenteeData.php as a regular job run by the maintenance server. Code for the script itself was already merged to master in the parent task, and will ship to prod with wmf.14.

Checklist
  • Test updateMenteeData.php works as intended in the beta cluster
  • Wait for deployment to production (wmf.14)
  • Test updateMenteeData.php works as intended in production (at least the pilots), check the numbers are correct
  • Write a Puppet patch to run updateMenteeData.php daily in production and beta
  • Ask an SRE to deploy the Puppet patch
  • Observe the script for a while (check logs, check numbers stay right)

Event Timeline

Restricted Application added a subscriber: Aklapper.

Mentioned in SAL (#wikimedia-releng) [2021-06-29T21:45:33Z] <urbanecm> urbanecm@deployment-deploy01:~$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=cswiki # T285811

For the beta cluster part:

urbanecm@deployment-deploy01:~$ sql cswiki
MariaDB [cswiki]> select * from growthexperiments_mentee_data;
Empty set (0.00 sec)

MariaDB [cswiki]> Bye
urbanecm@deployment-deploy01:~$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=cswiki # T285811
urbanecm@deployment-deploy01:~$ sql cswiki
MariaDB [cswiki]> select mentee_data from growthexperiments_mentee_data\G
*************************** 1. row ***************************
mentee_data: {"username":"Martin Urbanec","reverted":0,"questions":1,"editcount":20,"registration":"20191102161354","last_active":"20210428160314","blocks":0}
*************************** 2. row ***************************
mentee_data: {"username":"Tgr","reverted":0,"questions":0,"editcount":5,"registration":"20191102161411","last_active":"20210503074638","blocks":1}
*************************** 3. row ***************************
mentee_data: {"username":"ET221","reverted":0,"questions":0,"editcount":3,"registration":"20191105021347","last_active":"20210226022321","blocks":0}
*************************** 4. row ***************************
mentee_data: {"username":"ET10","reverted":0,"questions":1,"editcount":68,"registration":"20191211231919","last_active":"20210319003956","blocks":0}
*************************** 5. row ***************************
mentee_data: {"username":"Rho2017","reverted":0,"questions":4,"editcount":26,"registration":"20191212120813","last_active":"20210408155334","blocks":0}
*************************** 6. row ***************************
mentee_data: {"username":"ET18","reverted":0,"questions":0,"editcount":17,"registration":"20200114223640","last_active":"20210315222330","blocks":0}
*************************** 7. row ***************************
mentee_data: {"username":"ET13","reverted":8,"questions":16,"editcount":69,"registration":"20200225202255","last_active":"20210526012903","blocks":0}
*************************** 8. row ***************************
mentee_data: {"username":"Bluedot","reverted":0,"questions":1,"editcount":5,"registration":"20200316144201","last_active":"20210407203150","blocks":0}
*************************** 9. row ***************************
mentee_data: {"username":"ET271","reverted":0,"questions":0,"editcount":3,"registration":"20200520002139","last_active":"20210319234051","blocks":0}
*************************** 10. row ***************************
mentee_data: {"username":"ET370","reverted":0,"questions":4,"editcount":10,"registration":"20200920201304","last_active":"20210126011724","blocks":0}
*************************** 11. row ***************************
mentee_data: {"username":"ET7","reverted":0,"questions":0,"editcount":3,"registration":"20201023224437","last_active":"20210319004934","blocks":0}
*************************** 12. row ***************************
mentee_data: {"username":"ET1002","reverted":0,"questions":0,"editcount":1,"registration":"20201103221119","last_active":"20210305200704","blocks":0}
*************************** 13. row ***************************
mentee_data: {"username":"ET1005","reverted":0,"questions":0,"editcount":3,"registration":"20201104212539","last_active":"20210123012413","blocks":0}
*************************** 14. row ***************************
mentee_data: {"username":"ET2015","reverted":0,"questions":0,"editcount":1,"registration":"20210109014257","last_active":"20210109020618","blocks":0}
*************************** 15. row ***************************
mentee_data: {"username":"ET73","reverted":0,"questions":0,"editcount":2,"registration":"20210202193519","last_active":"20210202193745","blocks":0}
*************************** 16. row ***************************
mentee_data: {"username":"Patrik L.","reverted":0,"questions":0,"editcount":1,"registration":"20210203080502","last_active":"20210428164823","blocks":1}
*************************** 17. row ***************************
mentee_data: {"username":"ET1001","reverted":0,"questions":0,"editcount":1,"registration":"20210218220322","last_active":"20210218222743","blocks":0}
*************************** 18. row ***************************
mentee_data: {"username":"ET217","reverted":0,"questions":0,"editcount":1,"registration":"20210301212132","last_active":"20210305195950","blocks":0}
*************************** 19. row ***************************
mentee_data: {"username":"ET195","reverted":0,"questions":0,"editcount":2,"registration":"20210317174259","last_active":"20210319002748","blocks":0}
*************************** 20. row ***************************
mentee_data: {"username":"ET1951","reverted":0,"questions":0,"editcount":4,"registration":"20210318151350","last_active":"20210318200905","blocks":0}
*************************** 21. row ***************************
mentee_data: {"username":"MMiller Beta cs 01","reverted":0,"questions":0,"editcount":1,"registration":"20210322210738","last_active":"20210322211031","blocks":0}
21 rows in set (0.00 sec)

MariaDB [cswiki]> Bye
urbanecm@deployment-deploy01:~$

It did generate some data, and the format looks as expected. Spot-checking the data: I do have 20 edits, my last one is from April 28, and Tgr and Patrik L. were indeed each blocked once. So far so good. I'll do a more thorough check when we're in prod.
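The spot check above can also be scripted. This is a minimal sketch, assuming the `mentee_data` column holds JSON exactly as printed in the query output and that `registration` and `last_active` are 14-digit `YYYYMMDDHHMMSS` timestamps. The helper name `parse_mentee_row` is hypothetical and not part of GrowthExperiments.

```python
import json
from datetime import datetime

# Assumption: timestamps are 14-digit strings in YYYYMMDDHHMMSS order.
MW_TS_FORMAT = "%Y%m%d%H%M%S"


def parse_mentee_row(raw):
    """Decode one mentee_data JSON blob and convert its timestamps to datetimes."""
    data = json.loads(raw)
    for key in ("registration", "last_active"):
        data[key] = datetime.strptime(data[key], MW_TS_FORMAT)
    return data


# First row from the beta cluster output above.
row = parse_mentee_row(
    '{"username":"Martin Urbanec","reverted":0,"questions":1,"editcount":20,'
    '"registration":"20191102161354","last_active":"20210428160314","blocks":0}'
)
print(row["username"], row["editcount"], row["last_active"].date())
```

Reading the dates as `datetime` objects makes claims such as "last edit on April 28" directly comparable instead of being eyeballed from the raw string.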

kostajh subscribed.

@Urbanecm_WMF I'm assuming this should be in current sprint; moving the task there.

Mentioned in SAL (#wikimedia-operations) [2021-07-14T09:27:17Z] <urbanecm> [urbanecm@mwmaint2002 /srv/mediawiki/php-1.37.0-wmf.14]$ time mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=testwiki # T285811

Now that wmf.14 is at testwikis, I ran the script manually for testwiki. Here is what I can see quickly:

[urbanecm@mwmaint2002 /srv/mediawiki/php-1.37.0-wmf.14]$ sql testwiki --cluster=extension1
[email protected](testwiki)> select count(*) from growthexperiments_mentee_data;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

[email protected](testwiki)> select count(*) from growthexperiments_mentor_mentee;
+----------+
| count(*) |
+----------+
|      512 |
+----------+
1 row in set (0.00 sec)

[email protected](testwiki)> Bye
[urbanecm@mwmaint2002 /srv/mediawiki/php-1.37.0-wmf.14]$ time mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=testwiki                                                                                         
real    0m1.666s
user    0m0.508s
sys     0m0.072s
[urbanecm@mwmaint2002 /srv/mediawiki/php-1.37.0-wmf.14]$ sql testwiki --cluster=extension1
[email protected](testwiki)> select count(*) from growthexperiments_mentee_data;
+----------+
| count(*) |
+----------+
|       80 |
+----------+
1 row in set (0.00 sec)

[email protected](testwiki)> select count(*) from growthexperiments_mentor_mentee;
+----------+
| count(*) |
+----------+
|      512 |
+----------+
1 row in set (0.00 sec)

[email protected](testwiki)> select * from growthexperiments_mentee_data limit 10;
+-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| mentee_id | mentee_data                                                                                                                                            |
+-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
|       254 | {"username":"Tgr","reverted":0,"questions":0,"editcount":28,"registration":"20060713135958","last_active":"20210711111411","blocks":0}                 |
|       752 | {"username":"Geraki","reverted":0,"questions":0,"editcount":16,"registration":"20070412102512","last_active":"20210511101928","blocks":0}              |
|      2279 | {"username":"Kanashimi","reverted":65,"questions":0,"editcount":266,"registration":"20080621052148","last_active":"20210705060500","blocks":0}         |
|      2565 | {"username":"Catrope","reverted":0,"questions":0,"editcount":91,"registration":"20080714110313","last_active":"20210225210739","blocks":0}             |
|      3312 | {"username":"Iniquity","reverted":1,"questions":2,"editcount":262,"registration":"20080905091652","last_active":"20210201112736","blocks":2}           |
|      6151 | {"username":"HenkvD","reverted":0,"questions":0,"editcount":7,"registration":"20090226192251","last_active":"20210610144634","blocks":0}               |
|     12061 | {"username":"Suffusion of Yellow","reverted":1,"questions":1,"editcount":77,"registration":"20091213124416","last_active":"20210619191506","blocks":0} |
|     16861 | {"username":"DonRumata","reverted":8,"questions":0,"editcount":218,"registration":"20110621181207","last_active":"20210313121935","blocks":0}          |
|     19016 | {"username":"Jdlrobson","reverted":0,"questions":0,"editcount":346,"registration":"20120309182458","last_active":"20210130182015","blocks":0}          |
|     23278 | {"username":"Dyolf77","reverted":0,"questions":0,"editcount":1,"registration":"20130822011838","last_active":"20210330184032","blocks":0}              |
+-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
10 rows in set (0.00 sec)

[email protected](testwiki)> Bye
[urbanecm@mwmaint2002 /srv/mediawiki/php-1.37.0-wmf.14]$

The format is as expected. Under two seconds for 512 mentees is an acceptable run time for a background maintenance script. I'll measure it on a pilot wiki too, once the train ships the code there.

I'll start writing a notebook to validate the numbers the script generated, to make sure the queries in the code contain no mistakes.

Change 704506 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/puppet@production] mediawiki/maintenance/growthexperiments.pp: Run updateMenteeData every day

https://gerrit.wikimedia.org/r/704506

Mentioned in SAL (#wikimedia-operations) [2021-07-15T20:11:00Z] <urbanecm> [urbanecm@mwmaint2002 /srv/mediawiki/php]$ time mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=cswiki # T285811

Mentioned in SAL (#wikimedia-operations) [2021-07-15T20:26:13Z] <urbanecm> [urbanecm@mwmaint2002 /srv/mediawiki/php]$ time mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=bnwiki # T285811

Ran and checked for cswiki and bnwiki, via the same notebook:

  • cswiki: ERROR: Found 3 mismatched IDs: {397014: ['editcount', 'last_active'], 463649: ['editcount', 'last_active'], 544633: ['editcount', 'last_active']}
  • bnwiki: All ok

I'll look into the three wrong users soon.

Mentioned in SAL (#wikimedia-operations) [2021-07-15T20:44:15Z] <urbanecm> [urbanecm@mwmaint2002 /srv/mediawiki/php]$ time mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=viwiki # T285811

The three users just happened to edit while the script was running, which is normal and can happen. Otherwise the cswiki data is correct.
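The notebook itself is not shown in this task, so the following is only a sketch of how such a comparison could produce the messages quoted above. It assumes both sides are available as dictionaries keyed by user ID. The names `find_mismatches` and `report` and the sample values are hypothetical.

```python
def find_mismatches(stored, recomputed):
    """Return {user_id: [field, ...]} for users whose stored fields differ."""
    mismatches = {}
    for user_id, stored_fields in stored.items():
        expected = recomputed.get(user_id, {})
        differing = sorted(
            key for key, value in stored_fields.items() if expected.get(key) != value
        )
        if differing:
            mismatches[user_id] = differing
    return mismatches


def report(mismatches):
    """Format the result in the style of the messages quoted above."""
    if not mismatches:
        return "All ok"
    return f"ERROR: Found {len(mismatches)} mismatched IDs: {mismatches}"


# Illustrative data only: user 1 edited between the script run and the check,
# so both the edit count and the last-active timestamp moved.
stored = {
    1: {"editcount": 5, "last_active": "20210715200000", "blocks": 0},
    2: {"editcount": 9, "last_active": "20210701120000", "blocks": 0},
}
recomputed = {
    1: {"editcount": 6, "last_active": "20210715201500", "blocks": 0},
    2: {"editcount": 9, "last_active": "20210701120000", "blocks": 0},
}
print(report(find_mismatches(stored, recomputed)))
```

A mismatch limited to `editcount` and `last_active` is the signature of a user who edited during the run, which is why it can be treated as benign here.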

I finished running it for all four pilot wikis (arwiki, bnwiki, cswiki and viwiki).

arwiki took 102 minutes, which is understandable, as it has 221193 mentees (of which 22444 meet the conditions to be included in the module).

This is the list of top 10 GE wikis in terms of number of mentees:

dbname  mentees
enwiki  259752
arwiki  221193
frwiki  173907
ruwiki  105211
ptwiki  98318
viwiki  70833
fawiki  65471
trwiki  50960
kowiki  43922
cswiki  29598

(generated via P16829)

I'm a bit unsure whether enabling the script at enwiki is going to work fine: it has gained over 250k mentees in 2.5 months and grows by about 100k mentees/month. Maybe we should add a config variable to turn the script off on enwiki for the time being?
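To put rough numbers on that concern, here is a back-of-the-envelope sketch. It assumes run time scales linearly with the total number of mentees, using the arwiki measurement (102 minutes for 221193 mentees) as the only data point. That is an untested assumption, since run time may depend more on how many mentees meet the module's conditions, so treat the output as an order-of-magnitude guess.

```python
# Figures taken from this task.
ARWIKI_MENTEES = 221193
ARWIKI_MINUTES = 102
ENWIKI_MENTEES = 259752
ENWIKI_GROWTH_PER_MONTH = 100_000  # "about 100k mentees/month"


def estimated_minutes(mentees):
    """Linear extrapolation from the arwiki run (assumption, not a measurement)."""
    return mentees * ARWIKI_MINUTES / ARWIKI_MENTEES


for months_ahead in (0, 6, 12):
    mentees = ENWIKI_MENTEES + months_ahead * ENWIKI_GROWTH_PER_MONTH
    print(f"+{months_ahead:2d} months: {mentees:>9,d} mentees, "
          f"~{estimated_minutes(mentees):.0f} min per run")
```

Under this assumption a run on enwiki would take roughly two hours today and several times that within a year, which is the case for being able to switch the script off per wiki.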

Change 705658 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@master] updateMenteeData: Make it possible to disable script per-wiki

https://gerrit.wikimedia.org/r/705658

Change 705658 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] updateMenteeData: Make it possible to disable script per-wiki

https://gerrit.wikimedia.org/r/705658

Change 705748 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@wmf/1.37.0-wmf.14] updateMenteeData: Make it possible to disable script per-wiki

https://gerrit.wikimedia.org/r/705748

Change 705749 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@wmf/1.37.0-wmf.15] updateMenteeData: Make it possible to disable script per-wiki

https://gerrit.wikimedia.org/r/705749

Change 705748 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.37.0-wmf.14] updateMenteeData: Make it possible to disable script per-wiki

https://gerrit.wikimedia.org/r/705748

Change 705749 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.37.0-wmf.15] updateMenteeData: Make it possible to disable script per-wiki

https://gerrit.wikimedia.org/r/705749

Change 705740 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/mediawiki-config@master] Set wgGEMentorDashboardBackendEnabled properly

https://gerrit.wikimedia.org/r/705740

Mentioned in SAL (#wikimedia-operations) [2021-07-20T20:49:40Z] <urbanecm@deploy1002> Synchronized php-1.37.0-wmf.14/extensions/GrowthExperiments/maintenance/updateMenteeData.php: dafd953eb5cd35bddbd2fd348b03066420a42362: updateMenteeData: Make it possible to disable script per-wiki (T285811) (duration: 00m 58s)

Change 705740 merged by jenkins-bot:

[operations/mediawiki-config@master] Set wgGEMentorDashboardBackendEnabled properly

https://gerrit.wikimedia.org/r/705740

Mentioned in SAL (#wikimedia-operations) [2021-07-20T20:53:16Z] <urbanecm@deploy1002> Synchronized wmf-config/InitialiseSettings.php: caa5a076f39b051b01622aa3e4c9d716a8643eef: Set wgGEMentorDashboardBackendEnabled properly (T285811) (duration: 00m 57s)

Change 705742 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/mediawiki-config@master] labs: Enable mentor dashboard backend everywhere

https://gerrit.wikimedia.org/r/705742

Change 705742 merged by jenkins-bot:

[operations/mediawiki-config@master] labs: Enable mentor dashboard backend everywhere

https://gerrit.wikimedia.org/r/705742

Change 704506 merged by RLazarus:

[operations/puppet@production] mediawiki/maintenance/growthexperiments.pp: Run updateMenteeData every day

https://gerrit.wikimedia.org/r/704506

Mentioned in SAL (#wikimedia-operations) [2021-07-21T16:49:59Z] <urbanecm> [urbanecm@mwmaint2002 ~]$ time /usr/local/bin/mw-cli-wrapper /usr/local/bin/foreachwikiindblist /srv/mediawiki/dblists/growthexperiments.dblist extensions/GrowthExperiments/maintenance/updateMenteeData.php # T285811

updateMenteeData.php was deployed as a regular job by the SREs; logs will be at mwmaint2002:/var/log/mediawiki/mediawiki_job_growthexperiments-updateMenteeData (scheduled to run at 4:15 UTC). I just ran it as a one-off to see how fast it is when run sequentially.

I'll look tomorrow to make sure everything works as intended.

I verified that the job got dispatched, that the logs look sane, and that the cswiki numbers are correct. I think this can now be called resolved.