Unpublish Add Link models for wikis where it did not work
Closed, Resolved (Public)

Description

In T370559 ("Review wikis that have Add Link models but do not have Add Link enabled"), we reviewed the discrepancy between the published Add Link models and the set of wikis where Add Link is deployed. In that task, @Trizek-WMF noted:

  • akwiki was part of round 4, but does not have enough content to suggest links (T304548#8008475)
  • nawiki was excluded because it did not produce enough recommendations (T308137#9127717)

We should unpublish the models for wikis where they did not work. Once a model is improved, it can be republished.

List of wikis to unpublish
  • akwiki
  • nawiki

Event Timeline

Restricted Application added a subscriber: Aklapper.

Mentioned in SAL (#wikimedia-operations) [2024-08-01T12:39:38Z] <urbanecm> Decommission Add Link models for akwiki, nawiki (T371598)

Urbanecm_WMF triaged this task as Low priority.
[urbanecm@stat1008 /home/mgerlach/REPOS/mwaddlink-gerrit]$ WIKI_ID=akwiki ./unpublish-datasets.sh 
[...]
akwiki has been delisted from the index.
[urbanecm@stat1008 /home/mgerlach/REPOS/mwaddlink-gerrit]$ WIKI_ID=nawiki ./unpublish-datasets.sh 
[...]
nawiki has been delisted from the index.
[urbanecm@stat1008 /home/mgerlach/REPOS/mwaddlink-gerrit]$ published-sync 
/usr/bin/flock -n /var/lock/published-sync -c /usr/bin/rsync -rptL -v --delete /srv/published/ analytics-web.discovery.wmnet::published-destination/stat1008//
sending incremental file list
deleting datasets/one-off/research-mwaddlink/nawiki/nawiki.w2vfiltered.sqlite.gz
deleting datasets/one-off/research-mwaddlink/nawiki/nawiki.w2vfiltered.sqlite.checksum
deleting datasets/one-off/research-mwaddlink/nawiki/nawiki.redirects.sqlite.gz
deleting datasets/one-off/research-mwaddlink/nawiki/nawiki.redirects.sqlite.checksum
deleting datasets/one-off/research-mwaddlink/nawiki/nawiki.pageids.sqlite.gz
deleting datasets/one-off/research-mwaddlink/nawiki/nawiki.pageids.sqlite.checksum
deleting datasets/one-off/research-mwaddlink/nawiki/nawiki.linkmodel.json.checksum
deleting datasets/one-off/research-mwaddlink/nawiki/nawiki.linkmodel.json
deleting datasets/one-off/research-mwaddlink/nawiki/nawiki.anchors.sqlite.gz
deleting datasets/one-off/research-mwaddlink/nawiki/nawiki.anchors.sqlite.checksum
deleting datasets/one-off/research-mwaddlink/nawiki/lr_nawiki_w2vfiltered.sql.gz.checksum
deleting datasets/one-off/research-mwaddlink/nawiki/lr_nawiki_w2vfiltered.sql.gz
deleting datasets/one-off/research-mwaddlink/nawiki/lr_nawiki_redirects.sql.gz.checksum
deleting datasets/one-off/research-mwaddlink/nawiki/lr_nawiki_redirects.sql.gz
deleting datasets/one-off/research-mwaddlink/nawiki/lr_nawiki_pageids.sql.gz.checksum
deleting datasets/one-off/research-mwaddlink/nawiki/lr_nawiki_pageids.sql.gz
deleting datasets/one-off/research-mwaddlink/nawiki/lr_nawiki_anchors.sql.gz.checksum
deleting datasets/one-off/research-mwaddlink/nawiki/lr_nawiki_anchors.sql.gz
deleting datasets/one-off/research-mwaddlink/nawiki/README
deleting datasets/one-off/research-mwaddlink/nawiki/
deleting datasets/one-off/research-mwaddlink/akwiki/lr_akwiki_w2vfiltered.sql.gz.checksum
deleting datasets/one-off/research-mwaddlink/akwiki/lr_akwiki_w2vfiltered.sql.gz
deleting datasets/one-off/research-mwaddlink/akwiki/lr_akwiki_redirects.sql.gz.checksum
deleting datasets/one-off/research-mwaddlink/akwiki/lr_akwiki_redirects.sql.gz
deleting datasets/one-off/research-mwaddlink/akwiki/lr_akwiki_pageids.sql.gz.checksum
deleting datasets/one-off/research-mwaddlink/akwiki/lr_akwiki_pageids.sql.gz
deleting datasets/one-off/research-mwaddlink/akwiki/lr_akwiki_anchors.sql.gz.checksum
deleting datasets/one-off/research-mwaddlink/akwiki/lr_akwiki_anchors.sql.gz
deleting datasets/one-off/research-mwaddlink/akwiki/akwiki.w2vfiltered.sqlite.gz
deleting datasets/one-off/research-mwaddlink/akwiki/akwiki.w2vfiltered.sqlite.checksum
deleting datasets/one-off/research-mwaddlink/akwiki/akwiki.redirects.sqlite.gz
deleting datasets/one-off/research-mwaddlink/akwiki/akwiki.redirects.sqlite.checksum
deleting datasets/one-off/research-mwaddlink/akwiki/akwiki.pageids.sqlite.gz
deleting datasets/one-off/research-mwaddlink/akwiki/akwiki.pageids.sqlite.checksum
deleting datasets/one-off/research-mwaddlink/akwiki/akwiki.linkmodel.json.checksum
deleting datasets/one-off/research-mwaddlink/akwiki/akwiki.linkmodel.json
deleting datasets/one-off/research-mwaddlink/akwiki/akwiki.anchors.sqlite.gz
deleting datasets/one-off/research-mwaddlink/akwiki/akwiki.anchors.sqlite.checksum
deleting datasets/one-off/research-mwaddlink/akwiki/README
deleting datasets/one-off/research-mwaddlink/akwiki/
datasets/one-off/research-mwaddlink/
datasets/one-off/research-mwaddlink/wikis.txt

sent 425,887 bytes  received 3,752 bytes  859,278.00 bytes/sec
total size is 174,322,503,623  speedup is 405,741.81
[urbanecm@stat1008 /home/mgerlach/REPOS/mwaddlink-gerrit]$
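
The rsync log above deletes the same set of 19 files for each wiki. As a minimal sketch for cross-checking such a log, the following Python derives the expected file names from the pattern visible in the output and extracts what rsync reported as deleted. The function names are hypothetical and are not part of the mwaddlink repository; the file list is inferred from this log only.

```python
from __future__ import annotations

# Dataset kinds seen in the rsync output above (inferred from the log).
DATASET_KINDS = ("anchors", "pageids", "redirects", "w2vfiltered")

# Directory prefix as it appears in the "deleting ..." lines of the log.
BASE = "datasets/one-off/research-mwaddlink"


def expected_dataset_files(wiki_id: str) -> list[str]:
    """List the file names published for one wiki, per the log above."""
    files = [
        "README",
        f"{wiki_id}.linkmodel.json",
        f"{wiki_id}.linkmodel.json.checksum",
    ]
    for kind in DATASET_KINDS:
        files += [
            f"{wiki_id}.{kind}.sqlite.gz",
            f"{wiki_id}.{kind}.sqlite.checksum",
            f"lr_{wiki_id}_{kind}.sql.gz",
            f"lr_{wiki_id}_{kind}.sql.gz.checksum",
        ]
    return files


def deleted_files(rsync_output: str, wiki_id: str) -> set[str]:
    """Collect the file names rsync reported as deleted for one wiki."""
    prefix = f"deleting {BASE}/{wiki_id}/"
    found = set()
    for raw in rsync_output.splitlines():
        line = raw.strip()
        # Skip the line that deletes the directory itself (nothing after the prefix).
        if line.startswith(prefix) and len(line) > len(prefix):
            found.add(line[len(prefix):])
    return found
```

Comparing `set(expected_dataset_files("akwiki"))` with `deleted_files(log_text, "akwiki")` would flag any file that was left behind, where `log_text` is the captured rsync output.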

https://analytics.wikimedia.org/published/datasets/one-off/research-mwaddlink/ still lists those two wikis, which is due to caching. Moving to QA to verify that the models disappear and that the link-recommendation service still works.
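
A rough sketch of how the index check could be scripted once the cache has expired. It assumes that wikis.txt contains one wiki ID per line, which is not confirmed in this task, and it takes the downloaded text as input instead of fetching the URL. Both function names are hypothetical.

```python
from __future__ import annotations


def listed_wikis(index_text: str) -> set[str]:
    """Parse an index such as wikis.txt, ASSUMING one wiki ID per line."""
    return {line.strip() for line in index_text.splitlines() if line.strip()}


def still_listed(index_text: str, unpublished: list[str]) -> list[str]:
    """Return the unpublished wiki IDs that the index still lists."""
    present = listed_wikis(index_text)
    return [wiki for wiki in unpublished if wiki in present]
```

An empty result from `still_listed(text, ["akwiki", "nawiki"])` would mean that both wikis are gone from the index; a non-empty result may simply mean the cached copy is still being served.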

OK, both wikis have disappeared from https://analytics.wikimedia.org/published/datasets/one-off/research-mwaddlink/wikis.txt and https://analytics.wikimedia.org/published/datasets/one-off/research-mwaddlink/. Removed from the API service as well:

[email protected](mwaddlink)> drop table lr_akwiki_anchors;
Query OK, 0 rows affected (0.007 sec)

[email protected](mwaddlink)> drop table lr_akwiki_pageids;
Query OK, 0 rows affected (0.003 sec)

[email protected](mwaddlink)> drop table lr_akwiki_redirects;
Query OK, 0 rows affected (0.004 sec)

[email protected](mwaddlink)> drop table lr_akwiki_w2vfiltered;
Query OK, 0 rows affected (0.003 sec)

[email protected](mwaddlink)> begin;
Query OK, 0 rows affected (0.001 sec)

[email protected](mwaddlink)> delete from lr_model where lookup='akwiki';
Query OK, 1 row affected (0.001 sec)

[email protected](mwaddlink)> commit;
Query OK, 0 rows affected (0.002 sec)

[email protected](mwaddlink)> drop table lr_nawiki_anchors;
Query OK, 0 rows affected (0.004 sec)

[email protected](mwaddlink)> drop table lr_nawiki_pageids;
Query OK, 0 rows affected (0.004 sec)

[email protected](mwaddlink)> drop table lr_nawiki_redirects;
Query OK, 0 rows affected (0.004 sec)

[email protected](mwaddlink)> drop table lr_nawiki_w2vfiltered;
Query OK, 0 rows affected (0.003 sec)

[email protected](mwaddlink)> begin;
Query OK, 0 rows affected (0.001 sec)

[email protected](mwaddlink)> delete from lr_model where lookup='nawiki';
Query OK, 1 row affected (0.002 sec)

[email protected](mwaddlink)> commit;
Query OK, 0 rows affected (0.001 sec)

[email protected](mwaddlink)>

(Note: this needed a rolling restart of linkrecommendation to take effect, due to filesystem-level caches.)
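
The SQL session above runs the same seven statements for each wiki: four `drop table` statements, then a transaction that deletes the wiki's row from `lr_model`. As a sketch, the statements could be generated per wiki like this. The table and column names are taken from the session above; the function name and the wiki ID validation pattern are assumptions added for illustration, and the output is meant to be reviewed before being run by hand.

```python
from __future__ import annotations

import re

# Per-wiki table suffixes, in the order used in the session above.
TABLE_KINDS = ("anchors", "pageids", "redirects", "w2vfiltered")

# Conservative pattern (an assumption): lowercase letters, digits, underscores.
WIKI_ID_RE = re.compile(r"[a-z0-9_]+")


def removal_statements(wiki_id: str) -> list[str]:
    """Build the SQL statements that remove one wiki from the mwaddlink database."""
    if not WIKI_ID_RE.fullmatch(wiki_id):
        # Refuse anything that could alter the generated SQL.
        raise ValueError(f"unexpected wiki ID: {wiki_id!r}")
    statements = [f"drop table lr_{wiki_id}_{kind};" for kind in TABLE_KINDS]
    statements += [
        "begin;",
        f"delete from lr_model where lookup='{wiki_id}';",
        "commit;",
    ]
    return statements


if __name__ == "__main__":
    for wiki in ("akwiki", "nawiki"):
        print("\n".join(removal_statements(wiki)))
```

Note that in MariaDB and MySQL a `drop table` statement commits implicitly, so only the `delete` is covered by the transaction, as in the session above.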