This tag is used to group security bugs by their general classification, in this case Cache Pollution.
Parent project: Security-Team
This tag is used to group security bugs by their general classification, in this case Cache Pollution.
Parent project: Security-Team
Thanks for this @sbassett and apologies for not cleaning the house ourselves!
Note: I committed the deletion of the two wmf.28 Wikibase patches under /srv/patches on the deployment server (5578144525) since wmf.28 was rolled back and as noted by gerritbot above, https://gerrit.wikimedia.org/r/658323 and https://gerrit.wikimedia.org/r/658324 were merged.
Change 658324 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] SECURITY: Add job to purge entity data on page deletion
Change 658323 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] SECURITY: Add EntityDataPurger
The two security patches are now on Gerrit: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/ /658323 and https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/ /658324. We can merge them so that they’ll go out with the regular train, and won’t need to be applied as security patches anymore.
The issue has been successfully patched on Wikidata. To our best knowledge the problem does not pose a security risk to Wikibase installation outside of WMF production wiki. Therefore we make the issue, and the fix, public.
I think this should not linger in production like that as more issues are bound to happen. I make this a blocker of the next train to make sure it doesn't get forgotten by the next branch cut.
I’ve looked at the applied commit on deploy1001 (/srv/mediawiki-staging/php-1.36.0-wmf.27/extensions/Wikibase, commit e4054597c9) and it looks fine to me. I’m guessing that some other change on master touched an adjacent hooks line in the extension JSON file; in that case, a 3-way merge is probably semantically correct.
Thanks. This seems to be a problem. I check to see what's the blocker of making this public, it should not stay like this for that long time.
Noting that 01-T260349.patch no longer applies without 3-way merge:
Once this task is made public, we should leave a comment at T128667: Special:EntityData with flavor is cached but not purged properly. It’s not clear to me whether that task is fully resolved with the changes here – it asks for purging “when action=purge is issued”, and we now purge Special:EntityData on revdel/pagedel and on /wiki/Special:EntityData/Q123?action=purge, but not on /wiki/Q123?action=purge.
In T260349#6648544, @Lucas_Werkmeister_WMDE wrote:T260349-2-alt.patch17 KBDownload
tested locally that HtmlCacheUpdater::purgeUrls() is called by adding the following line
That means we must ensure that our code runs after the main transaction has committed, and then also still wait for replication. I think there are several ways to do the first part – a post-send (i.e. default) deferred update, onTransactionCommitOrIdle(), probably some more I’m not aware of – but a job should take care of that part as well, so it’s probably best to go for the job solution. (Even if jobs are run as part of the web request, they run after the main transaction commit, as far as I’m aware.)
@ItamarWMDE noticed that the test suite failed with that change, so I uploaded a MediaWiki core change to fix that: https://gerrit.wikimedia.org/r/c/mediawiki/core/ /643496. The following version of the patch file just adds a comment to the commit message and streamlines the test code a bit; no non-test code changes.
Alright, the following patch implements the job solution:
Nice. We discussed the "job solution" a bit and in my understanding it would both cater to the performance concerns doing all this work during the web request, as well as making the topic area of waiting for replication largely irrelevant. Win win (at quite a bit of effort and added complexity).
Okay, I think I made another false assumption – that endAtomic() commits the transaction, so that it’s now ready for replication. That’s not the case: in general, an endAtomic() may commit the whole transaction (if it’s the outermost atomic level and the whole transaction was started by that atomic section), but in a normal web request, IIUC the most it can do is release a savepoint – the commit only happens later, when the whole request is done. (And since the startAtomic() call here didn’t request the section to be cancelable, and the default is ATOMIC_NOT_CANCELABLE, there shouldn’t even be a savepoint to release.) So this means that with the patches from T260349#6641445, the waitForReplication() didn’t help because the transaction to create the archive rows hadn’t committed yet, and therefore none of the replicas were allowed to see those rows.
Weird, the waitForReplication() didn’t help either. (I also tried it without the 'domain' part by commenting out that lineon mwdebug1001 directly.) I think it’s time for me to step back for a bit…
New patches with and without debug logging:
I suspect that our assumption – that, by the time the ArticleDeleteComplete hook runs, the archived rows have been committed and replicated – is incorrect.
Yup, with DB_MASTER it works. (See the logs if you want to – most important is “purge 9 URLs”.) I’ll undeploy the change again so we can figure out how to properly solve this in normal campsite work (because reading an unbounded number of rows from the master is probably not ideal).
Excerpt from the log messages:
Alright, here’s a version of the patch with some debug logging added:
I deployed this to wmf.18, but it didn’t seem to work, so I removed it again. I’ll try to test this locally later, and if I can’t figure out what’s going wrong, see how I can add some debug logging.
Ahead of the deployment, I’ve verified that if I undelete the specific revision with the rebel base’s location, then get the entity data (to make sure it’s cached), and then delete the whole item, the cached data stays available. I’ll use the same item (Q212688) to test the deployment later.
Looks good to me; I think we can deploy it on Monday (scheduled for the EU window). The change seems to apply without conflict to wmf.16 too, so we can probably backport to both branches, since wmf.18 currently isn’t fully rolled out.
Additional patch to purge urls of deleted items. based on previous security patch:
A note from task inspection: We might be able to reuse some code to dig up revisions out of the archive table from the changes related to T242164: Retract revdel'ed Wikidata edits from Wikibase client watchlists.
Hm, I think we didn’t implement any entity/page deletion handling yet? It’s in the task title and AC, but maybe we forgot about it while working on the task (or at least I forgot, I believe).