Page MenuHomePhabricator

Implement the reverted edit tag
Closed, ResolvedPublic

Description

This task is for tracking the progress of implementing the reverted tag, as part of my GSoC 2020 project: T248775: Proposal: Add Reverted filter to RecentChanges Filters

Requirements

These are the requirements the solution has to satisfy. This is based on the discussion in T252366: Choose the definition of a "reverted" edit.

What is a reverted edit?

A reverted edit is an edit that was later removed from the article by another edit. We will consider the following cases:

  1. An edit is reverted using the rollback link. This allows reverting one or more top revisions made by a single author quickly.
  2. An edit is reverted using the undo link. This allows reverting a single edit in article's history. The reverting editor has the option to apply additional changes on top of the revert, this still counts as reverting.
    1. An optional enhancement that can be implemented later is handling undoing many revisions using the undoafter parameter. See: T153570
  3. A manual revert is made, i.e. the editor reverts the page to an exact previous state.
    1. This will be done by exact revision SHA-1 matching. As the revision table lacks an index over that field, we will have to restrict the search to a certain radius. It will be probably 15 edits, as suggested by this paper, but that can be also a configuration variable.
    2. See also T154637 for a request for a similar feature.

Other

  • The solution must allow for easy filtering of reverted edits on Special:RecentChanges. This requirement will be satisfied automatically, if we use an edit tag to mark reverted edits.
  • The reverted tag is permanent – under no circumstances will it be removed from an edit by MediaWiki.
  • We should somehow store the information on which method of detecting a revert was used to mark an edit as reverted. This is a request from the Product Analytics team.
  • The solution should cover at least all cases that the Echo extension currently covers for detecting reverted edits. We will want to get rid of redundant revert detection code in Echo.

Technical design

Note that this is still being drafted and may change a lot. I will probably update this section a few times. It may be temporarily complete nonsense.

I think the most sensible way of starting this is by refactoring the code around the PageContentSaveComplete hook, mostly in the PageUpdater class. A new class would be introduced to store information related to an edit, to simplify data handling and related hooks. This was already attempted in this patch. Note that this patch is pre-MCR, so a lot of stuff will have to be changed.
This would also involve changing parameter lists for some hooks and probably deprecating the ArticleRollbackComplete hook (see T154263).

  • The new class (let's call it EditResult for now) would store references to things like the revision we reverted to and revisions that were reverted by it.
  • The class may seem quite similar in role to RevisionRecord, but from what I understand, RevisionRecord should be completely independent of its page and other revisions. Also it would not be possible to trivially reconstruct the new fields from a DB row of the revision table.
  • It may also be tempting to integrate that into the PageUpdater class, as it has a kind of similar role. That would also be bad, as that class is used mostly as a controller for saving edits, not as a container. In the PageContentSaveComplete hook we would have to pass $this to extensions, which is way beyond the scope of the hook.

Then the actual revert detection code would be implemented. Relevant information would be stored in an object of the new class and reverted edits would be tagged appropriately. Additional information about the revert can be stored in the tag using the ct_params field of the change_tags table.

Note about scheduling: according to my original internship schedule in T248775, I would first do the revert detection code and make the programmatic interface later. I think doing it the opposite way would save me some time on refactoring the code later, so I'll want to start with implementing the new class.

Related Objects

StatusSubtypeAssignedTask
ResolvedSBisson
ResolvedMMiller_WMF
StalledNone
DeclinedNone
ResolvedOstrzyciel
ResolvedOstrzyciel
ResolvedOstrzyciel
ResolvedOstrzyciel
ResolvedOstrzyciel
ResolvedOstrzyciel
ResolvedOstrzyciel
ResolvedOstrzyciel
ResolvedOstrzyciel
ResolvedOstrzyciel
ResolvedOstrzyciel
ResolvedOstrzyciel
ResolvedOstrzyciel
ResolvedOstrzyciel
ResolvedOstrzyciel

Event Timeline

Change 602432 had a related patch set uploaded (by Ostrzyciel; owner: Ostrzyciel):
[mediawiki/core@master] PageUpdater: create EditResult class

https://gerrit.wikimedia.org/r/602432

Change 606976 had a related patch set uploaded (by Ostrzyciel; owner: Ostrzyciel):
[mediawiki/core@master] [WIP] Modify PageSaveComplete hook to use EditResult

https://gerrit.wikimedia.org/r/606976

Change 606980 had a related patch set uploaded (by Ostrzyciel; owner: Ostrzyciel):
[mediawiki/extensions/AbuseFilter@master] Update PageSaveComplete hook to use EditResult

https://gerrit.wikimedia.org/r/606980

Change 606981 had a related patch set uploaded (by Ostrzyciel; owner: Ostrzyciel):
[mediawiki/extensions/PageTriage@master] Update PageSaveComplete hook to use EditResult

https://gerrit.wikimedia.org/r/606981

Change 602432 merged by jenkins-bot:
[mediawiki/core@master] PageUpdater: create EditResult class

https://gerrit.wikimedia.org/r/602432

Change 606980 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Update PageSaveComplete hook to use EditResult

https://gerrit.wikimedia.org/r/606980

Change 606981 merged by jenkins-bot:
[mediawiki/extensions/PageTriage@master] Update PageSaveComplete hook to use EditResult

https://gerrit.wikimedia.org/r/606981

Change 606976 merged by jenkins-bot:
[mediawiki/core@master] Modify PageSaveComplete hook to use EditResult

https://gerrit.wikimedia.org/r/606976

Change 609230 had a related patch set uploaded (by Ostrzyciel; owner: Ostrzyciel):
[mediawiki/core@master] EditResultBuilder: use RevisionRecord::hasSameContent

https://gerrit.wikimedia.org/r/c/mediawiki/core/ /609230

Change 609409 had a related patch set uploaded (by Ostrzyciel; owner: Ostrzyciel):
[mediawiki/core@master] WIP: EditResult: add get(Newest|Oldest)RevertedRevisionTimestamp()

https://gerrit.wikimedia.org/r/609409

Change 609230 merged by jenkins-bot:
[mediawiki/core@master] EditResultBuilder: use RevisionRecord::hasSameContent

https://gerrit.wikimedia.org/r/609230

Change 609773 had a related patch set uploaded (by Ostrzyciel; owner: Ostrzyciel):
[mediawiki/core@master] WIP: Add mw-reverted change tag

https://gerrit.wikimedia.org/r/609773

Change 609409 abandoned by Ostrzyciel:
[mediawiki/core@master] WIP: EditResult: add get(Newest|Oldest)RevertedRevisionTimestamp()

Reason:
I think it would be better if retrieving the timestamps was extension's duty, as was explained in the commit message.

As an alternative I propose first retrieving the outlier revisions from DB and then querying the range of revisions using RevisionStore::getRevisionLimitConditions (that I propose to make public instead of private, see: https://gerrit.wikimedia.org/r/c/mediawiki/core/ /609773/1/includes/Revision/RevisionStore.php ).

An example of how this would work is shown here: https://gerrit.wikimedia.org/r/c/mediawiki/core/ /609773/1/includes/Storage/RevertedTagUpdate.php#102

https://gerrit.wikimedia.org/r/609409

Change 610828 had a related patch set uploaded (by Ostrzyciel; owner: Ostrzyciel):
[mediawiki/core@master] EditResult: improve documentation

https://gerrit.wikimedia.org/r/610828

Change 610828 merged by jenkins-bot:
[mediawiki/core@master] EditResult: improve documentation

https://gerrit.wikimedia.org/r/610828

Change 623364 had a related patch set uploaded (by Ostrzyciel; owner: Ostrzyciel):
[operations/mediawiki-config@master] Disable the reverted tag on all wikis except testwiki

https://gerrit.wikimedia.org/r/623364

Change 623364 merged by jenkins-bot:
[operations/mediawiki-config@master] Disable the reverted tag on all wikis except testwiki

https://gerrit.wikimedia.org/r/623364

Mentioned in SAL (#wikimedia-operations) [2020-08-31T23:09:35Z] <catrope@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Disable (future) mw-reverted tag for all wikis except testwiki (T254074) (duration: 00m 57s)

Change 609773 merged by jenkins-bot:
[mediawiki/core@master] Add mw-reverted change tag

https://gerrit.wikimedia.org/r/609773

I've added comments (and the testing results for testwiki wmf.8 to T164307)

The scope of this ticket and the list of related task need to be reviewed additionally to decide if all was done and implemented.