Page MenuHomePhabricator

Automoderator Mediawiki configuration: Edit summary
Closed, ResolvedPublic

Description

User story: As an administrator, I want to configure the edit summary that Automoderator uses when reverting an edit, so that it is localised to my language and is clear to affected users.

This ticket covers the technical implementation of this configuration and is not concerned with the user-facing interface, which may have different labels and options corresponding to those outlined here.


Name: Edit summary
Options: Free text
Default: Blank


Communities should be able to localise and customise the edit summary Automoderator uses when reverting an edit so that it can reflect or link to local policies, and be in the project's language.

Acceptance criteria

  • When Automoderator reverts an edit, the edit summary it uses should be the text stored in this field.

Event Timeline

Given where this kind of information is going (an edit summary), this isn't just a basic MediaWiki message?

Given where this kind of information is going (an edit summary), this isn't just a basic MediaWiki message?

As far as I understand it could be a MediaWiki message, or it could equally be stored in this field. The benefit of having it alongside the other configuration options is that everything is in one place (including the edit history to track config changes). Do you see benefits to having this be a MediaWiki message instead?

Given where this kind of information is going (an edit summary), this isn't just a basic MediaWiki message?

Do you see benefits to having this be a MediaWiki message instead?

How are you expecting to interface with translate wiki?

Given where this kind of information is going (an edit summary), this isn't just a basic MediaWiki message?

Do you see benefits to having this be a MediaWiki message instead?

How are you expecting to interface with translate wiki?

This is something we need to discuss further (e.g. T352678: How can we enable Automoderator's username to be localisable?) but my preference would be for as much as possible to be localisable in the configuration form rather than TranslateWiki so that a) everything is in one place and b) administrators have as much control and oversight as possible around the elements which are showing up to regular users on the wiki (i.e. username, edit summary)

Probably worth a separate task blocking this one if no others (and it looks like there's another -- though I don't think you can do much about localizable user names....) to discuss that with language - translatewiki teams, with a general overview of what these configs look like and what might need to be translated. I wouldn't expect keys to be translated but values may need to be (and the value for this key or several keys definitely needs to be). Whether Translatewiki can accommodate on a particular timeline whatever is decided seems like a sufficiently large driver of this task.

Samwalton9-WMF renamed this task from Automoderator configuration: Edit summary to Automoderator Mediawiki configuration: Edit summary.Mar 26 2024, 3:55 PM

For our pilot, this can be a string.

In the future, we will want to be able to add variables like the reverted user's username into the edit summary. How we could do that is an open question. Some/many MediaWiki interface strings exist with parameters which can be edited by admins. I'm going to file a spike!

jsn.sherman changed the task status from Open to In Progress.Apr 1 2024, 3:00 PM
jsn.sherman claimed this task.
jsn.sherman moved this task from Ready to In Progress on the Moderator-Tools-Team (Kanban) board.
jsn.sherman subscribed.

For our pilot, this can be a string.

In the future, we will want to be able to add variables like the reverted user's username into the edit summary. How we could do that is an open question. Some/many MediaWiki interface strings exist with parameters which can be edited by admins. I'm going to file a spike!

I was having a look before getting started and because we have this complex matrix of config and i18n, it's worth trying to understand where things should live now.
The most obvious way that I found to have custom per wiki strings that accept arguments is to use the on-wiki message overrides
https://www.mediawiki.org/wiki/Help:System_message#Message_sources
That would mean adding a message string for the automoderator summary to i18n (we could let it go to translatewiki for defaults, or mark it as an ignore). This message would take the original editor as an argument, and our revert code would supply it.
For customizations, community could then just override with their desired text, like these messages:
https://en.wikipedia.org/wiki/Special:AllMessages?prefix=&filter=modified&lang=en&limit=50

That would be a little different than what we envisioned because customization would not be handled in the messages-specific json page instead of the automoderator config json page, but it would be aligned with the way this problem is tackled elsewhere.

FYI: As I was looking through the core code for populating the default undo message, I can see case handling for edits by anonymous users (because of the $wgDisableAnonTalk setting), foreign users from imported edits, and hidden users. I'm planning on rolling the handling of these cases into this patch since there are privacy implications.

For our pilot, this can be a string.

In the future, we will want to be able to add variables like the reverted user's username into the edit summary. How we could do that is an open question. Some/many MediaWiki interface strings exist with parameters which can be edited by admins. I'm going to file a spike!

I was having a look before getting started and because we have this complex matrix of config and i18n, it's worth trying to understand where things should live now.
The most obvious way that I found to have custom per wiki strings that accept arguments is to use the on-wiki message overrides
https://www.mediawiki.org/wiki/Help:System_message#Message_sources
That would mean adding a message string for the automoderator summary to i18n (we could let it go to translatewiki for defaults, or mark it as an ignore). This message would take the original editor as an argument, and our revert code would supply it.
For customizations, community could then just override with their desired text, like these messages:
https://en.wikipedia.org/wiki/Special:AllMessages?prefix=&filter=modified&lang=en&limit=50

That would be a little different than what we envisioned because customization would not be handled in the messages-specific json page instead of the automoderator config json page, but it would be aligned with the way this problem is tackled elsewhere.

Thanks for investigating this! I really want to include as much config as possible in one place so I'm going to prod a little further here. While it's true that interface messages are handled this way elsewhere currently, is there any way we could imagine having messages with arguments in the on-wiki config we're designing here, rather than having it off in an interface message somewhere?

[...]
Thanks for investigating this! I really want to include as much config as possible in one place so I'm going to prod a little further here. While it's true that interface messages are handled this way elsewhere currently, is there any way we could imagine having messages with arguments in the on-wiki config we're designing here, rather than having it off in an interface message somewhere?

[...]
Some/many MediaWiki interface strings exist with parameters which can be edited by admins.
[...]

Can you provide some examples of this? It sounds like there is some prior art to examine.

jsn.sherman changed the task status from In Progress to Open.Apr 3 2024, 6:45 PM
jsn.sherman removed jsn.sherman as the assignee of this task.
jsn.sherman moved this task from In Progress to Ready on the Moderator-Tools-Team (Kanban) board.

setting aside to deal with breakfix

[...]
Thanks for investigating this! I really want to include as much config as possible in one place so I'm going to prod a little further here. While it's true that interface messages are handled this way elsewhere currently, is there any way we could imagine having messages with arguments in the on-wiki config we're designing here, rather than having it off in an interface message somewhere?

[...]
Some/many MediaWiki interface strings exist with parameters which can be edited by admins.
[...]

Can you provide some examples of this? It sounds like there is some prior art to examine.

These are the messages you linked before - the ones with variables contain text like $1 which is filled in by the software.

[...]
Thanks for investigating this! I really want to include as much config as possible in one place so I'm going to prod a little further here. While it's true that interface messages are handled this way elsewhere currently, is there any way we could imagine having messages with arguments in the on-wiki config we're designing here, rather than having it off in an interface message somewhere?

[...]
Some/many MediaWiki interface strings exist with parameters which can be edited by admins.
[...]

Can you provide some examples of this? It sounds like there is some prior art to examine.

These are the messages you linked before - the ones with variables contain text like $1 which is filled in by the software.

Ah, thank you! I thought you might be referring to some other mechanism of which I was unaware.

Okay, it sounds like the spike is still warranted. The way we do this now may end up getting thrown out later, but I don't think that's a big deal. For now I'll implement the config string as requested along with the edge case checks that I mentioned in earlier comments.

Change #1017375 had a related patch set uploaded (by Jsn.sherman; author: Jsn.sherman):

[mediawiki/extensions/AutoModerator@master] [WIP] Automoderator config: Edit summary

https://gerrit.wikimedia.org/r/1017375

We'll need to check this next week, but I added really basic templating in my WIP patch

image.png (114×945 px, 23 KB)

During our review of this ticket today we identified that there are more implications of how we handle edit summaries than previously identified, which have knock-on user-facing implications. I'm going to describe this verbosely here to help me figure out what, if anything, we need to do.

There are two pieces of metadata currently inserted into default edit summaries:

  • The revision ID of the reverted edit (e.g. 123456), which is used to construct a Special:Diff link to the reverted edit.
    • This is $1 below.
  • The username of the reverted editor (e.g. Sam-Walton), which is used to build a link to the reverted user's user and user talk pages.
    • Below, $2 is generally the username (e.g. Samwalton9), but it's a bit different for imported edits. For those, $2 is the language-code-formatted full username title (e.g. fr:User:Samwalton9) and $3 is the locally generated username (e.g. fr>Samwalton9).

Given these pieces of data, there are a number of different cases in user edit reverting behaviour, in which these data need to be used in different ways:

  • The reverted user exists and is logged in to an account (Undo revision [[Special:Diff/$1|$1]] by [[Special:Contributions/$2|$2]] ([[User talk:$2|talk]]))
  • The reverted user exists but is not logged in to an account (Undo revision [[Special:Diff/$1|$1]] by [[Special:Contributions/$2|$2]])
    • This removes the talk page link.
  • The reverted edit was imported to the wiki without assigning the edit to a local user with the same username (Undo imported revision [[Special:Diff/$1|$1]] by user [[:$2|$3]])
    • See this test for an example of what this looks like.
  • The reverted edit was imported to the wiki, and a "valid interwiki prefix of the source could not be ascertained" (Undo imported revision [[Special:Diff/$1|$1]] by user $2)
  • The reverted user's username was deleted prior to the revert being initiated (Undo revision [[Special:Diff/$1|$1]] by a hidden user)
    • Test example here.

We could theoretically (and this is how the patch is currently designed) have users define edit summaries for each of these 5 cases. I don't think that's desirable because understanding and defining each of these summaries doesn't feel like valuably-spent time for moderators, so I'd like to think about how we could simplify this.

I'm not particularly motivated by the idea that anonymous (soon temporary) and logged-in users should be treated differently - the only change made in the default summaries is to remove the talk page link, but in general editors want to be able to communicate with other editor, even logged-out/temporary ones, and there's no technical reason to distinguish these cases as I see it.

Hidden users are the case where there's a clear technical reason we can't/shouldn't include the username. It seems like we would need to define cases both with and without a username at a minimum.

In terms of imported edits there is again a potential special case, since we need to generate an interwiki link. But would Automoderator ever be reverting an imported edit? i.e. do edits generated via Special:Import trigger Automoderator review? If not we could remove these cases from consideration.

If we can treat edits the same between logged in and out, and if we didn't need to worry about imported edits, we would only need two cases - when a username is available and when it isn't.

Thoughts on the accuracy of the above very welcome :)

I'm not particularly motivated by the idea that anonymous (soon temporary) and logged-in users should be treated differently - the only change made in the default summaries is to remove the talk page link, but in general editors want to be able to communicate with other editor, even logged-out/temporary ones, and there's no technical reason to distinguish these cases as I see it.

Edit summaries have a length limit and IPv6 strings of hex numbers hit it when you have to consider the link piping. It's also practically useless to try to talk to the vast majority because they rotate as quickly as they do. Which may or may not be improved by temporary accounts (no one on the project has tried to demonstrate how many temporary accounts will be created in comparison to how many IPs edit though I think there has been an estimate or two).

I'm not particularly motivated by the idea that anonymous (soon temporary) and logged-in users should be treated differently - the only change made in the default summaries is to remove the talk page link, but in general editors want to be able to communicate with other editor, even logged-out/temporary ones, and there's no technical reason to distinguish these cases as I see it.

Edit summaries have a length limit and IPv6 strings of hex numbers hit it when you have to consider the link piping.

Ah I hadn't considered length limits, but from what I can see edit summaries can be 800 characters. An example undo revision for an IPv6 editor, reverting to another edit by an IPv6 (randomly found on enwiki) is Reverted 1 edit by [[Special:Contributions/2604:3D09:117B:C900:8C64:4C9:582E:A497|2604:3D09:117B:C900:8C64:4C9:582E:A497]] ([[User talk: 2604:3D09:117B:C900:8C64:4C9:582E:A497|talk]]) to last revision by 2601:195:C001:2630:C8C6:D3D:326E:7846 which is only 240 characters. Am I missing something?

It's also practically useless to try to talk to the vast majority because they rotate as quickly as they do. Which may or may not be improved by temporary accounts (no one on the project has tried to demonstrate how many temporary accounts will be created in comparison to how many IPs edit though I think there has been an estimate or two).

I actually notice that on enwiki reverts of anon users seem to have the talk page link in them anyway.

Hidden users are the case where there's a clear technical reason we can't/shouldn't include the username. It seems like we would need to define cases both with and without a username at a minimum.

Susana just mentioned to me that we're currently skipping edits where the username/edit has been revision deleted. It makes sense to me, actually, because if revision deletion is happening an admin has looked at the edit and is already taking moderation actions on it - if the edit should be reverted we can trust that they're also going to take that action.

So maybe we only need one (or two, pending the discussion above about anon users) summary?

Sample edit

Don't forget Unicode has other plans for the length of other character sets. :)

We also encourage some substance in undo edits beyond just the boilerplate.

What I don't recall is the summary limit being that high, even post-work done to support Unicode scripts better. They sometimes had a bare 64 characters worth of support in their summaries.

Sample edit

What I don't recall is the summary limit being that high, even post-work done to support Unicode scripts better. They sometimes had a bare 64 characters worth of support in their summaries.

https://meta.wikimedia.org/wiki/Help:Edit_summary#Properties says 800 characters, though I think you're right it used to be shorter. T6715 might be where it was changed, from a quick search.

In terms of imported edits there is again a potential special case, since we need to generate an interwiki link. But would Automoderator ever be reverting an imported edit? i.e. do edits generated via Special:Import trigger Automoderator review? If not we could remove these cases from consideration.

We currently implement the RevisionFromEditComplete hook, which imports also implement via the ImportReporter callback:
https://doc.wikimedia.org/mediawiki-core/master/php/classImportReporter.html

The RecentChangeSave hook (which we may switch to), is implemented by the rebuild recent changes script:
https://www.mediawiki.org/wiki/Manual:Rebuildrecentchanges.php

Meaning that if we wish to skip imported revisions, we need to add that to the precheck.

[...]

Susana just mentioned to me that we're currently skipping edits where the username/edit has been revision deleted. It makes sense to me, actually, because if revision deletion is happening an admin has looked at the edit and is already taking moderation actions on it - if the edit should be reverted we can trust that they're also going to take that action.

[...]

Looking at the code, I think these are two slightly different checks. The precheck is seeing if there is a user identity associated with the revision, the edit summary check is seeing if there is an accessible username associated with the user identity. I don't know what the practical distinction is at the moment.

[...]

Susana just mentioned to me that we're currently skipping edits where the username/edit has been revision deleted. It makes sense to me, actually, because if revision deletion is happening an admin has looked at the edit and is already taking moderation actions on it - if the edit should be reverted we can trust that they're also going to take that action.

[...]

Looking at the code, I think these are two slightly different checks. The precheck is seeing if there is a user identity associated with the revision, the edit summary check is seeing if there is an accessible username associated with the user identity. I don't know what the practical distinction is at the moment.

The user check is done in the hook even before the pre-check, so there may be a chance that the code will not get that far into creating an edit summary.

[...]

Susana just mentioned to me that we're currently skipping edits where the username/edit has been revision deleted. It makes sense to me, actually, because if revision deletion is happening an admin has looked at the edit and is already taking moderation actions on it - if the edit should be reverted we can trust that they're also going to take that action.

[...]

Looking at the code, I think these are two slightly different checks. The precheck is seeing if there is a user identity associated with the revision, the edit summary check is seeing if there is an accessible username associated with the user identity. I don't know what the practical distinction is at the moment.

The user check is done in the hook even before the pre-check, so there may be a chance that the code will not get that far into creating an edit summary.

I went on a rather long goosechase and couldn't come up with a situation where the username would be null. I looked at this again, and y'all are both right about this: it's not checking for the presence of a User::getName() return value, it's checking for a user and then using the getName() return value. This should be absolutely covered in the hook guard clause and the signature for the Revision check; I'll strip it out!

In terms of imported edits there is again a potential special case, since we need to generate an interwiki link. But would Automoderator ever be reverting an imported edit? i.e. do edits generated via Special:Import trigger Automoderator review? If not we could remove these cases from consideration.

We currently implement the RevisionFromEditComplete hook, which imports also implement via the ImportReporter callback:
https://doc.wikimedia.org/mediawiki-core/master/php/classImportReporter.html

The RecentChangeSave hook (which we may switch to), is implemented by the rebuild recent changes script:
https://www.mediawiki.org/wiki/Manual:Rebuildrecentchanges.php

Meaning that if we wish to skip imported revisions, we need to add that to the precheck.

I'm actually going to move ahead with my own suggestion and add imported revisions to the precheck; It's well outside our pilot use case and could have its own complications (e.g. what if someone imported a set of revisions that already included automoderator reverts and had automoderator running during the import?)

In terms of imported edits there is again a potential special case, since we need to generate an interwiki link. But would Automoderator ever be reverting an imported edit? i.e. do edits generated via Special:Import trigger Automoderator review? If not we could remove these cases from consideration.

We currently implement the RevisionFromEditComplete hook, which imports also implement via the ImportReporter callback:
https://doc.wikimedia.org/mediawiki-core/master/php/classImportReporter.html

The RecentChangeSave hook (which we may switch to), is implemented by the rebuild recent changes script:
https://www.mediawiki.org/wiki/Manual:Rebuildrecentchanges.php

Meaning that if we wish to skip imported revisions, we need to add that to the precheck.

I'm actually going to move ahead with my own suggestion and add imported revisions to the precheck; It's well outside our pilot use case and could have its own complications (e.g. what if someone imported a set of revisions that already included automoderator reverts and had automoderator running during the import?)

Agreed with this approach - from what I can see it's generally only admins who can import a page anyway, so importing vandalism seems unlikely even not considering the weirdness of checking each of a set of imported edits.

Given this, the skip for revision-deleted edits, and seemingly no good reason to differentiate anon edits, I think we're back to a single edit summary?

[...]

Given this, the skip for revision-deleted edits, and seemingly no good reason to differentiate anon edits, I think we're back to a single edit summary?

We should keep distinct behavior for anonymous edits in order to respect a wiki's DisableAnonTalk config, which is how autogenerated edit summaries are handled elsewhere. If anontalk is enabled on a wiki, the regular summary gets used.

As an aside, I tested super long usernames and they cause the edit summary to be truncated, eg.:

image.png (160×639 px, 34 KB)

[...]

Given this, the skip for revision-deleted edits, and seemingly no good reason to differentiate anon edits, I think we're back to a single edit summary?

We should keep distinct behavior for anonymous edits in order to respect a wiki's DisableAnonTalk config, which is how autogenerated edit summaries are handled elsewhere. If anontalk is enabled on a wiki, the regular summary gets used.

Oh interesting - is DisableAnonTalk used on any production WMF wikis? I don't see it in CommonSettings or InitialiseSettings but not sure if I should be looking elsewhere.

As an aside, I tested super long usernames and they cause the edit summary to be truncated, eg.:

Thanks - we should think about character limits in the Community Configuration form.

[...]

Oh interesting - is DisableAnonTalk used on any production WMF wikis? I don't see it in CommonSettings or InitialiseSettings but not sure if I should be looking elsewhere.

[...]
I am not aware of it's use on any of our wikis.

[...]

Oh interesting - is DisableAnonTalk used on any production WMF wikis? I don't see it in CommonSettings or InitialiseSettings but not sure if I should be looking elsewhere.

[...]
I am not aware of it's use on any of our wikis.

I searched around Meta, MediaWiki.org, and Phabricator and also can't find any indication of it being used. Since Automoderator only needs to support production wikis can we ignore it?

[...]

Oh interesting - is DisableAnonTalk used on any production WMF wikis? I don't see it in CommonSettings or InitialiseSettings but not sure if I should be looking elsewhere.

[...]
I am not aware of it's use on any of our wikis.

I searched around Meta, MediaWiki.org, and Phabricator and also can't find any indication of it being used. Since Automoderator only needs to support production wikis can we ignore it?

Effectively, yes. We'll handle the case in code, but our AutoModeratorUndoSummaryAnon setting will go unused unless some wiki disables anon talk in the future.

Change #1017375 merged by jenkins-bot:

[mediawiki/extensions/AutoModerator@master] Automoderator config: Edit summary

https://gerrit.wikimedia.org/r/1017375