Page MenuHomePhabricator

[GTWL] Investigate the technical requirements for mention failure notifications
Closed, ResolvedPublic8 Estimated Story Points

Description

We want to find out

  • Is it possible to find out which mentions have been sent and which haven't?
  • Is it possible to get the reasons why a mention has not been sent?
  • Is it possible to notify the sender after the mentions have been processed, with the mention's status / error messages as well as a pointer to where the mention was made?
  • Is there work that would need to be done before we can start implementing this story?

Expected outcome:
An idea how to implement mention notifications, with estimates how long this would take

Event Timeline

Tobi_WMDE_SW set the point value for this task to 8.May 19 2016, 2:29 PM
Tobi_WMDE_SW moved this task from Proposed to Backlog on the TCB-Team-Sprint-2016-05-19 board.

Technical breakdown

So the mentions are generated from the Echo extension (which we already knew) from within the onArticleSaved hook (so whenever an edit is made).
The events are generated as a deferred update which means they will likely complete after the HTTP request completed and the page has been served to the user.

Mentions are then generated in DiscussionParser::generateMentionEvents.

			// we should not add user to 'mention' notification list if
			// 1. the user name is not valid
			// 2. the user mentions themselves
			// 3. the user is the owner of the talk page
			// 4. user is anonymous

Also

			// If more than 50 users are being pinged this is likely a spam/attack vector
			// Don't send any mention notifications.

Echo events are created for the mentions using EchoEvent::create
Here two hooks could be of use, BeforeEchoEventInsert and EchoEventInsertComplete in between which the event is inserted into the echo_event table.

EchoNotificationController::notify is then called with $wgEchoUseJobQueue deciding if the actual notification is deferred to the job queue or not.
This in turn can generate a 'web' notification and or an 'email' notification.
For 'web' EchoNotifier::notifyWithNotification is called with a single user and a single event.
If the type of notification is allowed by 'web' (in this case 'notify') then EchoNotification::create is called.
This then adds a new row to the echo_notification table!

We want to find out
Q: Is it possible to find out which mentions have been sent and which haven't?
A: Yes, it is possible to find out to whom notifications have been sent about events using a simple DB query. Although this is not an instant thing and due to the use of the job queue notifications could take seconds to send or maybe even hours. It should also be noted that notifications are removed from the notification table after a time, I think the maximum is 100 notifications.

Q: Is it possible to get the reasons why a mention has not been sent?
A: Well, the the points in the code where the errors happen of course. In the case of creating the mention event, the event will not be created if more than 50 notifications will happen, but as said in the technical walk through above this happens after the save page page has already been served to the user so probably not useful for us.

Q: Is it possible to notify the sender after the mentions have been processed, with the mention's status / error messages as well as a pointer to where the mention was made?
A: It would probably be possible to create some sort of 'mention-fail' echo event which could in turn we sent to the use that actually made the mention. It would probably be possible to do this for most errors? Allot of thought should likely be put into this if this is a decided path.

Q: Is there work that would need to be done before we can start implementing this story?
A: This entirely depends on how we want to implement the story! And also what work we would count as implementing the story. Right now the two possible solutions that jump into the front of my mind are:

  1. Create some page where people can go to to check the state of notifications for an echo event? Although providing the user with a link to this page on page save again is hard due to deferred updates.
  2. Create an echo event on failure of the mention echo event. (This probably has 2 granularities, failure of the whole echo event so all mentions, or failure on a single notification)
  3. Log failures of these things on Special:Log. Mentions have a Performer (the person that made the edit that mentioned people) and Mentions have a Target (the page that the mention occurred on). I see no reason we couldn't have a Mention Log of some sort that would log failed mentions.

Looking outside of the scope of this task slightly it may also be nice to have a general special page where people can see echo events, look at the states of notifications for those events etc. Although this may not even make sense for some types of events etc.
It is probably worth chatting to some people that have worked on Echo about this.

Change 290062 had a related patch set uploaded (by Addshore):
Track why mentions didn't happen

https://gerrit.wikimedia.org/r/290062

Addshore renamed this task from [GTWL] Investigate the technical requirements for mention notifications to [GTWL] Investigate the technical requirements for mention failure notifications.May 23 2016, 10:47 AM

Change 290198 had a related patch set uploaded (by Addshore):
DRAFT Echo notifications on mention failure

https://gerrit.wikimedia.org/r/290198

So after doing some further investigation I came up with the patch linked above which essentially introduces a rough shell of the requested feature.

Further questions include:

  1. Which error cases should the user be notified about (right now I have stuck with when too many people have been mentioned but we may want to cover more cases)
  2. What icon should the notification have? (Right now I am using the same icon as the mention notifications)
  3. Do we want these notifications to be available via email? (Right now I have turned this off).
  4. What do we want the messages to be / includes? (See the new messages introduced in https://gerrit.wikimedia.org/r/#/c/290198/2/i18n/en.json )
  1. Which error cases should the user be notified about (right now I have stuck with when too many people have been mentioned but we may want to cover more cases)

If at all possible, the signature failure case (must include a valid signature in the same edit) would be helpful to include, as new(-to-Echo) users often get confused by that.

@Catrope had concerns why the plan is to use echo notifications instead of adding it to the save notice. @Addshore could you explain our reasoning behind that?

@Catrope had concerns why the plan is to use echo notifications instead of adding it to the save notice. @Addshore could you explain our reasoning behind that?

In Echo/Hooks.php

		// Try to do this after the HTTP response
		DeferredUpdates::addCallableUpdate( function () use ( $revision ) {
			EchoDiscussionParser::generateEventsForRevision( $revision );
		} );

So events for revisions are currently done after a HTTP response is served.
So adding messages as to the page loading after a save would mean creating events before serving the HTTP response meaning longer page loads.
Of course, if this processing was no longer deferred then the first set of messages / errors (1,2,3,4 and more than 50 users) in my above comment could be served on the save page.

Change 290062 merged by jenkins-bot:
Track why mentions didn't happen

https://gerrit.wikimedia.org/r/290062

@Catrope had concerns why the plan is to use echo notifications instead of adding it to the save notice. @Addshore could you explain our reasoning behind that?

In Echo/Hooks.php

		// Try to do this after the HTTP response
		DeferredUpdates::addCallableUpdate( function () use ( $revision ) {
			EchoDiscussionParser::generateEventsForRevision( $revision );
		} );

So events for revisions are currently done after a HTTP response is served.
So adding messages as to the page loading after a save would mean creating events before serving the HTTP response meaning longer page loads.
Of course, if this processing was no longer deferred then the first set of messages / errors (1,2,3,4 and more than 50 users) in my above comment could be served on the save page.

Good point. I suppose if we wanted to do a save notice, we'd have to refactor this code so that getting the list of mentioned users is not deferred (building and sending the mention events could still be deferred), and that would make saves slower.

I think that https://www.mediawiki.org/wiki/User:Legoktm/pings is probably a better long-term strategy.

Summarizing last Thursday's meeting: @Addshore will look into the signature failures:

  • signature was embedded in a template
  • signature is not the same as ~~~ or ~~~~would produce (or not existing)
  • the chunk contained a signature not from self

@Catrope could you help us out with answers to a few questions we have regarding the reason for mention failures as listed in [1]?

  • The diff chunk must be recognised as an addition of new lines of text, not a change to existing lines.: Apparently, adding a signature or a mention is not recognized as "something new". Why is it not possible to let somebody know about a change?
  • The comment must either belong in its entirety to existing sections[...], or start a new section: Why? Is it for knowing exactly where to link to?
  • sufficiently tricky markup can trigger bogus results: could you give us an example for that?

[1] ] https://www.mediawiki.org/wiki/Manual:Echo#Technical_details

@Catrope could you help us out with answers to a few questions we have regarding the reason for mention failures as listed in [1]?

  • The diff chunk must be recognised as an addition of new lines of text, not a change to existing lines.: Apparently, adding a signature or a mention is not recognized as "something new". Why is it not possible to let somebody know about a change?
  • The comment must either belong in its entirety to existing sections[...], or start a new section: Why? Is it for knowing exactly where to link to?
  • sufficiently tricky markup can trigger bogus results: could you give us an example for that?

These things are all from well before my time, so I don't know exactly why these limitations are there. I suspect that most of them are heuristics to try to weed out cases where someone adds a link to a user page in a context that's not a discussion. It's not as simple as checking whether the page is in a talk namespace, because there are pages in non-talk namespaces (e.g. Wikipedia:Village pump) that are used as discussion pages.

In the short term, we can loosen these requirements, so long as we don't cause lots of false positives (i.e. people getting notified because someone linked to their user page somewhere). We could also have looser requirements (or none at all) in talk namespaces, but that might be confusing (because then something that works as a mention on Talk:Foo doesn't work on Wikipedia:Articles for Deletion/Foo).

In the long term, I think the best way forward is to do something like what https://www.mediawiki.org/wiki/User:Legoktm/pings proposes: introduce new syntax for mentions (e.g. {{ping:Lea}}) so we don't have to try to figure out if [[User:Lea]] was intended to be a mention or just a link.

Due to concerns by the Notifications team about positive mention notifications, the current plan is to send out mention failure notifications instead

Change 290198 abandoned by Addshore:
DRAFT Echo notifications on mention success

Reason:
This was only part of an investigation https://phabricator.wikimedia.org/T135719#2382798

https://gerrit.wikimedia.org/r/290198