Page MenuHomePhabricator

Massmessage's content handler sometimes parses the delivery list as an empty HTML
Closed, ResolvedPublicBUG REPORT

Description

List of steps to reproduce

  1. Create an empty mass-message delivery list via Special:ChangeContentModel anywhere on the wiki (I used User:Martin Urbanec/Massmessage bug to reproduce)
  2. Go to the newly-created delivery list
  3. Add any user talk page to the delivery list
  4. Refresh the page

See screencast at https://commons.wikimedia.org/wiki/File:Screencast_reproducing_a_massmessage_bug_--_T299896.webm.

What happens?:

The page is empty.

What should have happened instead?:

Editing interface should be shown, like the one below (copied from https://meta.wikimedia.org/wiki/Global_message_delivery/Targets/Empty1):

image.png (451×1 px, 38 KB)

Early investigation notes

I'm 95% sure this is what's happening:

  1. After user adds a user talk page to the delivery list, MassMessage's MassMessageListContentHandler::fillParserOutput is called in a way that makes $cpoParams->getGenerateHtml() return false.
  2. Because of the if at line 219 and else branch at line 238, MassMessageListContentHandler::fillParserOutput returns an empty HTML for this call.
  3. This empty HTML is saved into the parser cache for the revision in question
  4. No actual parsing happens when the user refreshes the page (steps reproduce, point 4); parser cache is consulted, returns an empty (but otherwise valid) HTML and this response is sent to the user.

This is supported by my shell.php fiddling in production:

>>> $page = \MediaWiki\MediaWikiServices::getInstance()->getWikiPageFactory()->newFromTitle(Title::newFromText('User:Martin Urbanec/Massmessage bug'))
=> WikiPage {#3277
      mTitle: Title {#3266
        mArticleID: -1,
        prefixedText: null,
        mDefaultNamespace: 0,
        mRedirect: null,
     },
      mDataLoaded: false,
      mLatest: false,
      mPreparedEdit: false,
   }
>>> $parserOutput = $page->getParserOutput( ParserOptions::newFromAnon() ) # parsercache is consulted here, empty HTML is returned
=> ParserOutput {#510}
>>> $parserOutput->getText()
=> "<div class="mw-parser-output"></div>"
>>>
>>> $parserOutputNoCache = $page->getParserOutput( ParserOptions::newFromAnon(), null, true )
=> ParserOutput {#5791}
>>> $parserOutputNoCache->getText()
=> "<div class="mw-parser-output"><div lang="en" dir="ltr" class="mw-content-ltr"></div><div id="mw-massmessage-addpages"><h2>Add pages</h2><form id="mw-massmessage-addform"><label for="mw-massmessage-addtitle">Title:</label><input id="mw-massmessage-addtitle" name="title"/><label for="mw-massmessage-addsite">Site:</label><input id="mw-massmessage-addsite" placeholder="meta.wikimedia.org" name="site"/><input id="mw-massmessage-addsubmit" type="submit" value="Add page" name="submit"/></form><div id="mw-massmessage-addedlist"><p>Pages added:</p><ul></ul></div></div><h2>Pages in this list</h2><ul><li><span class="mw-massmessage-targetlink"><a href="/wiki/User_talk:Martin_Urbanec_(test)" class="mw-redirect" title="User talk:Martin Urbanec (test)">User talk:Martin Urbanec (test)</a></span><span class="mw-massmessage-removelink">(<a data-title="User talk:Martin Urbanec (test)" data-site="local" href="#">remove</a>)</span></li></ul></div>"
>>>

Once parser cache was left out of the game, correct HTML was returned.

I have very limited knowledge about parsing, so I don't know why $cpoParams->getGenerateHtml() returns false, but it seems to be a plausible explanation of what's happening here to me.

Tagging both MassMessage and MediaWiki-Parser, as fixing this task is likely going to require advanced knowledge about how parsing works in MW, rather than MM itself works (as demonstrated by the "do not use cache" flag above, MM's own code works as expected).

Event Timeline

Hi everyone! Are there any updates on this problem?

Hi, there are no updates, otherwise they would be listed here. :)

Thank you! Was also my way of bumping the ticket, in case it has been forgotten :D.

@Ladsgroup or @roman-stolar do you have any idea what might be going on here, or what recent core changes might cause something like this? (The content model logic in MassMessage hasn't been touched in a while.) I couldn't reproduce the issue on my local test wiki, and I haven't had much luck tracing the code to see what is setting $generateHtml to false.

I found https://gerrit.wikimedia.org/r/c/mediawiki/core/ /713268 which seems like it might be related, and I'm wondering if something might have been broken in the refactor in https://gerrit.wikimedia.org/r/c/mediawiki/core/ /714561, but I'm out of my depth here.

(This issue was also reported at https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#Signpost_sign-up_page which was why I started looking.)

My refactor can't be at fault here:

  • It's a very old refactor
  • That function is only false for wikibase data model, not for mass massage.

Sorry, I didn't mean to imply the refactor was at fault, just trying to document what I've looked at in case it was relevant. Do you have suggestions for how to investigate further or reproduce the issue? It seems pretty unlikely that this was broken by a change to MassMessage (the content model logic hasn't meaningfully changed in years: https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/MassMessage/ log/refs/heads/master/includes/Content), but I got pretty lost trying to navigate the relevant code in core...

Do you have mediawiki installed locally? You could simply install massmessage and play with the code. It can be also the API call failing (double check it).

I might be able to spend some time on it but I admit, I'm drowning in list of work that needs to be done.

I do have a local test wiki (tracking the master branch), but I wasn't able to reproduce the issue by following the steps in the task description. @Urbanecm if you're able to reproduce the issue locally it would be great if you could share the relevant configuration in your LocalSettings.php (or if you have other idea why this might not be consistently reproducible); thanks!

My theory is that this is caused by https://github.com/wikimedia/mediawiki-extensions-MassMessage/blob/master/includes/Content/MassMessageListContentHandler.php#L238 - calling $parserOutput->setText('') for the case where generateHtml is false. The docs say you should do $parserOutput->setText(null) in that case. (I would also note, that ParserOutput constructor defaults to text being '' if you don't set it at all instead of null, which seems really dangerous and wrong)

However, I have low confidence as I was not able to reproduce this locally. I also wasn't able to come up with a theory of what code flow could be involved. The best I could come up with is that somehow wikidata or MCR config on WMF makes MassMessage page have more than one slot, which triggers a different code path in Revision::RevisionRenderer then what happens locally, which is more dependent on $parserOutput->hasText() giving the right answer. But i'm also pretty sure this pages only have one slot on WMF wikis, so i have no idea.

Second theory - something parses with generate-html false (Similar to what TemplateData does on save, except it only does that wikitext content model), and that parse gets saved in RenderedRevision's revisionOutput at some point. MassMessage sets the text to '' so RenderedRevision thinks that it actually did generate html, so reuses the revision. But I still can't find any plausible code path for that

Change 784650 had a related patch set uploaded (by Brian Wolff; author: Brian Wolff):

[mediawiki/extensions/MassMessage@master] Use $parserOutput->setText( null ) when not generating html

https://gerrit.wikimedia.org/r/784650

Ok, steps to reproduce locally (This is contrived, I don't know if this is what's happening on cluster. However, MediaWiki\Extension\TemplateData\Hooks::onMultiContentSave does something very similar to this, so i believe something along these is plausible):

Add the following to LocalSettings.php

$wgHooks['MultiContentSave'][] = function ( $renderedRevision ) {
	$parserOutput = $renderedRevision->getRevisionParserOutput( [ 'generate-html' => false ] );
};

This causes a parser output with generate-html (but ->hasText() still returning true) to be cached, and the bug happens.

I think the reason this is only happening to MassMessage, is it seems like one of the only content handlers (except maybe wikidata) where both isParserCacheSupported() returns true, and if generate-html is false, it actually listens and doesn't generate html.

Change 784650 merged by jenkins-bot:

[mediawiki/extensions/MassMessage@master] Remove generate-html false optimization

https://gerrit.wikimedia.org/r/784650

T306591 is the followup task for the suspected issue in core.

I think i figured out the actual cause on cluster - SpamBlacklist extension. I believe 3cb265f1 is the change that triggered this issue.

Legoktm subscribed.

Thank you Bawolff and wctaiwan for working on this and figuring it out! Not closing this in case there's still some investigation or follow-up to be done..

Change 785215 had a related patch set uploaded (by Legoktm; author: Brian Wolff):

[mediawiki/extensions/MassMessage@REL1_38] Remove generate-html false optimization

https://gerrit.wikimedia.org/r/785215

Change 785215 merged by jenkins-bot:

[mediawiki/extensions/MassMessage@REL1_38] Remove generate-html false optimization

https://gerrit.wikimedia.org/r/785215

A big thank you to everyone who worked towards resolving this! Thanks so much!

As a feedback, I just tried to add the village pump "Wikipedia:Kuvendi" from sq.wikipedia.org and I keep getting an error saying "title is invalid" and "page doesn't exist". Maybe something to investigate? Under normal circumstance, this tool doesn't even flag invalid namespaces for other users/pages.

A big thank you to everyone who worked towards resolving this! Thanks so much!

As a feedback, I just tried to add the village pump "Wikipedia:Kuvendi" from sq.wikipedia.org and I keep getting an error saying "title is invalid" and "page doesn't exist". Maybe something to investigate? Under normal circumstance, this tool doesn't even flag invalid namespaces for other users/pages.

Which mailing list did you try this on?

If there are other issues we should use a new bug for them.

Afaik all follow up is done now