Description
See https://office.wikimedia.org/w/index.php?title=User:Ssastry/VE_Test&type=revision&diff=261445&oldid=261444 for a reproducible test case.
Related Objects
- Mentioned In
- T333673: 'Add links' saves 0 links, makes a dummy edit
- Mentioned Here
- T151367: Investigate the usage of gallery image syntax before enabling HTML editing
- T211895: Visual editor shouldn't add the File: prefix when missing in a gallery
- T214649: VE's gallery representation differs enough so that selser is never applied?
- rEVED26ebdd07960b: Merge "ve.init.mw.MobileArticleTarget: Improve toolbar scrolling behavior on…
- rMWa473ba08a21b: Update git submodules
- T237040: Visual editor changing automatically something in text while saving
Event Timeline
officewiki is running 1.35.0-wmf.5 (rMWa473ba08a21b) and VisualEditor 0.1.1 (26ebdd0) 14:26, 5 November 2019.
The revert for T237040 was cherry picked to branch wmf/1.35.0-wmf.2 as commit 87ef3e53e533c9565d226e6a48ed70e673d636d1: https://gerrit.wikimedia.org/r/543956
So this issue shouldn't be caused by T237040, AFAICT.
Same problem on enwiki, which is running Parsoid/JS. See https://en.wikipedia.org/w/index.php?title=User:SSastry_(WMF)/VE_Test/Sandbox&type=revision&diff=927959688&oldid=927959370
Seems to be present on both enwiki and officewiki, so not a Parsoid/PHP issue.
But doesn't occur in straight wt2wt:
$ echo '[[Media:CBQ_RPO_1938.jpg|caption]]' | bin/parse.js --wt2wt
[[Media:CBQ_RPO_1938.jpg|caption]]
VE sends HTML like this back to Parsoid:
<body id="mwAA" class="mw-content-ltr sitedir-ltr ltr mw-body-content parsoid-body mediawiki mw-parser-output" dir="ltr" lang="en"><p id="mwAg"><a href="./Media:CBQ_RPO_1938.jpg" rel="mw:WikiLink" resource="./Media:CBQ_RPO_1938.jpg" title="CBQ RPO 1938.jpg" id="mwAw">This is a caption</a></p> <p id="mwBA">xyz</p></body>
while Parsoid's wt2wt has HTML like this at the midpoint:
<p data-parsoid='{"dsr":[0,34,0,0]}'><a rel="mw:MediaLink" href="//upload.wikimedia.org/wikipedia/en/f/fb/CBQ_RPO_1938.jpg" resource="./Media:CBQ_RPO_1938.jpg" title="CBQ RPO 1938.jpg" data-parsoid='{"a":{"resource":"./Media:CBQ_RPO_1938.jpg"},"sa":{"resource":"Media:CBQ_RPO_1938.jpg"},"dsr":[0,34,null,null]}'>caption</a></p>
It's not clear where the spaces in the title are coming from: they aren't present in the href, resource, or data-parsoid. The only place we have spaces is the title attribute.
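To make the difference between the two anchors quoted above easier to see, here is a minimal Python sketch that parses both and prints the attributes that differ. The helper names (FirstAnchorAttrs, anchor_attrs, diff_attrs) are made up for this illustration and are not part of Parsoid or VE. The two strings are copied from the snippets in this thread, with quotes unescaped and data-parsoid left out. Whether the rel/href difference is what actually causes the dirty diff is not established here; the sketch only surfaces it.

```python
from html.parser import HTMLParser


class FirstAnchorAttrs(HTMLParser):
    """Collect the attributes of the first <a> element encountered."""

    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self.attrs is None:
            self.attrs = dict(attrs)


def anchor_attrs(html):
    """Return the attribute dict of the first <a> in html (empty if none)."""
    parser = FirstAnchorAttrs()
    parser.feed(html)
    parser.close()
    return parser.attrs or {}


def diff_attrs(left, right, ignore=("id", "data-parsoid")):
    """Return {name: (left_value, right_value)} for attributes that differ."""
    names = (set(left) | set(right)) - set(ignore)
    return {
        name: (left.get(name), right.get(name))
        for name in sorted(names)
        if left.get(name) != right.get(name)
    }


# Anchor as sent back by VE (quotes unescaped), copied from this thread.
ve_html = (
    '<a href="./Media:CBQ_RPO_1938.jpg" rel="mw:WikiLink" '
    'resource="./Media:CBQ_RPO_1938.jpg" title="CBQ RPO 1938.jpg" '
    'id="mwAw">This is a caption</a>'
)

# Anchor from the Parsoid wt2wt midpoint; data-parsoid omitted for brevity.
parsoid_html = (
    '<a rel="mw:MediaLink" '
    'href="//upload.wikimedia.org/wikipedia/en/f/fb/CBQ_RPO_1938.jpg" '
    'resource="./Media:CBQ_RPO_1938.jpg" title="CBQ RPO 1938.jpg">caption</a>'
)

differences = diff_attrs(anchor_attrs(ve_html), anchor_attrs(parsoid_html))
for name, (ve_value, parsoid_value) in differences.items():
    print(f"{name}: VE={ve_value!r} Parsoid={parsoid_value!r}")
```

With these two inputs, only rel and href differ; resource and title are identical in both.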
I don't know whether it makes any difference, but I'd like to point out that this also happens for media links inside <gallery> tags, see example.