
Create deprecation plan for public parsoid endpoints
Open, High, Public

Description

Parsoid exposes a number of public endpoints through RESTBase. With the sunsetting of RESTBase and the integration of Parsoid into MediaWiki core, these endpoints should be decommissioned. However, they are still in use by external clients, so a deprecation plan is needed.

Usage analysis:

  • /api/rest_v1/page/html/: about 100 req/s https://w.wiki/6YSL
    • main user is Wikimedia Enterprise
    • about 8 req/s from REST-API-Crawler-Google/1.0
    • about 14 req/s without user agent, the vast majority originating from just four IP addresses
  • /api/rest_v1/transform/wikitext/to/html about 6 req/s https://w.wiki/6YSK
    • about 2.5 req/s from REST-API-Crawler-Google/1.0
    • about 2.5 req/s from a fake user agent that starts with "User-Agent", originating from a single IP address.
    • another 0.2 req/s from ServiceChecker-WMF/0.1.2
  • /api/rest_v1/transform/wikitext/to/lint about 1 req/min https://w.wiki/6YSR
    • Nearly all of them from the same IP address
  • /api/rest_v1/transform/html/to/wikitext about 1 req/min https://w.wiki/6YSS
    • Most of them from the same IP that also sends the lint requests.

Top users of /api/rest_v1/page/html:

Screenshot 2023-08-25 102139.png (365×893 px, 17 KB)

(numbers are per day, sampled 1/128)
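
As a rough cross-check on the figures above, a daily count taken from 1/128-sampled logs can be scaled back to an approximate request rate. This is a minimal sketch, assuming uniform sampling and an 86,400-second day; the function name and the input count are illustrative and are not values read from the screenshot.

```python
SAMPLING_FACTOR = 128      # logs are sampled 1/128, per the note above
SECONDS_PER_DAY = 86_400


def estimated_requests_per_second(sampled_daily_count: int) -> float:
    """Scale a 1/128-sampled daily count back to an approximate req/s."""
    return sampled_daily_count * SAMPLING_FACTOR / SECONDS_PER_DAY


# Hypothetical input: 67,500 sampled rows in one day.
rate = estimated_requests_per_second(67_500)
print(f"{rate:.1f} req/s")
```

A sampled count of that size would correspond to roughly the 100 req/s quoted for /api/rest_v1/page/html/.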


Event Timeline

I think the basic plan is this:

  • T335512: Talk to Wikimedia Enterprise and get them to transition from the old restbase endpoint to Daniel's new core page html endpoints
  • T335511: Talk to Google and get them to do a similar transition. This *may* require building a new endpoint for /transform/wikitext/to/html since there is no core equivalent for this right now (although there is one exposed via VE's action API)
  • T335513: For the /wikitext/to/lint and /html/to/wikitext endpoints I think @daniel's suggested plan was to turn these endpoints off for increasing periods of time (1hr, 1 day, 1 week) in the hope that will prompt whoever is using these (probably a community bot of some kind) to surface and file a bug / village pump request / etc and then we can properly evaluate the use and come up with a migration plan.
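
The escalating switch-off idea in T335513 could be sketched as a schedule of outage windows. This is only an illustration: the three durations come from the comment above, while the start date, the one-week recovery gap between outages, and both function names are assumptions rather than anything agreed on this task.

```python
from datetime import datetime, timedelta

# Outage lengths from the comment above: 1 hour, 1 day, 1 week.
BROWNOUT_DURATIONS = [timedelta(hours=1), timedelta(days=1), timedelta(weeks=1)]
# Assumption: time left between outages for affected users to report breakage.
RECOVERY_GAP = timedelta(weeks=1)


def brownout_windows(start: datetime) -> list[tuple[datetime, datetime]]:
    """Return (begin, end) pairs for each increasingly long outage."""
    windows = []
    begin = start
    for duration in BROWNOUT_DURATIONS:
        end = begin + duration
        windows.append((begin, end))
        begin = end + RECOVERY_GAP
    return windows


def is_disabled(now: datetime, windows: list[tuple[datetime, datetime]]) -> bool:
    """True if the endpoint should be switched off at the given moment."""
    return any(begin <= now < end for begin, end in windows)


# Hypothetical start date, for illustration only.
windows = brownout_windows(datetime(2024, 1, 1))
```

Spacing the outages apart gives whoever depends on the endpoints a chance to notice the failure and file a report before the next, longer window begins.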
daniel triaged this task as Medium priority. Jun 5 2023, 6:18 PM
daniel moved this task from Unsorted to Parsoid pile on the RESTBase Sunsetting board.

Going to pop in here and stay ahead of breaking things for editors (since that seems to be the plan with T335513).

A script of mine actively uses /transform/html/to/wikitext and /page/html to perform template modifications without having to rewrite a wikitext parser in JavaScript and have it shipped to the browser on every page load. I've been keeping my eyes on RESTBase sunsetting for a long while now, and I have to ask: is there a migration plan for gadget/script developers? I have been unable to find documentation on new endpoints even after combing through all the Phab tasks related to Parsoid and RESTBase, and that really doesn't bode well for the idea of running a scream test to find out what breaks.
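
For readers unfamiliar with the two endpoints mentioned here, the requests look roughly like the sketch below. It only builds the URL and request body and sends nothing. The host (en.wikipedia.org), the `html` form field name, and the helper names are assumptions for illustration; the paths are the ones listed in the task description.

```python
from urllib.parse import quote, urlencode

# Assumption: the wiki host; other Wikimedia projects would follow the same shape.
BASE = "https://en.wikipedia.org/api/rest_v1"


def page_html_url(title: str) -> str:
    """URL for fetching a page's Parsoid HTML via the RESTBase path."""
    # Spaces become underscores; everything else, including "/", is percent-encoded.
    return f"{BASE}/page/html/{quote(title.replace(' ', '_'), safe='')}"


def html_to_wikitext_request(html: str) -> tuple[str, bytes]:
    """Return (url, form-encoded body) for a POST to the transform endpoint."""
    # Assumption: the endpoint accepts the HTML in a form field named "html".
    body = urlencode({"html": html}).encode("utf-8")
    return f"{BASE}/transform/html/to/wikitext", body


url, body = html_to_wikitext_request("<p>Hi</p>")
```

Any replacement endpoints would need to cover both directions (page to HTML, and edited HTML back to wikitext) for a script like this to migrate.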

mw:Parsoid/API is outdated and doesn't even mention RESTBase deprecation. mw:RESTBase/deprecation and mw:RESTBase/service migration don't mention anything about Parsoid. mw:Manual:Rest.php, which leads to mw:Parsoid#Development, itself says "production WMF servers do not expose the Parsoid REST API to the external network", and it doesn't seem like that's changed. Neither this ticket nor T335513 links to documentation of the sort. I know action=parse&parsoid=true exists, but there doesn't seem to be a (documented) way to perform the inverse conversion (HTML to wikitext), at least in the action API. Is there something I can read that will lead me off of the to-be-deprecated endpoints? Do new endpoints even exist? Some clarity would be appreciated.

MSantos raised the priority of this task from Medium to High. Oct 2 2023, 2:48 PM

Quick note: the task description is somewhat outdated, update to come

There's now T373716: Reroute RESTbase Parsoid endpoints to core's REST endpoints, which is another attempt at solving this problem and hints that the progress of having callers move isn't meeting the necessary timeframes. I'll echo what @Chlod said and add that IMO the biggest issue has been a lack of communication to clients/library authors on what we should be doing, as well as a lack of replies to questions, e.g. T354037.

I agree with you: it hasn't been properly communicated yet, and that's why a formal communication with migration guidelines hasn't been published. That's my fault, and I apologize for the lack of movement here.

T366835: REST: API modularization and versioning (tracking) is the task that will resolve this blocker. Any changes you make against the current API structure would need to change again in the future, and we want to spare clients that extra migration work.

Once this task is unblocked, the plan is to publish proper migration documentation and a reasonable timeline for clients to migrate.

After some further investigation, I believe this ticket doesn't reflect reality. We already have these endpoints migrated to MW Parsoid endpoints; we just need to make sure they are still marked as experimental (as in RESTBase), and deprecation can be ruled out.

I apologize for the noise, I'll remove this from blocked for now and once I have a better understanding of the next steps, I'll probably close this ticket as invalid.

My comments above reflect the portion of work to migrate to MW rest endpoints instead of RESTBase deprecation, which is not ready to be done yet and also unrelated to this task.

> I agree with you: it hasn't been properly communicated yet, and that's why a formal communication with migration guidelines hasn't been published. That's my fault, and I apologize for the lack of movement here.

All good - I think most of us are also interested in seeing RESTBase sunset and Parsoid-in-core celebrated, so the more details we have, even if not finalized and informal, the more we can help and collaborate.

> After some further investigation, I believe this ticket doesn't reflect reality. We already have these endpoints migrated to MW Parsoid endpoints; we just need to make sure they are still marked as experimental (as in RESTBase), and deprecation can be ruled out.

Why would these RESTBase endpoints not be deprecated? Aren't they going to go away?

Eventually, yes. In the meantime the endpoints will continue working, but they will no longer point to RESTBase. Instead, they will point to the REST Gateway, which will connect directly to the services or MW. That won't apply to endpoints that change structure or output, or that depend on RESTBase internals to function. We are still investigating what to do with these endpoints case by case, e.g. T374136: Route /api/rest_v1/page/title endpoints to MediaWiki core.