The consensus on the current state of the Graph extension seems to be that it would be irresponsible to leave it enabled in its current form, where it is embedded in article wikitext and any editor can make it execute an arbitrary graph specification. Vega (the specification language the extension relies on) wasn't really meant to be safely usable with untrusted, user-generated specifications, and there have been a number of XSS vulnerabilities that are easy to exploit when it's used like that. T335450: Decide whether to use vega-interpreter package to avoid runtime eval() doesn't seem to fundamentally change that, and T222807: Sandbox Graph extension into an iframe isn't something we want to stake the safety of Wikimedia users on.
One way forward would be to treat Vega specifications like other dangerous content (JavaScript, CSS) and restrict editing them to a small, trusted set of users. More specifically:
- add some new parameters to the <graph> tag so it can specify a graph template and a graph data source: <graph template="MediaWiki:Piechart.vega" data="commons:Data:GDP.tab" /> (straw proposal, details to be figured out - the data could come from the contents of the tag etc).
- disable the <graph> tag when not used with those parameters
- the template parameter needs to point to a MediaWiki namespace page with some unique naming pattern (*.vega? Vega-*.json?); the editing of such pages would be restricted to users with the vega-editor right, which would initially only be available to trusted users (admins? interface admins?)
- that page would be a graph template: a JSON page matching the Vega spec, with some carefully restricted substitution mechanism for plugging in data, labels, formatting etc. from wikitext authored by less trusted users
- the other parameters would provide the part of the spec to be substituted (data, maybe i18n, maybe things like colors or labels although those all could probably also be provided as data streams)
- there would be some guidance for vega-editors, similar to that for interface administrators, to ensure they manage Vega specifications carefully and don't include anything that could result in data being executed as code.
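To make the substitution idea a bit more concrete, here is a minimal sketch in Python (for illustration only, not the extension's actual code). Everything named here is an assumption: the `.vega` title pattern, the `{"$param": "name"}` placeholder convention, and the rule that parameters must be scalars or flat rows of scalars are all hypothetical choices, and the sample data is made up. The point is just that user-supplied values get dropped in as inert JSON values, never spliced into strings or expressions.

```python
import copy
import re

# Hypothetical naming pattern; the proposal leaves the exact pattern open.
TEMPLATE_TITLE = re.compile(r"^MediaWiki:.+\.vega$")
# Hypothetical placeholder convention: a JSON object {"$param": "<name>"}.
PLACEHOLDER_KEY = "$param"
SCALAR_TYPES = (str, int, float, bool, type(None))


def is_valid_template_title(title):
    """Check that the template parameter points at a restricted page."""
    return TEMPLATE_TITLE.match(title) is not None


def check_inert(name, value):
    """Accept only scalars or a list of flat rows of scalars.

    Nested objects such as {"signal": "..."} are rejected, so untrusted
    input cannot introduce new spec structure.
    """
    if isinstance(value, SCALAR_TYPES):
        return
    if isinstance(value, list) and all(
        isinstance(row, dict)
        and all(isinstance(cell, SCALAR_TYPES) for cell in row.values())
        for row in value
    ):
        return
    raise ValueError(f"parameter {name!r} is not plain data")


def substitute(node, params):
    """Return a copy of the trusted template with placeholders filled in.

    Only whole placeholder objects are replaced; substituted values are
    copied as-is and never scanned for further placeholders.
    """
    if isinstance(node, dict):
        if set(node) == {PLACEHOLDER_KEY}:
            name = node[PLACEHOLDER_KEY]
            if name not in params:
                raise KeyError(f"missing parameter {name!r}")
            check_inert(name, params[name])
            return copy.deepcopy(params[name])
        return {key: substitute(value, params) for key, value in node.items()}
    if isinstance(node, list):
        return [substitute(item, params) for item in node]
    return node


# Made-up template (written by a vega-editor) and parameters (from wikitext).
EXAMPLE_TEMPLATE = {
    "title": {"text": {"$param": "title"}},
    "data": [{"name": "table", "values": {"$param": "data"}}],
    "marks": [],
}
EXAMPLE_PARAMS = {
    "title": "GDP",
    "data": [{"country": "A", "gdp": 1.5}],
}

spec = substitute(EXAMPLE_TEMPLATE, EXAMPLE_PARAMS)
```

Note that this does nothing to stop a template author from putting a placeholder somewhere Vega evaluates strings as expressions; avoiding that would still be up to the vega-editor guidance mentioned above.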
Pro
- This seems relatively easy to keep secure, as long as we trust users with the vega-editor right to know what they are doing (and we already have other user groups with similarly sensitive abilities). Admittedly the part about parameter substitution is very handwavy, but I think it can be figured out.
- Declaring data in a way that's understood by the MediaWiki parser would be a huge step in a more maintainable direction for the extension: data could be cached and invalidated when needed (right now the graph just makes user-defined API requests for the data from readers' browsers, which is a bit of a scalability nightmare), and data sources would play nicely with Special:WhatLinksHere.
- With the Vega spec being separate from wikitext, it would be easy to provide a dedicated developer experience (syntax highlighting, TemplateSandbox-style functionality for testing changes, maybe using the Vega editor).
- If we chose to resurrect graphoid (admittedly this is unlikely to happen), it would probably get rid of the most problematic aspects of the MediaWiki -> graphoid -> MediaWiki dependency loop (see discussion in T211881: graphoid: Code stewardship request).
Con
- It would be harder to gain the ability to edit graph specifications (i.e. anyone could change the data in a graph, but few people could make fundamental changes to how the graph looks or behaves). I don't think this is a big problem - Vega is not a very user-friendly language so not many people edit these anyway, and those who do should easily be able to get permission.
- The specification would be less flexible - right now you can use templates or Lua to make the specification dynamically depend on the parameters of the graph; this would become impossible. Other than the data and maybe a few other predefined modification points, specifications would be static. Not sure if this would significantly impact real-world usage.
- In the past there was something like this (a dedicated namespace for graph definitions) and it got rolled back: T98365: Add / use a Graph namespace, T124747: Deprecate use of Graph namespaces. I haven't found any discussion of why, but maybe there was a good reason it didn't work well?
See also
- Creation of separate user group for editing sitewide CSS/JS for a similar user rights change in the past
- T155813: Decide on storage and delivery method for TemplateStyles CSS for a detailed discussion of pros/cons of templates vs. dedicated pages in a somewhat similar situation