We are looking for ways of improving the handling of common content structures like infoboxes, data tables, citations or navboxes. We'd like to
- make their rendering more adaptable to different devices and use cases,
- improve the editing experience, and
- support the integration of different data sources.
Wikia and others have already done significant work on infoboxes (see links below). The details of those implementations are out of scope for this task; the focus here is on identifying the minimal supporting infrastructure needed to enable such implementations.
Page components
Page components are little more than bits of well-formed HTML along with some metadata. Well-formed HTML with attribute markers lets us cleanly swap out a component's rendering. Metadata documents each component's dependencies, so that we can propagate changes efficiently and reliably. Other metadata, such as page properties and render dependencies, needs to be aggregated when composing a larger page so that, for example, ResourceLoader modules can be loaded for the entire page. Candidates for per-component metadata are:
- ResourceLoader modules needed to render the component.
- Resources used to render this component, for dependency tracking.
  - templates and Scribunto modules
  - images
  - wiki pages (for link rendering)
  - external data sources
- Page metadata, such as
  - categories
  - external links
  - magic word flags
- Caching / storage limitations, for dynamic content
- Other page properties
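To make the candidate list above more concrete, here is a minimal sketch of what a component with its metadata might look like as a data structure. The class names, field names, and example values are all hypothetical and chosen for illustration only; nothing here is an existing MediaWiki or Parsoid format.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class ComponentMetadata:
    """Hypothetical per-component metadata; every field name is an assumption."""
    modules: list = field(default_factory=list)           # ResourceLoader modules
    templates: list = field(default_factory=list)         # templates and Scribunto modules
    images: list = field(default_factory=list)
    linked_pages: list = field(default_factory=list)      # wiki pages, for link rendering
    external_sources: list = field(default_factory=list)
    categories: list = field(default_factory=list)
    external_links: list = field(default_factory=list)
    flags: list = field(default_factory=list)             # magic word flags
    max_cache_seconds: Optional[int] = None               # caching limit for dynamic content
    properties: dict = field(default_factory=dict)        # other page properties


@dataclass
class PageComponent:
    """A bit of well-formed HTML plus the metadata describing it."""
    html: str
    metadata: ComponentMetadata


# Made-up example values, not taken from any real wiki.
infobox = PageComponent(
    html='<table class="infobox" data-component="infobox">...</table>',
    metadata=ComponentMetadata(
        modules=["ext.example.infobox"],
        templates=["Template:Infobox person", "Module:Infobox"],
        images=["File:Example.jpg"],
        categories=["Category:Example"],
    ),
)

# asdict() gives a plain nested dict that could be serialized as JSON.
record = asdict(infobox)
```

Keeping the HTML and the metadata side by side in one record is only one possible layout; the metadata could equally be stored or served separately.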
For wiki content processed in the PHP parser, essentially all of this information apart from external data sources is available in the ParserOutput object. Some of it is already exposed by the expandtemplates endpoint (notably missing is the list of sub-templates used), and even more is available from the parse endpoint.
Currently this metadata is mostly implementation-defined, and not consistently exposed in the Action API. The proposal is to document standard metadata and its semantics, and make sure that this is consistently exposed through APIs in a way that lets clients aggregate information in a generic manner (as sets, for example), without having to know about each possible bit of metadata explicitly.
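As a rough illustration of generic aggregation, the sketch below merges per-component metadata by treating every key as a set and taking the union, so the client never needs a list of known keys. The key names and values are invented for the example, and the assumption that every value is set-like would not hold for scalar metadata such as cache limits.

```python
def aggregate_metadata(components):
    """Union set-valued metadata across components, without knowing the keys."""
    merged = {}
    for metadata in components:
        for key, values in metadata.items():
            merged.setdefault(key, set()).update(values)
    return merged


# Hypothetical metadata for two components on the same page.
infobox_meta = {
    "modules": {"ext.example.infobox"},
    "categories": {"Category:Example"},
}
citation_meta = {
    "modules": {"ext.example.cite"},
    "categories": {"Category:Example"},
    "externalLinks": {"https://example.org/source"},
}

page_meta = aggregate_metadata([infobox_meta, citation_meta])
```

Note that `externalLinks` is carried through to the page level even though the aggregation function has no special knowledge of it, which is the property the proposal asks for.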
Benefits of doing this include:
- Finer-grained dependency tracking, which in turn can make updates like refreshLinks a lot more efficient.
- More accurate page metadata tracking in VisualEditor when inserting / removing components.
- Efficient updating of ResourceLoader modules and other dependencies when re-rendering individual components for a different context.
- Opening up the possibility of implementing page components in separate services.
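The dependency-tracking benefit can be sketched with a reverse index from resources to the components that use them. This is an illustrative toy, not how refreshLinks works today; the component identifiers and the index layout are assumptions.

```python
from collections import defaultdict


def build_reverse_index(component_deps):
    """Map each resource to the set of components that depend on it."""
    index = defaultdict(set)
    for component_id, resources in component_deps.items():
        for resource in resources:
            index[resource].add(component_id)
    return index


def components_to_rerender(index, changed_resource):
    """Return only the components affected by a change to one resource."""
    return set(index.get(changed_resource, set()))


# Hypothetical per-component dependencies on a single page.
deps = {
    "PageA#infobox": {"Template:Infobox person", "File:Example.jpg"},
    "PageA#navbox": {"Template:Navbox"},
    "PageB#infobox": {"Template:Infobox person"},
}
index = build_reverse_index(deps)
affected = components_to_rerender(index, "Template:Infobox person")
```

With page-level tracking, a change to the infobox template would invalidate both pages in full; with per-component tracking, only the two infobox components need to be re-rendered.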
Questions
- Can we restrict page components to a single DOM node?
- Can we come up with a sensible aggregation of metadata that satisfies different use cases well?
  - Idea: Two blobs (or one with two sub-objects), one for view-relevant data (modules, categories, magic word flags?), one for more verbose data like dependencies.
    - Could also consider exposing modules / view metadata in an HTTP header.
- Component addressing and generic parameter encoding
- Can we generalize the process of figuring out how to render / re-render a component?
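One way to picture the "two blobs" idea from the questions above is a function that splits aggregated metadata into a view-relevant part and a more verbose dependency part. The choice of which keys count as view-relevant, the key names, and the header name `X-Example-Page-Modules` are all assumptions made up for this sketch.

```python
# Assumed set of view-relevant keys; the real split is an open question.
VIEW_KEYS = {"modules", "categories", "flags"}


def split_metadata(metadata):
    """Split metadata into a small view blob and a verbose dependency blob."""
    view = {k: sorted(v) for k, v in metadata.items() if k in VIEW_KEYS}
    dependencies = {k: sorted(v) for k, v in metadata.items() if k not in VIEW_KEYS}
    return {"view": view, "dependencies": dependencies}


def modules_header(view):
    """Encode the module list as a (name, value) HTTP header pair (hypothetical name)."""
    return ("X-Example-Page-Modules", ",".join(view.get("modules", [])))


page_meta = {
    "modules": {"ext.example.infobox", "ext.example.cite"},
    "categories": {"Category:Example"},
    "templates": {"Template:Infobox person"},
    "images": {"File:Example.jpg"},
}
blobs = split_metadata(page_meta)
header = modules_header(blobs["view"])
```

A client that only renders the page could fetch the view blob or read the header, while update jobs would consume the dependency blob.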
Current work and background reading
- Declarative infoboxes at Wikia: The Wikia folks are gradually replacing infobox templates with widgets, swapping top-level infobox templates for an <infobox> tag extension that wraps an XML infobox definition. They are doing this in cooperation with the community, and provide migration tools based on heuristics on template parameters and typical values (e.g., parameters whose values normally start with Image: are rendered as images). The primary focus is on moving towards a declarative infobox widget definition, as a first step towards inline editing and flexible styling across different devices.
- Wikidata-generated infoboxes by @Jdlrobson, edit interface
- Capiunto: using Scribunto to render infoboxes from Wikidata
- https://www.mediawiki.org/wiki/Parsoid/Content_widgets: Older notes from the Parsoid team
- https://www.mediawiki.org/wiki/Parsoid/DOM_notes: Parsoid notes on self-contained templates and content model constraints
- T103630: Semantic content blocks
- T103624: Semantic media roles
- T118517: [RFC] Use <figure> for media and T118520: Use <figure-inline> instead of <span> for inline figures.
- Templates are dead! Long live templates! -- presentation by @cscott at Wikimania 2015
- T57524: Enforce proper nesting of most templates, and encapsulate compound content blocks
- @ssastry commenting on encoding syntactical and content model constraints in templates