The [EditAttemptStep schema](https://meta.wikimedia.org/wiki/Schema:EditAttemptStep) (previously the [Edit schema](https://meta.wikimedia.org/wiki/Schema:Edit)) has been around since 2013 and provided data for important projects like the various rollouts of the visual editor.
However, over that time it has also accumulated many bugs, oddities, and unanswered strategic questions. This task tracks the resolution of those issues, and will be complete when (eons hence) the schema's stakeholders agree on its purpose and scope, the schema has been modified to implement that purpose and scope, and all the necessary implementations conform to that schema.
= Use cases =
* Edit completion rate
* Time to loaded and time to interactive
* Overall edit duration
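
As a rough illustration of the first use case, edit completion rate could be computed as the share of editing sessions that log an `init` event and later log a `saveSuccess` event. This is only a sketch: the `saveSuccess` action name, the flat event layout, and the function name `edit_completion_rate` are assumptions for illustration, not a definition the stakeholders have agreed on.

```python
from typing import Any, Dict, Iterable, Optional


def edit_completion_rate(events: Iterable[Dict[str, Any]]) -> Optional[float]:
    """Share of sessions with an `init` event that also have a `saveSuccess` event."""
    started = set()
    completed = set()
    for event in events:
        session = event["editingSessionId"]
        if event["action"] == "init":
            started.add(session)
        elif event["action"] == "saveSuccess":  # assumed action name
            completed.add(session)
    if not started:
        return None  # no attempts, so the rate is undefined
    # Only count saves that belong to a session we saw start.
    return len(completed & started) / len(started)
```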
= Which editor interfaces should this cover? =
* visual editor (phone and desktop)
* 2010 wikitext (phone and desktop)
* 2017 wikitext
* ContentTranslation?
* App editors?
As of July 2019, both ContentTranslation and the Android app are actually logging data to the schema, but without the proper values of `platform` and `editor_interface` to distinguish them from desktop VE and the mobile wikitext editor, respectively.
= Scope =
Does it make sense to have one schema that covers all our interfaces? Should the app editors use this schema? What about the Flow editor? What about the Wikidata description tool on Android?
* Having a common schema increases the probability that you can get comparable data across all these interfaces (because it forces teams to collaborate), but it doesn't ensure it.
* We should only incur the collaboration overhead if the added comparability is worth it. There's not much point in comparing, say, edit completion rate between Wikidata description editing and general page editing, because their contexts are so different.
= Session identification =
* The schema defines `editingSessionId` as "a string of 32 alphanumeric characters, unique to the current page view session; used for grouping events".
** [mw.user](https://doc.wikimedia.org/mediawiki-core/master/js/#!/api/mw.user) provides a number of different methods for generating session IDs.
** MobileFrontend [uses sessionId()](https://github.com/wikimedia/mediawiki-extensions-MobileFrontend/blob/304c176f291b68dd300448d12d7133236ec5c8cc/resources/mobile.startup/user.js#L64).
** The visual editor [uses generateRandomSessionId()](https://github.com/wikimedia/mediawiki-extensions-VisualEditor/blob/4a41a1aa4c6c98465bf4818ed8b8dad621c2712f/modules/ve-mw/init/ve.init.mw.trackSubscriber.js#L20).
** The 2010 wikitext editor [uses MWCryptRand::generateHex(32)](https://github.com/wikimedia/mediawiki-extensions-WikiEditor/blob/38a70500a0d26f10e32df0ed8924f4c10277e3f6/includes/WikiEditorHooks.php#L270).
* Our current implementation of editing sessions is tightly coupled to a page view. However, this doesn't map very well to what we think of as a single edit session: on desktop, switching between the visual editor and the wikitext editor while retaining changes causes a new page view, while on MobileFrontend, aborting an edit using the back button and then reopening the editor (which doesn't preserve your changes) all happens in one page view.
* We don't use the core EventLogging code for client-side session token generation and sampling.
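
One way to picture the mismatch described above is a session ID that follows the edit attempt rather than the page view. The sketch below is purely hypothetical: `EditSessionTracker` and its methods are invented for illustration and do not correspond to any existing MediaWiki code. It also ignores how the ID would be persisted across page views (for example, when switching editors on desktop).

```python
import secrets
from typing import Optional


def new_session_id() -> str:
    """Return 32 hexadecimal characters, which satisfies '32 alphanumeric characters'."""
    return secrets.token_hex(16)


class EditSessionTracker:
    """Hypothetical tracker: the ID is tied to the edit attempt, not the page view."""

    def __init__(self) -> None:
        self.session_id: Optional[str] = None

    def open_editor(self) -> str:
        # Opening an editor with no attempt in progress starts a new session.
        if self.session_id is None:
            self.session_id = new_session_id()
        return self.session_id

    def switch_editor(self, retain_changes: bool) -> str:
        # Switching editors while keeping changes continues the same attempt,
        # even if the switch causes a new page view.
        if not retain_changes:
            self.session_id = None
        return self.open_editor()

    def abort(self) -> None:
        # Aborting discards the changes, so the next open starts a new session,
        # even within the same page view.
        self.session_id = None
```

Under this model, a desktop switch that retains changes keeps one ID, while a MobileFrontend abort followed by a reopen yields two.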
= Timings =
* There's no reason we should have a separate timing field for each event type when we can have a single one whose meaning varies by event type (T207803#4790039).
* `init_timing` is currently not logged, but the information described in the schema ("timing information about action=init – time in milliseconds since the page was loaded") does not seem useful.
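
A minimal sketch of the single-field idea, assuming the per-action fields all follow the `<something>_timing` naming pattern of `init_timing` (field names other than `init_timing`, such as `ready_timing` in the usage example, are assumptions). The function name `unify_timing` is hypothetical.

```python
from typing import Any, Dict


def unify_timing(event: Dict[str, Any]) -> Dict[str, Any]:
    """Collapse per-action `*_timing` fields into one `timing` field.

    The meaning of `timing` then depends on the event's `action`.
    """
    unified: Dict[str, Any] = {}
    timings = []
    for key, value in event.items():
        if key.endswith("_timing"):
            # Per-action timing field: keep only a value that was actually set.
            if value is not None:
                timings.append(value)
        else:
            unified[key] = value
    if len(timings) > 1:
        raise ValueError("expected at most one per-action timing field per event")
    unified["timing"] = timings[0] if timings else None
    return unified
```

For example, `unify_timing({"action": "ready", "ready_timing": 850, "init_timing": None})` gives `{"action": "ready", "timing": 850}`.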
= Other issues =
* The new ability to switch back and forth between the visual editor and wikitext invalidates some key assumptions (for example, we probably want to update `action.init.mechanism`).
* How should we account for "micro-editing experiences" like Flow? Should they be included in this schema at all?
* Even with T124676 resolved, the table is still quite large. Consider whether to drop mostly unused fields like `page.title` or normalize the schema (T123958).
* Do our `action.saveFailure.type` values cover all the options?
** For example, T197499 deals with a save failure because the wiki is in read-only mode, which isn't covered.
= Data tidiness =
* We should split this into two separate tables: `EditAttempt` (containing data that applies to all steps in an attempt, such as platform, user agent, and user name) and `EditAttemptStep` (//not// containing that attempt-wide data).
* We should probably merge `VisualEditorFeatureUse` into `EditAttemptStep` with a `featureUse` action. The observational unit is the same, and it's much easier to subset data from one table than to union data from two tables.
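
The proposed split could look roughly like the sketch below, which separates flat event rows into one record per attempt and one record per step. The field names `user_agent` and `user_name`, the use of `editingSessionId` as the join key, and the function name `split_events` are assumptions for illustration only.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Assumed names for the attempt-wide fields mentioned above.
ATTEMPT_FIELDS = ("platform", "user_agent", "user_name")
SESSION_KEY = "editingSessionId"


def split_events(
    rows: Iterable[Dict[str, Any]],
) -> Tuple[Dict[str, Dict[str, Any]], List[Dict[str, Any]]]:
    """Split flat rows into EditAttempt records (by session) and EditAttemptStep rows."""
    attempts: Dict[str, Dict[str, Any]] = {}
    steps: List[Dict[str, Any]] = []
    for row in rows:
        session = row[SESSION_KEY]
        # Attempt-wide data is stored once, taken from the first row seen.
        attempts.setdefault(session, {f: row.get(f) for f in ATTEMPT_FIELDS})
        # Step rows keep the session key but drop the attempt-wide fields.
        steps.append({k: v for k, v in row.items() if k not in ATTEMPT_FIELDS})
    return attempts, steps
```

Each step row still carries the session ID, so the attempt-wide data can be joined back when needed.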
= See also =
* @Halfak's [2016 proposal](https://etherpad.wikimedia.org/p/schema_edit) for splitting this into five separate schemas:
** EditingSession (one per page edit session)
** EditingStage (one per editing stage)
** EditingAbort (one per aborted edit)
** EditingSaveFailure (one per save failure)
** PageContentSaveComplete (note that this schema already exists)