
Consider scrapping Schema:PageContentSaveComplete and Schema:NewEditorEdit, given we have Schema:Edit
Closed, Declined (Public)

Description

It'd be nice to consolidate and lighten the load on the EL system. However, Schema:Edit isn't running on 100% of edits; is this going to be a problem?

Event Timeline

Actually, we do not want large catch-all schemas but rather distinct schemas per event. Thus having a schema for NewEditorEdit makes a lot of sense. I would go the opposite way and say that the Edit schema should be split into more meaningful sections.

@Nuria, I don't think breaking down Edits by who does the editing makes sense, but there are still a lot of breakdowns we can do of Schema:Edit. Here's an old proposal that I put together for @Jdforrester-WMF: https://etherpad.wikimedia.org/p/schema_edit

@Halfak: agreed, my meta-point is that the Edit schema needs less data flowing into it, not more. It should be split as you noted.

@Halfak, cool! Normalizing the schema seems like a good idea; that could be part of my work on T118063.

Milimetric claimed this task.
Milimetric subscribed.

Declining this then, in favor of future work that normalizes the schema.

I've been thinking about it, and I'm not actually sure it makes sense to normalize the schema. Even if we do that, I don't see any reason to keep NewEditorEdit. I'll keep this open while I think about it.

Milimetric moved this task from Incoming to Analytics Query Service on the Analytics board.
Milimetric moved this task from Analytics Query Service to Radar on the Analytics board.
Milimetric set Security to None.

Benefits of normalizing:

  • Querying will be easier in general
  • Storage space will be substantially reduced
  • Specialized queries (e.g. how many events per EditingSession?) will run substantially faster
  • Less data is sent from the browser client per event

Performance for queries that join and filter across tables will not suffer substantially assuming we do appropriate indexing. I'm happy to help with that.
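
To make the trade-off concrete, here is a minimal sketch of the kind of split layout this comment seems to describe: session-level fields are stored once in one table, per-event rows go in another, and the join key is indexed. The table names, column names, and sample rows are hypothetical and are not taken from Schema:Edit or from the etherpad proposal; SQLite is used only so the example runs anywhere.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Session-level fields are stored once per editing session.
        CREATE TABLE editing_session (
            session_id TEXT PRIMARY KEY,
            editor     TEXT,
            platform   TEXT
        );
        -- Each event row carries only the session key and its own fields.
        CREATE TABLE edit_event (
            event_id   INTEGER PRIMARY KEY,
            session_id TEXT NOT NULL REFERENCES editing_session(session_id),
            action     TEXT NOT NULL
        );
        -- Index on the join key, so joins and per-session filters stay cheap.
        CREATE INDEX idx_edit_event_session ON edit_event(session_id);
        """
    )
    conn.executemany(
        "INSERT INTO editing_session VALUES (?, ?, ?)",
        [("s1", "wikitext", "desktop"), ("s2", "visualeditor", "phone")],
    )
    conn.executemany(
        "INSERT INTO edit_event (session_id, action) VALUES (?, ?)",
        [
            ("s1", "init"),
            ("s1", "ready"),
            ("s1", "saveSuccess"),
            ("s2", "init"),
            ("s2", "abort"),
        ],
    )
    conn.commit()
    return conn


def events_per_session(conn):
    """Specialized query: touches only the narrow event table."""
    rows = conn.execute(
        "SELECT session_id, COUNT(*) FROM edit_event "
        "GROUP BY session_id ORDER BY session_id"
    ).fetchall()
    return dict(rows)


def events_by_editor(conn):
    """Cross-table query: joins events to sessions on the indexed key."""
    rows = conn.execute(
        "SELECT s.editor, COUNT(*) FROM edit_event e "
        "JOIN editing_session s ON s.session_id = e.session_id "
        "GROUP BY s.editor ORDER BY s.editor"
    ).fetchall()
    return dict(rows)
```

In this sketch the per-session count never reads the session-level columns, while the per-editor count needs the join; the index on `session_id` is the "appropriate indexing" that the join relies on.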

These schemas will probably be retired when schemas migrate to MEP.