To implement T253796, table(s) should be created that allow storing log-specific information in a separate table. When CheckUser queries are made, the tables will be joined so that results can continue to be shown.
This involves creating two new tables:
* cu_log_event
* cu_private_event
The cu_log_event table will contain entries that currently go into cu_changes and that also:
* Are log actions
* Have an associated log ID in the logging table
The cu_private_event table will contain entries that currently go into cu_changes and that also:
* Are log actions
* Do not have an associated log ID (and thus are very unlikely to be shown publicly or be linked to)
Two tables are needed because, when there is a log ID to reference, the log parameters and other data that is not needed for indexing purposes can be obtained with a join on the log ID to the logging table, so they do not need to be stored again. Entries which do not have an associated log ID have to store this data themselves. Using two tables rather than one avoids a polymorphic table in which these columns would go unused for some entries.
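As a rough illustration of the join described above, here is a minimal sketch using Python's built-in `sqlite3` module. The `logging` columns follow MediaWiki's core table, but the `cule_*` and `cupe_*` column names, the reduced column sets and the sample rows are assumptions made for this example and are not the deployed schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-in for MediaWiki's logging table.
    CREATE TABLE logging (
        log_id INTEGER PRIMARY KEY,
        log_type TEXT,
        log_action TEXT,
        log_params TEXT
    );
    -- Assumed layout: cu_log_event keeps only the log ID plus CheckUser data.
    CREATE TABLE cu_log_event (
        cule_id INTEGER PRIMARY KEY,
        cule_log_id INTEGER,
        cule_ip TEXT
    );
    -- Assumed layout: cu_private_event has no log ID, so it stores the log data itself.
    CREATE TABLE cu_private_event (
        cupe_id INTEGER PRIMARY KEY,
        cupe_log_type TEXT,
        cupe_log_action TEXT,
        cupe_params TEXT,
        cupe_ip TEXT
    );
    -- Illustrative sample rows only.
    INSERT INTO logging VALUES (7, 'move', 'move', 'target=Bar');
    INSERT INTO cu_log_event VALUES (1, 7, '192.0.2.1');
    INSERT INTO cu_private_event VALUES (1, 'login', 'success', '', '192.0.2.2');
""")


def log_event_rows(conn):
    """Return cu_log_event rows with their log data pulled in from logging."""
    return conn.execute(
        """
        SELECT cule_ip, log_type, log_action, log_params
        FROM cu_log_event
        JOIN logging ON log_id = cule_log_id
        ORDER BY cule_id
        """
    ).fetchall()


def private_event_rows(conn):
    """Return cu_private_event rows; no join is needed as the data is stored inline."""
    return conn.execute(
        """
        SELECT cupe_ip, cupe_log_type, cupe_log_action, cupe_params
        FROM cu_private_event
        ORDER BY cupe_id
        """
    ).fetchall()
```

Both helpers return rows of the same shape, which is what lets the results continue to be shown together even though only one of the tables stores the log data itself.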
Examples:
* Log for a page move would go into cu_log_event
* Edit to a page would go into cu_changes
* Login event would go into cu_private_event
* Log events previously stored in cu_changes would be moved to cu_private_event
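The routing rules above can be sketched as a small function. The name `target_table` and its parameters are hypothetical and are not taken from the CheckUser code; a missing log ID is modelled here as `None`.

```python
from typing import Optional


def target_table(is_log_action: bool, log_id: Optional[int]) -> str:
    """Pick the table an event would be stored in under the proposed split."""
    if not is_log_action:
        # Edits and other non-log actions stay in cu_changes.
        return "cu_changes"
    if log_id is not None:
        # Log actions with a row in the logging table only need the log ID.
        return "cu_log_event"
    # Log actions without a log ID must carry their own log data.
    return "cu_private_event"
```

For example, a page move with log ID 7 maps to `cu_log_event`, while a login event with no log ID maps to `cu_private_event`.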
Doing this allows several security tickets to be solved, and solves or makes substantial progress towards solving:
* T253796
* T145265
* T311380
* T315488
* T41013
* T26231
This method was suggested by Ladsgroup as a way to reduce the polymorphic nature of cu_changes, in preference to just adding cuc_log_id as suggested in the parent task. It would solve the parent task by removing log entries from cu_changes.
Todo (mostly in order but some can be done before others as desired):
* [x] Create the new tables
** [x] Create the schemas for these tables
** [x] Deploy the new tables to WMF production
* [x] Update code that writes to and deletes from cu_changes to also use the new tables based on a schema compatibility config (write new support)
** [x] The many methods that insert, delete and change cu_changes in Hooks.php - [[ https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CheckUser/+/876360 | gerrit 876360 ]]
** [x] PopulateCheckUserTable maintenance script - [[ https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CheckUser/+/877182 | gerrit 877182 ]]
** [x] PurgeOldData maintenance script - [[ https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CheckUser/+/877182 | gerrit 877182 ]]
* [x] Add combined read and write new support - T341586
* [x] Add read new support so that the code that queries CheckUser reads from all three tables:
** [x] CheckUserQueryBuilder - [[ https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CheckUser/+/881420 | gerrit 881420 ]]
** [x] LogFormatter-like class for cu_private_event rows (T341688)
** [x] CheckUser
*** [x] Get edits pager
*** [x] Get IPs pager
*** [x] Get users pager
** [x] CheckUser API (T341827)
** [x] Investigate (T329189)
** [x] Temporary account IP reveal
* [x] Create a maintenance script to move log entries from cu_changes to cu_private_event, while keeping the moved entries in cu_changes with `cuc_only_for_read_old` set to `1` - [[ https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CheckUser/+/879919 | gerrit 879919 ]] (merged) [[ https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CheckUser/+/886482 | gerrit 886482 ]]
* [x] Enable write new (T330158)
** [x] Change the value in extension.json
** [x] Do this on WMF wikis
*** [] Double check the new tables are not included in any DB dumps - Ladsgroup is probably the one to contact for this
* [x] Move log entries in cu_changes to cu_private_event (first maintenance script)
** [x] Add this to update.php while ensuring that it is run before any column removal but after the creation of cu_private_event and cu_log_event
** Have this maintenance script run on WMF wikis - No longer needed, as the data on WMF wikis that needed the move has been purged
* [x] Enable read new
** [x] Change the value in extension.json
** [x] Do this on WMF wikis
* [x] Stop writing old
** [x] In extension.json - T366505
** [x] Have this changed on WMF wikis - T360685
* [x] Delete cu_changes rows with cuc_only_for_read_old set to 1 - T341830
** [x] Create a maintenance script for this
** [x] Add this maintenance script to update.php while ensuring it is run before column removal but after the moving of log entries
** [x] Run this maintenance script on WMF wikis (T366781)
* [x] Remove `wgEventTablesMigrationStage` config, inferring the value as `SCHEMA_COMPAT_NEW` in places that it was used (T366546)
* [] Remove columns related only to log types in cu_changes along with `cuc_only_for_read_old` (T366782)
** [] Remove these columns from WMF production - requires the maintenance script that moves log entries to be run if it has not been more than 3 months since write new was enabled. This should also implicitly run `optimise cu_changes` for WMF wikis (needed because the row count should drop by ~20% on `enwiki` and by 99.9% on `loginwiki`).
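The staged rollout in the list above (write new, then read new, then stop writing old) can be sketched with bit flags. This is an illustration only: the flag names and numeric values are modelled on MediaWiki's `SCHEMA_COMPAT_*` constants but should be treated as assumptions, and `Stage`, `STEPS`, `tables_written` and `reads_new` are hypothetical names that do not appear in the CheckUser code.

```python
from enum import IntFlag


class Stage(IntFlag):
    """Illustrative migration flags; the values are assumptions for this sketch."""
    WRITE_OLD = 0x01
    READ_OLD = 0x02
    WRITE_NEW = 0x100
    READ_NEW = 0x200


# Rollout steps from the list above, mapped to flag combinations.
STEPS = {
    "start": Stage.WRITE_OLD | Stage.READ_OLD,
    "enable write new": Stage.WRITE_OLD | Stage.WRITE_NEW | Stage.READ_OLD,
    "enable read new": Stage.WRITE_OLD | Stage.WRITE_NEW | Stage.READ_NEW,
    "stop writing old": Stage.WRITE_NEW | Stage.READ_NEW,
}


def tables_written(stage: Stage, event_table: str) -> list:
    """List the tables a log event is written to at the given stage."""
    tables = []
    if stage & Stage.WRITE_OLD:
        tables.append("cu_changes")
    if stage & Stage.WRITE_NEW:
        tables.append(event_table)
    return tables


def reads_new(stage: Stage) -> bool:
    """Whether log events are read from the new tables at the given stage."""
    return bool(stage & Stage.READ_NEW)
```

While old writes are still enabled, a log event lands in both cu_changes and its new table, which is why the copies kept only for reading old data can be deleted once writing old has stopped.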