Deduping Civi will not only clean up our database but also add new tools to prevent duplicates from being added to the system going forward. Some manual intervention will always be needed, but we'll automate what we can.
Phase 1
- Scan for potential duplicates in a background job, tagging similar pairs and annotating each pair with its comparison results.
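As a rough illustration of the Phase 1 scan, here is a minimal sketch in Python. It is not the planned implementation: the contact fields (`id`, `name`, `email`), the annotation keys (`name_similarity`, `same_email`), and the 0.85 threshold are all assumptions made for the example. It compares every pair of contacts and keeps the pairs that look similar, along with the comparison results that triggered the match.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(value.lower().split())


def compare(a, b):
    """Return per-field comparison results for two contact dicts (hypothetical schema)."""
    name_ratio = SequenceMatcher(
        None, normalize(a["name"]), normalize(b["name"])
    ).ratio()
    same_email = bool(a["email"]) and normalize(a["email"]) == normalize(b["email"])
    return {"name_similarity": round(name_ratio, 2), "same_email": same_email}


def scan(contacts, name_threshold=0.85):
    """Tag similar pairs and annotate each with its comparison results.

    The threshold is illustrative, not a tuned value.
    """
    pairs = []
    for a, b in combinations(contacts, 2):
        result = compare(a, b)
        if result["same_email"] or result["name_similarity"] >= name_threshold:
            pairs.append({"ids": (a["id"], b["id"]), "comparison": result})
    return pairs
```

Comparing every pair grows quadratically with the number of contacts, so a real background job would likely need to narrow the candidates first, for example by only comparing contacts that share an email domain or a surname.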
Phase 2
- Display potential duplicates in a GUI table. These will be grouped into qualitative categories by automated match confidence, and split into batches for human review.
- Allow admins to mark suspected duplicate pairs as confirmed matches.
- Gather feedback and determine whether we can skip manual review of the high-confidence match categories.
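The grouping and batching described in Phase 2 could look something like the sketch below. The confidence labels, the rules behind them, the annotation keys (`same_email`, `name_similarity`), and the batch size are all hypothetical; the real categories would come out of the Phase 1 comparison results and reviewer feedback.

```python
from collections import defaultdict


def confidence(comparison):
    """Map comparison results to a qualitative label (rules are illustrative only)."""
    strong_name = comparison["name_similarity"] >= 0.9
    if comparison["same_email"] and strong_name:
        return "high"
    if comparison["same_email"] or strong_name:
        return "medium"
    return "low"


def batches_by_confidence(pairs, batch_size=50):
    """Group candidate pairs by confidence, then split each group into review batches."""
    grouped = defaultdict(list)
    for pair in pairs:
        grouped[confidence(pair["comparison"])].append(pair)
    return {
        label: [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
        for label, items in grouped.items()
    }
```

Keeping the categories separate makes it easy to measure how often reviewers confirm the high-confidence batches, which is the evidence needed to decide whether manual review of that category can be skipped.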
Phase 3
- Perform actual merging. This will require some work to make merging a reversible operation.
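One way to think about reversible merging is to record the state of both records before the merge, so the operation can be undone later. The sketch below uses an in-memory dict of contacts and a plain list as the merge log; both are stand-ins chosen for the example and say nothing about how Civi stores or merges contacts.

```python
import copy


def merge(contacts, keep_id, drop_id, log):
    """Merge drop_id into keep_id, logging enough state to undo the merge."""
    keep, drop = contacts[keep_id], contacts[drop_id]
    log.append({
        "keep_id": keep_id,
        "drop_id": drop_id,
        "keep_before": copy.deepcopy(keep),
        "drop_before": copy.deepcopy(drop),
    })
    # Fill in only the fields that are empty or missing on the kept record.
    for field, value in drop.items():
        if not keep.get(field):
            keep[field] = value
    del contacts[drop_id]


def undo_last_merge(contacts, log):
    """Restore both records to their pre-merge state."""
    entry = log.pop()
    contacts[entry["keep_id"]] = entry["keep_before"]
    contacts[entry["drop_id"]] = entry["drop_before"]
```

A real implementation would also have to restore related data that was moved to the kept contact, which this sketch ignores.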
At the time of writing, an imminent Phabricator upgrade will change the story point field. The previous value was "supersized". It is now 0 and will need a new number at a later date.