Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove thing_type and thing_id columns from settings table #31971

Merged
merged 1 commit into from
Dec 5, 2024

Conversation

mjankowski
Copy link
Contributor

I'm mostly sure (based on code comments and commit history) we want to do this, but less sure about WHEN to do this. If this is ready to be done now, go ahead and review ... if not, wait until post-4.3 (or even later?) release.

The history here I believe is that at one point user settings were stored one setting per row in this table, but have since migrated over to a (text) settings column on users table (wrapped by custom JSON coder at ruby level), leaving this settings table to just handle the site-wide setting values (the Setting model has a default scope to operate only where these columns are null).

Things to confirm:

  • I did this as a "post deploy" migration on assumption that the ignore columns on code reload get us what we want there and it's desired to run that way
  • I removed the previous index and added a unique index on var, which I assume is desirable for the app-wide settings.
  • With that index added, the fabricator definition needed a sequence to be able to produce two valid records
  • Would it make sense to first delete all rows which have any values set in these columns? Might be a good final cleanup to make sure nothing collides post-removal, and would probably make the index rebuild faster if we are confident we are only left with the (relatively few) server side settings ... also once we remove these columns, it would be more challening to tell the previously-connected-to-a-user (if any left) rows from the server wide ones.

@ClearlyClaire
Copy link
Contributor

  • Would it make sense to first delete all rows which have any values set in these columns? Might be a good final cleanup to make sure nothing collides post-removal, and would probably make the index rebuild faster if we are confident we are only left with the (relatively few) server side settings ... also once we remove these columns, it would be more challening to tell the previously-connected-to-a-user (if any left) rows from the server wide ones.

Yes. They would need to be deleted. And the new index built before the previous one gets deleted.

@mjankowski mjankowski force-pushed the remove-settings-thing-cols branch from 524990c to 311d2c1 Compare September 19, 2024 13:32
@mjankowski
Copy link
Contributor Author

Updated migration to:

  • Do a delete of all settings with type/id not null
  • Move new index creation to before old index drop

There's a (legit, I think) issue here with an old migration picked up by the historical migrations test, as side effect of using the Setting class in the migration. I suspect it's a serialization thing, but need to look closer. Will update that to either remove the class or inline a definition, whatever is simpler.

@mjankowski mjankowski force-pushed the remove-settings-thing-cols branch from 9996da2 to 1423747 Compare September 19, 2024 13:54
@mjankowski
Copy link
Contributor Author

Updated old migration to add YAML value wrapper to Setting class, which I think mirrors the behavior expected at that point in time. (Also, this is ~2YO migration, seems low risk to modify, but does keep migration test happy).

ClearlyClaire
ClearlyClaire previously approved these changes Sep 19, 2024
Copy link
Contributor

@ClearlyClaire ClearlyClaire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One possible issue is that the post-deployment migration is potentially long and complex, which means increased risk of it failing or being interrupted mid-way, and it disabling transactions means re-running it may be an issue.

Otherwise it looks good to me, but I'd probably wait after 4.3 at that point.

@mjankowski
Copy link
Contributor Author

One possible issue is that the post-deployment migration is potentially long and complex, which means increased risk of it failing or being interrupted mid-way, and it disabling transactions means re-running it may be an issue.

Yeah ... strong migrations wanted the "concurrently" on the index addition, which also required the disable ddl line.

One workaround there might be to put JUST the addition of that index in it's own (earlier) migration with disable_ddl, and then everything else in a transaction-wrapped migration to follow up after? Would make the "what do I need to re-run" easier to find if anyone hits interruption issues.

Otherwise it looks good to me, but I'd probably wait after 4.3 at that point.

Yes.

@ClearlyClaire
Copy link
Contributor

One workaround there might be to put JUST the addition of that index in it's own (earlier) migration with disable_ddl, and then everything else in a transaction-wrapped migration to follow up after? Would make the "what do I need to re-run" easier to find if anyone hits interruption issues.

Wouldn't the deletion need to happen before creating the index, though? Otherwise you'd have conflicts for every user variable.

@mjankowski
Copy link
Contributor Author

Wouldn't the deletion need to happen before creating the index, though? Otherwise you'd have conflicts for every user variable.

Ah, yes good catch ... you'd need a) transaction-wrapped delete migration, b) transaction-disabled concurrent index addition, c) transaction wrapped the rest. Three migrations seems overkill for the changes.

@mjankowski mjankowski added refactoring Improving code quality ruby Pull requests that update Ruby code labels Sep 26, 2024
Copy link
Contributor

github-actions bot commented Oct 7, 2024

This pull request has merge conflicts that must be resolved before it can be merged.

Copy link
Contributor

github-actions bot commented Oct 7, 2024

This pull request has resolved merge conflicts and is ready for review.

@mjankowski mjankowski force-pushed the remove-settings-thing-cols branch from 7b0c94c to 49608f7 Compare October 13, 2024 21:16
@mjankowski mjankowski force-pushed the remove-settings-thing-cols branch from 49608f7 to 7e0e529 Compare October 31, 2024 16:18
@mjankowski mjankowski force-pushed the remove-settings-thing-cols branch 2 times, most recently from 800b4c0 to 7529205 Compare November 27, 2024 21:40
@mjankowski
Copy link
Contributor Author

On this one - I know we were in wait until post-4.3 mode, but I dont know if the issues re: migration setup were resolved or not, or if theres still more to do?

@ClearlyClaire
Copy link
Contributor

ClearlyClaire commented Nov 28, 2024

This is rather awkward, but…

I think we need to delete the thing_id: nil, thing_type: nil records in a pre-deployment migration, otherwise they would be incorrectly picked up by the new code, before the post-deployment migrations run.

However, this also means you can't safely upgrade anymore from a pre-4.2.0 version to a version with this zero-downtime migration. This is already a restriction we have since the 4.3.0 release because of the cookies change, and it's not the first time we are hitting such roadblocks, so maybe it's time to formalize such “breakpoints” and have e.g. some additional pre-migration check to ensure you've reached a specific breakpoint before running any more migrations with SKIP_POST_DEPLOYMENT_MIGRATIONS (e.g. #33089).

@mjankowski
Copy link
Contributor Author

I think we need to delete the thing_id: nil, thing_type: nil records in a pre-deployment migration, otherwise they would be incorrectly picked up by the new code, before the post-deployment migrations run.

We have ignored_columns added here, which I think means from the perspective of loading records we are safe -- but then we also are removing the default scope here, which means that the moment the code is deployed, if the migration has not run yet, we are potentially loading what should have been a user setting as a site setting (if there were var name overlap)?

@ClearlyClaire
Copy link
Contributor

but then we also are removing the default scope here, which means that the moment the code is deployed, if the migration has not run yet, we are potentially loading what should have been a user setting as a site setting (if there were var name overlap)?

Yes, that's my concern. There are some variables with overlap between system variables and user variables: theme, trends, noindex, possibly more. If running the new code before deleting the records, random user variables may be picked up instead of system variables.

@mjankowski
Copy link
Contributor Author

Hmm, yeah. Looking throught the diff here, I think...

  • The addition of ignored_columns could be added any time since we're already not using these?
  • Fabricator change is needed only after the schema change, but in theory could go in any time
  • Old migration adding setting class is only needed after change, but also could go in any time

It's really only the itneraction between the scope and the migration that has timing/dependency issues.

We currently have a mix of setting records:

  • In some, thing_id and thing_type are present and hold values for users -- however none of these are used anywhere. They are still there as side effect of not cleaning up previously
  • In some, thing_id and thing_type are null, and do hold currently used and needed sitewide settings

If we pull out the scope before running the migration, we'll hit (old, unused) user settings for some where names collide.

If we run the migration first w/out changing scope, once the migration finishes the scope will be referring to now-nonexistent columns are start erroring.

What if we...

  • Add a migration which does nothing but delete the data where thing_id and thing_type are not null, but does not yet modify the indexes or drop columns
  • Get that change only (and maybe the other harmless stuff here) into a release
  • In some future release, remove the scope and do the index/drop portions ... I think this will be ok because we will be confident that even though the default scope is gone, the user records will have all be removed

@mjankowski mjankowski marked this pull request as draft November 29, 2024 13:54
@ClearlyClaire
Copy link
Contributor

The addition of ignored_columns could be added any time since we're already not using these?

We're currently using them in the default scope.

Add a migration which does nothing but delete the data where thing_id and thing_type are not null, but does not yet modify the indexes or drop columns

As I said earlier, this should be pre-deployment migration (otherwise new code will pick up the old values).

But it requires 4.2.0 code to be running otherwise new records with non-nil columns may be created. Therefore I think this first step still needs a breakpoint, such as what I'm proposing in #33089

But such a breakpoint already exists because of the cookies migration, even if it's not enforced by a task check.

@mjankowski
Copy link
Contributor Author

We're currently using them in the default scope.

Wont the scope keep including them in the where clause even if they are in ignored_columns?

As I said earlier, this should be pre-deployment migration (otherwise new code will pick up the old values).

Yes, if we pulled out the deletion portion only to stand alone migration that could be normal migration.

But it requires 4.2.0 code to be running otherwise new records with non-nil columns may be created. Therefore I think this first step still needs a breakpoint, such as what I'm proposing in #33089

I'll check that one out ... isn't there already a policy/guideline of "you must have migrated all the way to end of a last minor version before bumping to next one"? Maybe that's not code enforced - but that's the expectation, right?

But such a breakpoint already exists because of the cookies migration, even if it's not enforced by a task check.

👍

@mjankowski mjankowski force-pushed the remove-settings-thing-cols branch from 7529205 to 0edc9f3 Compare November 29, 2024 14:38
@ClearlyClaire
Copy link
Contributor

Wont the scope keep including them in the where clause even if they are in ignored_columns?

I haven't tried, but I don't think so.

I'll check that one out ... isn't there already a policy/guideline of "you must have migrated all the way to end of a last minor version before bumping to next one"? Maybe that's not code enforced - but that's the expectation, right?

No, that is not. On the contrary, we try to support skipping version updates, if only because sometimes intermediary versions become uninstallable for one reason or another. But I think it's reasonable to not support zero-downtime migrations when skipping versions.

@mjankowski
Copy link
Contributor Author

I haven't tried, but I don't think so.

Running with the default scope, but without the ignored columns...

mastodon(dev)> Setting.count
  Setting Count (1.7ms)  SELECT COUNT(*) FROM "settings" WHERE "settings"."thing_type" IS NULL AND "settings"."thing_id" IS NULL
=> 5
mastodon(dev)> Setting.last
  Setting Load (0.3ms)  SELECT "settings".* FROM "settings" WHERE "settings"."thing_type" IS NULL AND "settings"."thing_id" IS NULL ORDER BY "settings"."id" DESC LIMIT $1  [["LIMIT", 1]]

When I add the ignored columns...

mastodon(dev)> Setting.count
  Setting Count (0.9ms)  SELECT COUNT(*) FROM "settings" WHERE "settings"."thing_type" IS NULL AND "settings"."thing_id" IS NULL
=> 5
mastodon(dev)> Setting.last
  Setting Load (0.3ms)  SELECT "settings"."id", "settings"."var", "settings"."value", "settings"."created_at", "settings"."updated_at" FROM "settings" WHERE "settings"."thing_type" IS NULL AND "settings"."thing_id" IS NULL ORDER BY "settings"."id" DESC LIMIT $1  [["LIMIT", 1]]

So - looks like ignored columns changes the selected columns from * to all-but-ignored, but does not interact with the conditions.

But I think it's reasonable to not support zero-downtime migrations when skipping versions.

Fun.

@mjankowski mjankowski force-pushed the remove-settings-thing-cols branch from 0edc9f3 to 240bc24 Compare December 5, 2024 13:33
@mjankowski mjankowski force-pushed the remove-settings-thing-cols branch from 240bc24 to aac3447 Compare December 5, 2024 14:13
@mjankowski mjankowski marked this pull request as ready for review December 5, 2024 14:19
@ClearlyClaire ClearlyClaire added this pull request to the merge queue Dec 5, 2024
Merged via the queue into mastodon:main with commit 17c02c9 Dec 5, 2024
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactoring Improving code quality ruby Pull requests that update Ruby code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants