Connect temporary and permanent account during signup in Special:CheckUser / Special:Investigate
Closed, Declined | Public | 2 Estimated Story Points

Description

When a user logged in to a temporary account signs up to a permanent (normal) account, it would probably be useful to privately record the connection between the two, in a way that's only visible with checkuser permissions. That might help with anti-abuse work: if the permanent account does something malicious and is caught, we might want to check the previously used temporary account for malicious but non-obvious activity (e.g. number change vandalism). Often this will happen via an IP check anyway, but sometimes the IP address is not a reliable tracing mechanism.

Technically, we could either just use the log_params field as it's not exposed via the API, or put it somewhere in the CU data (the latter probably makes it simpler to expose it on the CU interface).

Event Timeline

Thoughts:

  1. I presume based on the task description that temporary accounts don't convert to registered accounts, and instead the contributions are kept under the temporary account name. Couldn't the temporary account be renamed when creating the account to the requested username? This would avoid the need to make the link as it would be made as part of the signup process.
  2. Would log_params be visible via database dumps? The creation of an account is a public log entry, so as far as I understand the log_params with this information would be in the database dump.
  3. Should this link expire after 90 days? If not, then the current tables that store CU data are not an appropriate place to put this, at least until T324907 is solved.
  4. If storing in CU data then the "performer" of the action would have to be the temporary account so that when checkuser is run the temporary account username would appear in the summary of the results for Special:CheckUser and the "IP and User agents" tab of Special:Investigate. This means an extra row in the CU data table (separate to the account creation) until at least T324907 is solved, as the account creation wouldn't have been performed by the temporary account (otherwise this would show in the public log entry making it stored privately here redundant)

The reason so much of this depends on T324907 is that the CheckUser extension doesn't currently store the ID of the row in the logging table in cu_changes, nor does it have a copy of the log_params value from the logging table (it instead stores the actiontext generated from the params in the wiki's content language).

Couldn't the temporary account be renamed when creating the account to the requested username? This would avoid the need to make the link as it would be made as part of the signup process.

See T300273#8366894

Thanks for the link.

I presume based on the task description that temporary accounts don't convert to registered accounts, and instead the contributions are kept under the temporary account name. Couldn't the temporary account be renamed when creating the account to the requested username? This would avoid the need to make the link as it would be made as part of the signup process.

This comes up a lot; we should probably document it in T300273: [IP Masking] Temporary account to registered account creation flow or a dedicated task. We don't want to convert or migrate accounts because 1) the temporary account might have been created on a public computer, so it's hard to be sure that the edits really belong to the user who is doing the registration; 2) since way more people have access to the IPs used by a temp account, publicly connecting it to the permanent account would expose the IP of someone who can then later become a long-term user with higher privacy needs than a temp account.

Would log_params be visible via database dumps? The creation of an account is a public log entry, so as far as I understand the log_params with this information would be in the database dump.

Good point, it seems log_params are not actually filtered. (Although I imagine non-public logs are filtered out somehow? Not sure how that works.) The log_search table is private, so the link could be stored as log relations instead of log parameters. But putting CU-only data in one of the CU tables would be more reassuring.

Should this link expire after 90 days? If not, then the current tables that store CU data are not an appropriate place to put this, at least until T324907 is solved.

Probably. I filed this task mainly to inquire about the legal/privacy expectations for this information.

If storing in CU data then the "performer" of the action would have to be the temporary account so that when checkuser is run the temporary account username would appear in the summary of the results for Special:CheckUser and the "IP and User agents" tab of Special:Investigate. This means an extra row in the CU data table (separate to the account creation) until at least T324907 is solved, as the account creation wouldn't have been performed by the temporary account (otherwise this would show in the public log entry making it stored privately here redundant)

I was thinking of making the newly registered account the performer, and recording the temporary username somewhere as extra data. Not sure if that would integrate with the checkuser UI in a useful way.

Thanks for some quick answers.

Should this link expire after 90 days? If not, then the current tables that store CU data are not an appropriate place to put this, at least until T324907 is solved.

The reason I thought of this is because of the task T170148 and how that would relate to this if it's implemented. It has legal approval, so once T324907 is implemented I intend to push that forward as it would become possible to implement (not summarising what it's about as it's a security ticket).

I was thinking of making the newly registered account the performer, and recording the temporary username somewhere as extra data. Not sure if that would integrate with the checkuser UI in a useful way.

As the extension currently stands, this way wouldn't integrate nicely with the UI.

Once T324907 is done it could be technically possible to achieve using the current way to display events in CheckUser. However, it would make the database queries more complicated and would require a fair amount of work to achieve for all interfaces to the data (API, Special:CheckUser and Special:Investigate).

An example of this added complication is that the query to indicate to the checkuser that another account has used an IP on Special:CheckUser compares differences in the username of the performer to the username that is checked. An exception in the query or a different query would be needed to indicate that a temporary account was used in the signup process.

A separate interface to show it, even if that's a link to the temporary account username on the pre-existing pages, could be a better way to show this.

If it's not easily doable, let's just drop the idea. My premise was that it would be very little effort to add this; I don't think the benefits are large enough to justify doing much work.

As long as the "performer" of the action can be the temporary account, this would be much easier to achieve.

The connection is already publicly recorded – the account creation log entry at Special:Log/newusers will read something like "User account Foo was created by *Unregistered N", if you create the permanent account while using a temporary account. I guess this is a bit of a happy accident though, so we should confirm that it's desirable.

I'm a little confused now. My understanding is that they wouldn't be linked at all, but perhaps I am wrong.

If this is the intended behaviour, then this task can be closed as CheckUser already does this. Otherwise if the link won't be made in the account log (in the spirit of why T300273#8366894 was done) then this task would need to stay open. I've just tried what happens right now and it would produce:

image.png (290×1 px, 87 KB)

I guess this is a bit of a happy accident though

It's an unhappy accident, and will be fixed at some point before IP masking is introduced in production. The question here is whether to keep that information in a more private way somewhere.

As long as the "performer" of the action can be the temporary account, this would be much easier to achieve.

Isn't the situation symmetric? One is more easily handled when a CU is looking at the temp account, the other when they are looking at the registered account?

Unfortunately no.

Assume for both examples that the only action the temporary account and registered account have associated with an IP is the creation of said registered account.

Example of this for Special:CheckUser:

Special-checkuser-get-ips_en.png (765×922 px, 23 KB)

  • The IPs with "~X from all users" indicates that a logged out user or another account has used that IP in the last 3 months. The absence of this indicates that just this account used that IP in the last 3 months. This information indicates how many actions all users have performed on this IP and is hidden only if all the actions on this IP are made by the account being checked.
  • If the "performer" of the action is the temporary account, then "~X from all users" will appear indicating that another account has made an action on this IP (in this case the creation of the account when using a temporary account)
  • Otherwise (assuming no other usage of this IP), "~X from all users" does not appear which indicates that no other account has used that IP and by extension the link to the temporary account cannot be seen as the checkuser would have no reason to check the IP directly.
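
The indicator behaviour described in the list above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not CheckUser's actual query: the `Action` record, the `ips_for_account` function, and the sample usernames and IP are all hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class Action(NamedTuple):
    performer: str  # username recorded as the performer of the action
    ip: str         # IP address the action came from


def ips_for_account(actions, checked_user):
    """Map each IP used by checked_user to the "from all users" count.

    The value is the number of actions by all users on that IP, or None
    when every action on the IP was made by checked_user (indicator hidden).
    """
    totals = Counter(a.ip for a in actions)
    own = Counter(a.ip for a in actions if a.performer == checked_user)
    return {
        ip: (totals[ip] if totals[ip] != own[ip] else None)
        for ip in own
    }


# Case 1: an extra private row exists with the temporary account as performer.
with_temp_row = [
    Action("*Unregistered 1", "192.0.2.1"),  # hypothetical private row
    Action("Foo", "192.0.2.1"),              # account creation logged for Foo
]
# Case 2: only the named account ever appears as a performer on the IP.
without_temp_row = [Action("Foo", "192.0.2.1")]

print(ips_for_account(with_temp_row, "Foo"))
print(ips_for_account(without_temp_row, "Foo"))
```

In the first case the count is shown when checking Foo, which gives the checkuser a reason to look at the IP directly; in the second case the indicator is hidden and nothing points to the temporary account.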

For Special:Investigate:

IP_options.png (1×2 px, 384 KB)

  • The CheckUser adds the IP used to create the account to the investigation
  • If the "performer" of the action is the temporary account, then the temporary account will appear as a row in this table
  • Otherwise the temporary account will not appear in this table
  • The only way to see this connection would be in the "Timeline" tab after scrolling down to the bottom.

When checking the temporary account using the same assumptions:

  • If the "performer" is the temporary account, then a row separate from the public log entry exists. The public entry, under the name of the registered account, covers the "login" and account creation, so in both examples above either the registered account username or an indication of use by another account would be shown.
  • Otherwise, the account creation never shows up in Special:CheckUser or Special:Investigate when just checking the temporary account because no action regarding the creation is associated with the registered account which means the IP used never appears in the results.

Hopefully my explanation makes sense? If not let me know.

Note: currently temporary and permanent account are connected publicly. cf T357498: Temp account creations do not appear in Special:Log

Another point to debate: if the relation between the temporary and permanent account is not public, should it be available indefinitely? A temporary account itself does not contain PII as long as IPs are removed after 90 days, but there are some edge cases. A temporary user session may somehow be preserved after the permanent account is created, either because (1) the user has a temporary session on one wiki and a regular one on another due to a central login failure, or (2) the session is replicated (e.g. by a browser's backup/sync feature). In that case the IPs would remain available until 90 days after the last temporary account action, which may be more than 90 days after the regular account was created if we do not invalidate the temporary account when a permanent one is created.
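
To make the retention arithmetic in this edge case concrete, here is a small worked example in Python. The dates and the `ip_data_expiry` helper are hypothetical; only the 90-day period comes from the discussion.

```python
from datetime import date, timedelta

RETENTION = timedelta(days=90)  # retention period mentioned in the discussion


def ip_data_expiry(last_temp_action: date) -> date:
    """Date on which IP data for the last temp-account action is removed."""
    return last_temp_action + RETENTION


# Hypothetical dates: the named account is created on 1 January, but a
# surviving temp session (e.g. on another wiki) is last used on 1 March.
named_account_created = date(2024, 1, 1)
last_temp_action = date(2024, 3, 1)

expiry = ip_data_expiry(last_temp_action)
days_after_creation = (expiry - named_account_created).days

print(expiry)               # 2024-05-30
print(days_after_creation)  # 150
```

With these dates, IP data tied to the temporary account outlives the named account's creation by 150 days rather than 90, unless the temporary account is invalidated at signup.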

kostajh subscribed.

Do we want to privately record the link between a temp account and a permanent account, when a temp account creates a permanent account? cc @Dreamy_Jazz @Niharika

That seems sensible to me and should be possible to achieve.

Status quo:

Temp account creation
image.png (422×2 px, 100 KB)
No indication that the temp account created and logged in as a named account, and that the temporary account session ended
Named account creation from a temp account
image.png (350×1 px, 82 KB)
No indication that the named account was created from a temporary account

My proposed changes:

| Scenario | Proposed log entry | Notes |
| --- | --- | --- |
| When viewing a temporary account that created a named account | {temporary account} created named account {named account} | Needed to see the connection to a named account, when viewing a temp account |
| When viewing a temporary account that logged-in to an existing named account | {temporary account} logged-in to named account {named account} | Needed to see the connection to a named account, when viewing a temp account |
| When viewing a named account that was created by a temp account | User account {named account} was created by {temp account} | Needed to see the connection to a temp account, when viewing a named account |
| When viewing a named account that was logged-in to by a temp account | User account {named account} was logged-in to by {temp account} | Needed to see the connection to a temp account, when viewing a named account |
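
As a rough illustration, the four proposed entries above could be rendered from a small template table. This is only a sketch in Python: the `TEMPLATES` keys and the `render_entry` function are hypothetical, and the real extension would presumably use localised interface messages instead.

```python
# Hypothetical lookup keyed by (which account is being viewed, event type).
# The wording mirrors the proposed log entries.
TEMPLATES = {
    ("temp", "create"): "{temp} created named account {named}",
    ("temp", "login"): "{temp} logged-in to named account {named}",
    ("named", "create"): "User account {named} was created by {temp}",
    ("named", "login"): "User account {named} was logged-in to by {temp}",
}


def render_entry(viewing: str, event: str, temp: str, named: str) -> str:
    """Render the proposed log line for the given view and event."""
    return TEMPLATES[(viewing, event)].format(temp=temp, named=named)


print(render_entry("temp", "create", "*Unregistered 1", "Foo"))
print(render_entry("named", "login", "*Unregistered 1", "Foo"))
```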

@kostajh To confirm, this is a private record correct? Will this be exposed through any logs or features?

This is private, in the Special:CheckUser interface.

I think this might be something we want Legal to review before we build it.

I'm not sure I see this point, considering that users who have access to Special:CheckUser could see this event happen through inference. For example, a temporary account makes an edit and then the next action on that IP is a named account signing in.

The only situation in which this inference cannot be made is if the user switches IPs between their last edit using the temporary account and signing in to a named account / creating an account.
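
The inference described here can be sketched as a scan over the per-IP timeline. This is a minimal Python illustration, not how CheckUser works internally: the `Event` record, `infer_links`, and the assumption that temporary account names start with `*` (as in the "*Unregistered N" example) are all hypothetical.

```python
from typing import NamedTuple


class Event(NamedTuple):
    timestamp: int  # seconds since some epoch
    ip: str
    performer: str
    kind: str       # "edit", "login" or "create"


def is_temp(name: str) -> bool:
    # Assumption: temp account names start with "*", as in "*Unregistered N".
    return name.startswith("*")


def infer_links(events):
    """Return (temp, named) pairs where a temp account's action is directly
    followed, on the same IP, by a named account logging in or being created."""
    by_ip = {}
    for e in sorted(events, key=lambda e: e.timestamp):
        by_ip.setdefault(e.ip, []).append(e)
    links = []
    for timeline in by_ip.values():
        for prev, nxt in zip(timeline, timeline[1:]):
            if (is_temp(prev.performer) and not is_temp(nxt.performer)
                    and nxt.kind in ("login", "create")):
                links.append((prev.performer, nxt.performer))
    return links


same_ip = [
    Event(100, "192.0.2.1", "*Unregistered 1", "edit"),
    Event(160, "192.0.2.1", "Foo", "login"),
]
switched_ip = [
    Event(100, "192.0.2.1", "*Unregistered 1", "edit"),
    Event(160, "198.51.100.7", "Foo", "login"),
]

print(infer_links(same_ip))      # link is visible on the shared IP
print(infer_links(switched_ip))  # no link once the IP changes
```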

Dreamy_Jazz renamed this task from Connect temporary and permanent account during signup to Connect temporary and permanent account during signup in Special:CheckUser / Special:Investigate. Sep 26 2024, 10:34 AM

Yes, I was drafting more or less the same comment. This makes it easier to do anti-abuse work. Without doing the work in this task, it's possible to make the connection, it just takes more manual effort.

kostajh changed the task status from Open to Stalled. Sep 30 2024, 7:28 PM
kostajh removed kostajh as the assignee of this task.
kostajh added a subscriber: PBradley-WMF.

This is pending a comment from Legal (cc @PBradley-WMF); marking as stalled for now.

We (WMF Legal) are grateful for the follow-up discussions we had with @kostajh , @Dreamy_Jazz, @Niharika , @Urbanecm_WMF and others. As per the fuller rationale we’ve written up off-Phab, we’re unconvinced that the benefits of logging account ancestry(1), as proposed here, would outweigh the privacy implications, so we’re advising against doing that for now.

(1) "account ancestry": what I'm calling a record of the fact that a temp account has registered another account.