Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repr(tag = ...) for type aliases #3659

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
Prev Previous commit
Next Next commit
Third pass: replace type with discriminant, expand the section on sha…
…dowing, and alter the recommended list of lints
  • Loading branch information
clarfonthey committed Jun 21, 2024
commit 3302274fd3566c0f19a0349176dda21a0bff600e
36 changes: 27 additions & 9 deletions text/0000-repr-type-aliases.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 6,7 @@
# Summary
[summary]: #summary

Primitive representations on enums now accept type aliases, meaning that in addition to primitives like `#[repr(u32)]`, `#[repr(type = core::ffi::c_int)]` and `#[repr(type = my_type)]` are now accepted.
Primitive representations on enums now accept type aliases, meaning that in addition to primitives like `#[repr(u32)]`, `#[repr(discriminant = core::ffi::c_int)]` and `#[repr(discriminant = my_type)]` are now accepted.

# Motivation
[motivation]: #motivation
Expand All @@ -20,25 20,33 @@ For the same reasons why type aliases are useful, having type aliases in `repr`
# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

Enums allow `#[repr(type = ...)]` attributes to offer an explicit discriminant type. (`...` can be any primitive integer type, like `u8`, `i32`, or `usize`, but not `char`.) If all variants of the enum are unit variants, this means that the enum will be easily castable to `type` using `as`. Otherwise, the discriminant will still be of the specified type, but unsafe code is required to actually access it.
Enums allow `#[repr(discriminant = type)]` attributes to offer an explicit discriminant type. (`type` can be any primitive integer type, like `u8`, `i32`, or `usize`, but not `char`.) If all variants of the enum are unit variants, this means that the enum will be easily castable to `type` using `as`. Otherwise, the discriminant will still be of the specified type, but unsafe code is required to actually access it. (Future RFCs like [#3607] might change this.)

To ensure compatibility, the `#[repr(type = ...)]` form is required if the type is not one of the known primitive types. Note that this form is not necessarily equivalent to using the primitive representations directly, since shadowing is possible; for example, if you did `type u32 = u8` and then `#[repr(type = u32)]`, this would be equivalent to `#[repr(u8)]`, not `#[repr(u32)]`.
[#3607]: https://github.com/rust-lang/rfcs/pull/3607

To ensure compatibility, the `#[repr(discriminant = type)]` form is required if the type is a type alias instead of a known primitive type. Depending on circumstances, it might be more beneficial to use the old `#[repr(type)]` syntax instead of `#[repr(discriminant = type)]` syntax, explained below:

> Since `type` is resolved as a path instead of being a hard-coded value, it's possible to shadow existing primitive types and cause weird results. For example, if a scope contains `type u32 = u8`, then `#[repr(discriminant = u32)` means `#[repr(u8)]`, not `#[repr(u32)]`. You can always refer to these types directly, however, and `#[repr(discriminant = ::std::u32)]` will always mean `#[repr(u32)]`.
>
> It's also worth noting that this syntax allows using type aliases with names of other `repr` arguments, like `C`. For example, `type C = u8` in the scope means that `#[repr(discriminant = C)]` means `#[repr(u8)]`, not `#[repr(C)]`.

You can use any type alias in the `repr` attribute, but it *must* be an alias to an accepted primitive type like `u8` or `i32`, and cannot be a pointer, reference, struct, etc. See the [future possibilities] section for some potential alternatives that may be allowed in the future.

You can use any type alias in the `repr` attribute, but it *must* be an alias to an accepted primitive type like `u8` or `i32`, and cannot be a pointer, reference, struct, etc.
[future possibilities]: #Future_possibilities

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

The `repr` attribute now accepts a `type = ...` argument to indicate a resolved path instead of a well-known primitive type. If those the path resolves to a type alias to a valid primitive type which can be used in the `repr` attribute, that will be used as the actual discriminant representation.
The `repr` attribute now accepts a `discriminant = ...` argument to indicate a resolved path instead of a well-known primitive type. If those the path resolves to a type alias to a valid primitive type which can be used in the `repr` attribute, that will be used as the actual discriminant representation.

An additional, automatically-applicable lint should be added that warns a user if they use `type = ...` for a well-known primitive type, since adding `type = ` instead of using the type directly introduces the possibility of shadowing. (For example, `#[repr(type = u32)]` becomes `#[repr(u32)]`.)
An automatically-applicable lint should be added that warns a user if a `repr` argument is invalid but references in an-scope type alias without the `discriminant = ` prefix. (For example, `#[repr(MyType)]` becomes `#[repr(discriminant = MyType)]`, but `#[repr(C)]` does not warn even if an alias `C` is in scope.)

Similarly, an automatically-applicable lint should be added that warns a user if a `repr` argument references in an-scope type alias without the `type = ` prefix. (For example, `#[repr(MyType)]` becomes `#[repr(type = MyType)]`.)
To aid macro authors, an allow-by-default lint should be added that warns a user if they use `discriminant = ...` with a relative path instead of an absolute one, since the resolved type depends on the existing scope. For example, `discriminant = u32` or `discriminant = my_type` could potentially be shadowed by another type in the scope, but `discriminant = ::std::u32` and `discriminant = $crate::my_type` will always work as expected. This could potentially be relegated to a `clippy::pedantic` lint instead of a `rustc` lint.

# Drawbacks
clarfonthey marked this conversation as resolved.
Show resolved Hide resolved
[drawbacks]: #drawbacks
clarfonthey marked this conversation as resolved.
Show resolved Hide resolved

The requirement for `type =` is unfortunate, but it feels like the best way to ensure that adding new representations isn't a breaking change going forward. Even if we were to decide it weren't a "breaking change," it would still break things anyway, being de-facto breaking.
The requirement for `discriminant =` is unfortunate, but it feels like the best way to ensure that adding new representations isn't a breaking change going forward. Even if we were to decide it weren't a "breaking change," it would still break things anyway, being de-facto breaking.

And, of course, this complicates the compiler. But that's about it.

Expand All @@ -49,6 57,10 @@ We could always not do this.

But more realistically, here are some alternative designs that were rejected.

## `type`

The second proposal for this RFC used `type = ...` instead of `discriminant = ...`. Initially, this decision was chosen because `discriminant` was long to type, but it was ultimately decided that `discriminant` is more clear and that the extra typing is worth it. Additionally, RFC [#3607] proposes using `discriminant` in its proposed syntax, so, this would be further in line with that as well.
Copy link
Member

@jswrenn jswrenn Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we intend for repr(discriminant = u8) to be equivalent to repr(u8), then I'd like to propose that we perhaps shouldn't use repr(discriminant = ...) since it might complicate a future possibility for enums.

That future possibility — alluded to in this RFC — is that we someday might want to expand the accepted set of discriminant types of enums (e.g., to include things implementing StructuralEq). And, if we had such functionality, it might also be nice to declare up front what the discriminant type is. A natural syntax for this could be #[repr(discriminant = ...)].

However, repr(discriminant = ...), as proposed by this RFC, does not only set the discriminant type of an enum, but also affects its memory layout. This is not desirable for all enums. If this RFC is adopted as-is, we'd need to find an alternative way to say "the discriminant type of this enum is X, but I don't care what the enum's in-memory representation is".

Given this, I'd like to suggest that this RFC consider repr(tag = ...). Doing so has three advantages:

  1. it makes it clearer that the annotation has an affect on the enum's in-memory representation, not just its logical representation
  2. it keeps repr(discriminant = ...) syntactically reserved for the aforementioned future possibility in which the user explicitly provides the type of the discriminant
  3. it's significantly shorter than "discriminant"

If not repr(tag = ...) than perhaps something else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused how labeling this a tag makes it less confusing. It's a shorter word, but that's about it.

We've already established that we're changing the memory layout by using repr.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering that these are often called "tagged unions" in contexts where memory layout matters, I do think the name is clearer. It's also going to be consistent with other in-flight changes to Rust's documentation: rust-lang/reference#1454 (comment)

What are the advantages of discriminant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are often called this, but we also explicitly use the term discriminant in places where it has an API meaning. See: https://doc.rust-lang.org/nightly/std/mem/fn.discriminant.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the lang meeting today, we decided on tag. We felt that the existing mem::discriminant API was unfortunate, and unloved in enough other ways, that we didn't need to weigh that heavily as precedent. See also here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very fair. I think that it would be worth proposing to deprecate & rename that API if this is the route we're going, to ensure uniformity.

Copy link
Contributor Author

@clarfonthey clarfonthey Jul 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Also, while the decision on the rename is resolved, I'm going to unresolve this until I update the RFC to rename the attribute, just so it's easier to track. Plus, it would be helpful to clarify all the finer details on how they should be written into the RFC before this gets marked as resolved.)


## `self::`

We could, instead of using `type =`, require that all types contain a double-colon to indicate they're a path, effectively preventing collisions with arguments that aren't paths. This would require using `self::` for types that are imported in the local scope, and was actually the first proposal of this RFC, but wasn't very well-received.
clarfonthey marked this conversation as resolved.
Show resolved Hide resolved
Expand Down Expand Up @@ -79,6 91,12 @@ None currently.
# Future possibilities
[future-possibilities]: #future-possibilities

Future RFCs like [#3607] propose explicit methods of obtaining enum discriminants, and that further justifies the desire to include a change like this. There aren't many other extensions that could be added, however.
In the future, more potential discriminant types could be added which might be useful:

* `repr(transparent)` structs over valid types
* Floating-point primitives
* `char` or other restricted primitives like `NonZeroU*` and `NonZeroI*`

Of course, all of these would require larger changes to the compiler than the ones proposed, and are currently left as future extensions.

[#3607]: https://github.com/rust-lang/rfcs/pull/3607