Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for const fn type_id #77125

Open
4 tasks
KodrAus opened this issue Sep 23, 2020 · 24 comments
Open
4 tasks

Tracking Issue for const fn type_id #77125

KodrAus opened this issue Sep 23, 2020 · 24 comments
Labels
A-const-fn Area: const fn foo(..) {..}. Pure functions which can be applied at compile time. B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. S-tracking-impl-incomplete Status: The implementation is incomplete. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@KodrAus
Copy link
Contributor

KodrAus commented Sep 23, 2020

This is a tracking issue for using TypeId::of in constant contexts.
The feature gate for the issue is #![feature(const_type_id)].

About tracking issues

Tracking issues are used to record the overall progress of implementation.
They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions.
A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature.
Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

Unresolved Questions

  • How can we prevent abuses of transmuting type ids into integers at compile time (where we have fewer tools to deal with them)? See the discussion in Widen TypeId from 64 bits to 128. #75923
  • We need a new scheme for building type ids that's collision resistant. We'd like to do this before stabilizing type ids in constant contexts

Implementation history

@KodrAus KodrAus added the C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. label Sep 23, 2020
@KodrAus KodrAus added T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. A-const-fn Area: const fn foo(..) {..}. Pure functions which can be applied at compile time. B-unstable Blocker: Implemented in the nightly compiler and unstable. labels Sep 23, 2020
@KodrAus KodrAus added the Libs-Tracked Libs issues that are tracked on the team's project board. label Nov 6, 2020
@benkay86
Copy link

benkay86 commented Jun 2, 2021

I'm not sure I totally understand the steps needed to re-stabilize const fn TypeId::of().

  • Come up with a new scheme for collision resistant type ids
  • We need a new scheme for building type ids that's collision resistant. We'd like to do this before stabilizing type ids in constant contexts

Does this affect the soundness of const fn TypeId::of()? If we are really worried about collisions, then shouldn't the entirety of TypeId be unstable, and not just the const version? Or are we worried that a future collision-resistant implementation of TypeId may not satisfy const, i.e. we want to keep open the possibility of a non-const implementation in the future?

  • Figure out any safeguards we need to prevent messing with the bits of type id directly
  • How can we prevent abuses of transmuting type ids into integers at compile time (where we have fewer tools to deal with them)? See the discussion in Widen TypeId from 64 bits to 128. #75923

This seems impossible. If someone wants to transmute() to the type of the private implementation (i.e. u64) in an unsafe code block, even though the documentation for TypeId explicitly says not to do this, can we really stop them? (They can already do it at runtime, not sure what "tools" prevent this.) Can we really stop anyone from doing unsafe things with unsafe code?

Motivation: It would be extremely useful to make const comparisons between TypeId in order to infer whether two types are equal at compile time. Currently this only appears to be possible using specialization (#31844), which is unsound and will not be stabilized anytime soon.

A perhaps related concern (may need a separate tracking issue) is that #derive[(PartialEq, Eq, ...)] struct TypeId does not generate const impl for these traits (see #77695), so even if const fn TypeId::of() gets re-stabilized we still won't be able to compare types at compile time without writing a const impl PartialEq for TypeId, etc.

@jmeggitt
Copy link

jmeggitt commented Aug 9, 2021

Can this be re-stabilized now?

@KodrAus Going through the comment that was the basis of reverting stabilization (#75923 (comment)):

Part 1

TypeId is one a few types (mem::Discriminant is the only other one I can think of) that's a thin wrapper for a private integer, at least right now. This is ripe for abuse no matter what we say/document it as.

Sure, but this suggests that retrieving a TypeId should be unsafe not unstable. Any is currently both stable and safe despite heavily relying on const fn TypeId::of(). If there are serious concerns that const fn TypeId::of() may result in collisions then Any should be unstable as well.

Part 2

I'm not saying we should cater to this, but rather avoid stabilizing even worse abuse-enabling tools like const fn TypeId::of.
Like I said, I am already aware of usecases where compile-time transmute(TypeId::of::<T>()) is an uniquely useful tool and someone might not think twice before releasing a crate which uses it somewhere internally and becomes widely-used.

I think @benkay86 summed this up fairly well.

If someone wants to transmute() to the type of the private implementation (i.e. u64) in an unsafe code block, even though the documentation for TypeId explicitly says not to do this, can we really stop them? (They can already do it at runtime, not sure what "tools" prevent this.) Can we really stop anyone from doing unsafe things with unsafe code?

Part 3

And finally, @eddyb remarks that we could potentially re-stabilize it if we are unlikely to change it.

If we decide we're unlikely to change the implementation of TypeId, then we can re-stabilize the const fn aspect, and live with the consequences. But if we do want to change the implementation away from a hash, we have to either do that first, or at least teach miri about it somehow so that it disallows using the value as a plain integer (for arithmetic or e.g. as an array type length).

It has been nearly 10 months since it was marked unstable and there have been no changes regarding this issue. At this point it seems like the implementation of TypeId is unlikely to change any time in the near future.

TL;DR

Plus even if we assume all of these points are true, these concerns only apply to intrinsics::type_id::<T>() -> u64 not const fn TypeId::of::<T>() -> TypeId.

If anything, the heavy reliance on const fn TypeId::of() inside Any suggests that it is stable but could potentially be unsafe to use directly. If there are concerns that TypeId can be abused without unsafe code then const fn TypeId::of() should be marked as unsafe instead. However I do not think this is necessary since you would need to use transmute() to extract the raw type id integer from TypeId.

intrinsics::type_id::<T>() should remain unstable as it provides access to the raw type id integer, but const fn TypeId::of() should be stabilized as it provides adequate protection.

@eddyb
Copy link
Member

eddyb commented Aug 9, 2021

At this point it seems like the implementation of TypeId is unlikely to change any time in the near future.

Nobody tried AFAIK. 10 months is pretty short on the scale of Rust evolution, especially when you take into account the year 2020.

But also I believe it got stalled because of this potential breakage: #75923 (comment)

In order to guarantee code that transmutes to u64 gets broken, we can't just replace TypeId with a pointer, but because we still want the 64-bit hash (for both "fast reject (!=)" and Hash, see #75923 (comment)), we can probably do this:

struct TypeId {
    hash: u64,
    // `str` can be replaced with a length-prefixed string to keep `TypeId`'s size down
    v0_mangled_type: &'static str,
}
impl PartialEq for TypeId {
    fn eq(&self, other: &TypeId) -> bool {
        self.hash == other.hash // fast reject (!=)
        && (
            self.v0_mangled_type as *const str == other.v0_mangled_type as *const str // fast accept (==)
            || self.v0_mangled_type == other.v0_mangled_type
        )
    }
}

That way, it's always more than 8 bytes, no matter the target pointer size, and so a transmute to u64 will always error.

As for the linker concerns, if we take this specific approach, the only difference relying on their features would make is increasingly the situations in which the "fast accept (==)" path gets taken, i.e. an optimization.

I could try make a PR for this, but it would likely at least require #87194 (to avoid breaking when unstable const generic features are used).

If there are concerns that TypeId can be abused without unsafe code then const fn TypeId::of() should be marked as unsafe instead.

But we don't want that. You can't change TypeId or Any to be unsafe, they're stable, and nearly-sound.

@programmerjake
Copy link
Member

struct TypeId {
    hash: u64,
    // `str` can be replaced with a length-prefixed string to keep `TypeId`'s size down
    v0_mangled_type: &'static str,
}
impl PartialEq for TypeId {
    fn eq(&self, other: &TypeId) -> bool {
        self.hash == other.hash // fast reject (!=)
        && (
            self.v0_mangled_type as *const str == other.v0_mangled_type as *const str // fast accept (==)
            || self.v0_mangled_type == other.v0_mangled_type
        )
    }
}

Maybe it would be better to use this to shrink TypeId to only 16 bytes on 64-bit platforms:

use core::{ptr, slice};

#[repr(C)]
struct StrWithLen<T: ?Sized> {
    len: u16, // assume all TypeId strings are <64kB
    text: T,
}

pub struct TypeId {
    hash: u64,
    text: &'static StrWithLen<()>, // actually StrWithLen<str>
}

impl PartialEq for TypeId {
    fn eq(&self, rhs: &Self) -> bool {
        if self.hash != rhs.hash {
            false
        } else if ptr::eq(self.text, rhs.text) {
            true
        } else {
            unsafe {
                let lhs_text = slice::from_raw_parts(
                    &self.text.text as *const _ as *const u8,
                    self.text.len as usize,
                );
                let rhs_text = slice::from_raw_parts(
                    &rhs.text.text as *const _ as *const u8,
                    rhs.text.len as usize,
                );
                lhs_text == rhs_text
            }
        }
    }
}

@eddyb
Copy link
Member

eddyb commented Aug 10, 2021

Yeah, that's what I meant by "length-prefixed". There's a "standard" way to do that using extern { type }, I just didn't want to type it in.

@eddyb
Copy link
Member

eddyb commented Apr 9, 2022

Update: finally opened #95845 with a (v0) mangling-based TypeId representation (should unblock const fn TypeId::of, while at the same time making specific uses of TypeId incompatible with CTFE, for now).

@lcnr lcnr added the S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. label May 18, 2022
@lcnr
Copy link
Contributor

lcnr commented May 18, 2022

blocking the stabilization of this on #97156

bors bot pushed a commit to bevyengine/bevy that referenced this issue Jun 9, 2022
…4042)

# Objective

> Resolves #4504

It can be helpful to have access to type information without requiring an instance of that type. Especially for `Reflect`, a lot of the gathered type information is known at compile-time and should not necessarily require an instance.

## Solution

Created a dedicated `TypeInfo` enum to store static type information. All types that derive `Reflect` now also implement the newly created `Typed` trait:

```rust
pub trait Typed: Reflect {
  fn type_info() -> &'static TypeInfo;
}
```

> Note: This trait was made separate from `Reflect` due to `Sized` restrictions.

If you only have access to a `dyn Reflect`, just call `.get_type_info()` on it. This new trait method on `Reflect` should return the same value as if you had called it statically. 

If all you have is a `TypeId` or type name, you can get the `TypeInfo` directly from the registry using the `TypeRegistry::get_type_info` method (assuming it was registered).

### Usage

Below is an example of working with `TypeInfo`. As you can see, we don't have to generate an instance of `MyTupleStruct` in order to get this information.

```rust
#[derive(Reflect)]
struct MyTupleStruct(usize, i32, MyStruct);

let info = MyTupleStruct::type_info();
if let TypeInfo::TupleStruct(info) = info {
  assert!(info.is::<MyTupleStruct>());
  assert_eq!(std::any::type_name::<MyTupleStruct>(), info.type_name());
  assert!(info.field_at(1).unwrap().is::<i32>());
} else {
  panic!("Expected `TypeInfo::TupleStruct`");
}
```

### Manual Implementations

It's not recommended to manually implement `Typed` yourself, but if you must, you can use the `TypeInfoCell` to automatically create and manage the static `TypeInfo`s for you (which is very helpful for blanket/generic impls):

```rust
use bevy_reflect::{Reflect, TupleStructInfo, TypeInfo, UnnamedField};
use bevy_reflect::utility::TypeInfoCell;

struct Foo<T: Reflect>(T);

impl<T: Reflect> Typed for Foo<T> {
  fn type_info() -> &'static TypeInfo {
    static CELL: TypeInfoCell = TypeInfoCell::generic();
    CELL.get_or_insert::<Self, _>(|| {
      let fields = [UnnamedField::new::<T>()];
      let info = TupleStructInfo::new::<Self>(&fields);
      TypeInfo::TupleStruct(info)
    })
  }
}
```

## Benefits

One major benefit is that this opens the door to other serialization methods. Since we can get all the type info at compile time, we can know how to properly deserialize something like:

```rust
#[derive(Reflect)]
struct MyType {
  foo: usize,
  bar: Vec<String>
}

// RON to be deserialized:
(
  type: "my_crate::MyType", // <- We now know how to deserialize the rest of this object
  value: {
    // "foo" is a value type matching "usize"
    "foo": 123,
    // "bar" is a list type matching "Vec<String>" with item type "String"
    "bar": ["a", "b", "c"]
  }
)
```

Not only is this more compact, but it has better compatibility (we can change the type of `"foo"` to `i32` without having to update our serialized data).

Of course, serialization/deserialization strategies like this may need to be discussed and fully considered before possibly making a change. However, we will be better equipped to do that now that we can access type information right from the registry.

## Discussion

Some items to discuss:

1. Duplication. There's a bit of overlap with the existing traits/structs since they require an instance of the type while the type info structs do not (for example, `Struct::field_at(&self, index: usize)` and `StructInfo::field_at(&self, index: usize)`, though only `StructInfo` is accessible without an instance object). Is this okay, or do we want to handle it in another way?
2. Should `TypeInfo::Dynamic` be removed? Since the dynamic types don't have type information available at runtime, we could consider them `TypeInfo::Value`s (or just even just `TypeInfo::Struct`). The intention with `TypeInfo::Dynamic` was to keep the distinction from these dynamic types and actual structs/values since users might incorrectly believe the methods of the dynamic type's info struct would map to some contained data (which isn't possible statically).
4. General usefulness of this change, including missing/unnecessary parts.
5. Possible changes to the scene format? (One possible issue with changing it like in the example above might be that we'd have to be careful when handling generic or trait object types.)

## Compile Tests

I ran a few tests to compare compile times (as suggested [here](#4042 (comment))). I toggled `Reflect` and `FromReflect` derive macros using `cfg_attr` for both this PR (aa5178e) and main (c309acd).

<details>
<summary>See More</summary>

The test project included 250 of the following structs (as well as a few other structs):

```rust
#[derive(Default)]
#[cfg_attr(feature = "reflect", derive(Reflect))]
#[cfg_attr(feature = "from_reflect", derive(FromReflect))]
pub struct Big001 {
    inventory: Inventory,
    foo: usize,
    bar: String,
    baz: ItemDescriptor,
    items: [Item; 20],
    hello: Option<String>,
    world: HashMap<i32, String>,
    okay: (isize, usize, /* wesize */),
    nope: ((String, String), (f32, f32)),
    blah: Cow<'static, str>,
}
```

> I don't know if the compiler can optimize all these duplicate structs away, but I think it's fine either way. We're comparing times, not finding the absolute worst-case time.

I only ran each build 3 times using `cargo build --timings` (thank you @devil-ira), each of which were preceeded by a `cargo clean --package bevy_reflect_compile_test`. 

Here are the times I got:

| Test                             | Test 1 | Test 2 | Test 3 | Average |
| -------------------------------- | ------ | ------ | ------ | ------- |
| Main                             | 1.7s   | 3.1s   | 1.9s   | 2.33s   |
| Main   `Reflect`                 | 8.3s   | 8.6s   | 8.1s   | 8.33s   |
| Main   `Reflect`   `FromReflect` | 11.6s  | 11.8s  | 13.8s  | 12.4s   |
| PR                               | 3.5s   | 1.8s   | 1.9s   | 2.4s    |
| PR   `Reflect`                   | 9.2s   | 8.8s   | 9.3s   | 9.1s    |
| PR   `Reflect`   `FromReflect`   | 12.9s  | 12.3s  | 12.5s  | 12.56s  |

</details>

---

## Future Work

Even though everything could probably be made `const`, we unfortunately can't. This is because `TypeId::of::<T>()` is not yet `const` (see rust-lang/rust#77125). When it does get stabilized, it would probably be worth coming back and making things `const`. 

Co-authored-by: MrGVSV <49806985 [email protected]>
@zakarumych
Copy link
Contributor

Why would anyone use transmute to extract integer value from TypeId?
Use Hash::hash to avoid unsafety :)

I know that Hash implementation for TypeId does not guarantee to emit inner value from TypeId unmodified, but transmuting opaque type is no better.

And why it is a problem anyway? I thought that transmuting arbitrary u64 back to TypeId is.

@eddyb
Copy link
Member

eddyb commented Jul 21, 2022

I'd say this is blocked on #99189 for now, if for no other reason than the "structural match" aspect.
(i.e. future-proofing TypeId by disallowing using TypeId constants as patterns and requiring ==)

@soqb
Copy link
Contributor

soqb commented Aug 20, 2022

to work around #97156 and #99189, surely we can just add a new const unsafe fn const_of() -> TypeId to TypeId or even a new ConstTypeId type that uses the v0 mangled type. it seems the only problem with making TypeId::of() const is that it must be either non-const and safe or const and unsafe and since making a method unsafe is a breaking change, we cannot make it const. it seems obvious to me just to add a new const unsafe fn method and/or type that can explicitly warn the user of its soundness holes and potentially deprecate TypeId::of().

@jmeggitt
Copy link

I'd say this is blocked on #99189 for now, if for no other reason than the "structural match" aspect. (i.e. future-proofing TypeId by disallowing using TypeId constants as patterns and requiring ==)

@eddyb What is the reasoning behind disallowing a structural match? While I can not name any specific reasons for its inclusion, I also can not think of any reasons to remove it either.

Take for example this code. It does not perform any complex pattern matching or require that a match statement be used, but it arguably makes the code a bit cleaner and easier to read than the alternative.

trait ConstTypeId {
    const ID: TypeId;
}

impl<T: 'static> ConstTypeId for T {
    const ID: TypeId = TypeId::of::<T>();
}

match foo.type_id {
    u16::ID | u32::ID | u64::ID => do_foo(&foo.data),
    Foo::ID | Bar::ID | Baz::ID => do_bar(&foo.data),
    _ => do_default(&foo.data),
}

This would generally serve the use case of "I should probably be using an enum, but there are a lot of variants and it is a pain to write it out". I'm partially joking, but there are some use cases where an enum can be more cumbersome such as when dealing with highly generic systems and associating resources with anonymous traits. That being said, I am unsure if those systems actually make use of use a match as to ==.

@eddyb
Copy link
Member

eddyb commented Aug 21, 2022

The problem is any future changes to TypeId that would require address-based comparison in a custom PartialEq implementation.

(See all previous discussion across this issue and my two closed PRs on the matter of non-trivial TypeId representations)

Honestly, our structural match rules are very loose and prone to bite us eventually.
What that is means that in principle no core/alloc/std type with private and unguaranteed internals should auto-derive PartialEq, Eq and provide a compile-time constructor, because that "locks in" the fact that the impls are auto-derived, i.e. that all fields themselves are Eq.

It's not a clear independent opt-in, for ergonomics reasons, but that also makes it forwards-incompatable to "let it happen".
(This applies to any crates but the standard library doesn't get to just release a new semver-incompat version, and editions offer only limited protection, and tend to have 3yr timeframes)

EDIT: oh, another note on structural match: I believe it was never an intentionally designed feature, just match falling back to (even overloaded!) ==, from pre-1.0 days, that we didn't remove in time, so we had to paper over it with whatever semantics let most existing code keep compiling (IIRC we only took a stance on floats and even that took many years).

james7132 pushed a commit to james7132/bevy that referenced this issue Oct 28, 2022
…evyengine#4042)

# Objective

> Resolves bevyengine#4504

It can be helpful to have access to type information without requiring an instance of that type. Especially for `Reflect`, a lot of the gathered type information is known at compile-time and should not necessarily require an instance.

## Solution

Created a dedicated `TypeInfo` enum to store static type information. All types that derive `Reflect` now also implement the newly created `Typed` trait:

```rust
pub trait Typed: Reflect {
  fn type_info() -> &'static TypeInfo;
}
```

> Note: This trait was made separate from `Reflect` due to `Sized` restrictions.

If you only have access to a `dyn Reflect`, just call `.get_type_info()` on it. This new trait method on `Reflect` should return the same value as if you had called it statically. 

If all you have is a `TypeId` or type name, you can get the `TypeInfo` directly from the registry using the `TypeRegistry::get_type_info` method (assuming it was registered).

### Usage

Below is an example of working with `TypeInfo`. As you can see, we don't have to generate an instance of `MyTupleStruct` in order to get this information.

```rust
#[derive(Reflect)]
struct MyTupleStruct(usize, i32, MyStruct);

let info = MyTupleStruct::type_info();
if let TypeInfo::TupleStruct(info) = info {
  assert!(info.is::<MyTupleStruct>());
  assert_eq!(std::any::type_name::<MyTupleStruct>(), info.type_name());
  assert!(info.field_at(1).unwrap().is::<i32>());
} else {
  panic!("Expected `TypeInfo::TupleStruct`");
}
```

### Manual Implementations

It's not recommended to manually implement `Typed` yourself, but if you must, you can use the `TypeInfoCell` to automatically create and manage the static `TypeInfo`s for you (which is very helpful for blanket/generic impls):

```rust
use bevy_reflect::{Reflect, TupleStructInfo, TypeInfo, UnnamedField};
use bevy_reflect::utility::TypeInfoCell;

struct Foo<T: Reflect>(T);

impl<T: Reflect> Typed for Foo<T> {
  fn type_info() -> &'static TypeInfo {
    static CELL: TypeInfoCell = TypeInfoCell::generic();
    CELL.get_or_insert::<Self, _>(|| {
      let fields = [UnnamedField::new::<T>()];
      let info = TupleStructInfo::new::<Self>(&fields);
      TypeInfo::TupleStruct(info)
    })
  }
}
```

## Benefits

One major benefit is that this opens the door to other serialization methods. Since we can get all the type info at compile time, we can know how to properly deserialize something like:

```rust
#[derive(Reflect)]
struct MyType {
  foo: usize,
  bar: Vec<String>
}

// RON to be deserialized:
(
  type: "my_crate::MyType", // <- We now know how to deserialize the rest of this object
  value: {
    // "foo" is a value type matching "usize"
    "foo": 123,
    // "bar" is a list type matching "Vec<String>" with item type "String"
    "bar": ["a", "b", "c"]
  }
)
```

Not only is this more compact, but it has better compatibility (we can change the type of `"foo"` to `i32` without having to update our serialized data).

Of course, serialization/deserialization strategies like this may need to be discussed and fully considered before possibly making a change. However, we will be better equipped to do that now that we can access type information right from the registry.

## Discussion

Some items to discuss:

1. Duplication. There's a bit of overlap with the existing traits/structs since they require an instance of the type while the type info structs do not (for example, `Struct::field_at(&self, index: usize)` and `StructInfo::field_at(&self, index: usize)`, though only `StructInfo` is accessible without an instance object). Is this okay, or do we want to handle it in another way?
2. Should `TypeInfo::Dynamic` be removed? Since the dynamic types don't have type information available at runtime, we could consider them `TypeInfo::Value`s (or just even just `TypeInfo::Struct`). The intention with `TypeInfo::Dynamic` was to keep the distinction from these dynamic types and actual structs/values since users might incorrectly believe the methods of the dynamic type's info struct would map to some contained data (which isn't possible statically).
4. General usefulness of this change, including missing/unnecessary parts.
5. Possible changes to the scene format? (One possible issue with changing it like in the example above might be that we'd have to be careful when handling generic or trait object types.)

## Compile Tests

I ran a few tests to compare compile times (as suggested [here](bevyengine#4042 (comment))). I toggled `Reflect` and `FromReflect` derive macros using `cfg_attr` for both this PR (aa5178e) and main (c309acd).

<details>
<summary>See More</summary>

The test project included 250 of the following structs (as well as a few other structs):

```rust
#[derive(Default)]
#[cfg_attr(feature = "reflect", derive(Reflect))]
#[cfg_attr(feature = "from_reflect", derive(FromReflect))]
pub struct Big001 {
    inventory: Inventory,
    foo: usize,
    bar: String,
    baz: ItemDescriptor,
    items: [Item; 20],
    hello: Option<String>,
    world: HashMap<i32, String>,
    okay: (isize, usize, /* wesize */),
    nope: ((String, String), (f32, f32)),
    blah: Cow<'static, str>,
}
```

> I don't know if the compiler can optimize all these duplicate structs away, but I think it's fine either way. We're comparing times, not finding the absolute worst-case time.

I only ran each build 3 times using `cargo build --timings` (thank you @devil-ira), each of which were preceeded by a `cargo clean --package bevy_reflect_compile_test`. 

Here are the times I got:

| Test                             | Test 1 | Test 2 | Test 3 | Average |
| -------------------------------- | ------ | ------ | ------ | ------- |
| Main                             | 1.7s   | 3.1s   | 1.9s   | 2.33s   |
| Main   `Reflect`                 | 8.3s   | 8.6s   | 8.1s   | 8.33s   |
| Main   `Reflect`   `FromReflect` | 11.6s  | 11.8s  | 13.8s  | 12.4s   |
| PR                               | 3.5s   | 1.8s   | 1.9s   | 2.4s    |
| PR   `Reflect`                   | 9.2s   | 8.8s   | 9.3s   | 9.1s    |
| PR   `Reflect`   `FromReflect`   | 12.9s  | 12.3s  | 12.5s  | 12.56s  |

</details>

---

## Future Work

Even though everything could probably be made `const`, we unfortunately can't. This is because `TypeId::of::<T>()` is not yet `const` (see rust-lang/rust#77125). When it does get stabilized, it would probably be worth coming back and making things `const`. 

Co-authored-by: MrGVSV <49806985 [email protected]>
ItsDoot pushed a commit to ItsDoot/bevy that referenced this issue Feb 1, 2023
…evyengine#4042)

# Objective

> Resolves bevyengine#4504

It can be helpful to have access to type information without requiring an instance of that type. Especially for `Reflect`, a lot of the gathered type information is known at compile-time and should not necessarily require an instance.

## Solution

Created a dedicated `TypeInfo` enum to store static type information. All types that derive `Reflect` now also implement the newly created `Typed` trait:

```rust
pub trait Typed: Reflect {
  fn type_info() -> &'static TypeInfo;
}
```

> Note: This trait was made separate from `Reflect` due to `Sized` restrictions.

If you only have access to a `dyn Reflect`, just call `.get_type_info()` on it. This new trait method on `Reflect` should return the same value as if you had called it statically. 

If all you have is a `TypeId` or type name, you can get the `TypeInfo` directly from the registry using the `TypeRegistry::get_type_info` method (assuming it was registered).

### Usage

Below is an example of working with `TypeInfo`. As you can see, we don't have to generate an instance of `MyTupleStruct` in order to get this information.

```rust
#[derive(Reflect)]
struct MyTupleStruct(usize, i32, MyStruct);

let info = MyTupleStruct::type_info();
if let TypeInfo::TupleStruct(info) = info {
  assert!(info.is::<MyTupleStruct>());
  assert_eq!(std::any::type_name::<MyTupleStruct>(), info.type_name());
  assert!(info.field_at(1).unwrap().is::<i32>());
} else {
  panic!("Expected `TypeInfo::TupleStruct`");
}
```

### Manual Implementations

It's not recommended to manually implement `Typed` yourself, but if you must, you can use the `TypeInfoCell` to automatically create and manage the static `TypeInfo`s for you (which is very helpful for blanket/generic impls):

```rust
use bevy_reflect::{Reflect, TupleStructInfo, TypeInfo, UnnamedField};
use bevy_reflect::utility::TypeInfoCell;

struct Foo<T: Reflect>(T);

impl<T: Reflect> Typed for Foo<T> {
  fn type_info() -> &'static TypeInfo {
    static CELL: TypeInfoCell = TypeInfoCell::generic();
    CELL.get_or_insert::<Self, _>(|| {
      let fields = [UnnamedField::new::<T>()];
      let info = TupleStructInfo::new::<Self>(&fields);
      TypeInfo::TupleStruct(info)
    })
  }
}
```

## Benefits

One major benefit is that this opens the door to other serialization methods. Since we can get all the type info at compile time, we can know how to properly deserialize something like:

```rust
#[derive(Reflect)]
struct MyType {
  foo: usize,
  bar: Vec<String>
}

// RON to be deserialized:
(
  type: "my_crate::MyType", // <- We now know how to deserialize the rest of this object
  value: {
    // "foo" is a value type matching "usize"
    "foo": 123,
    // "bar" is a list type matching "Vec<String>" with item type "String"
    "bar": ["a", "b", "c"]
  }
)
```

Not only is this more compact, but it has better compatibility (we can change the type of `"foo"` to `i32` without having to update our serialized data).

Of course, serialization/deserialization strategies like this may need to be discussed and fully considered before possibly making a change. However, we will be better equipped to do that now that we can access type information right from the registry.

## Discussion

Some items to discuss:

1. Duplication. There's a bit of overlap with the existing traits/structs since they require an instance of the type while the type info structs do not (for example, `Struct::field_at(&self, index: usize)` and `StructInfo::field_at(&self, index: usize)`, though only `StructInfo` is accessible without an instance object). Is this okay, or do we want to handle it in another way?
2. Should `TypeInfo::Dynamic` be removed? Since the dynamic types don't have type information available at runtime, we could consider them `TypeInfo::Value`s (or just even just `TypeInfo::Struct`). The intention with `TypeInfo::Dynamic` was to keep the distinction from these dynamic types and actual structs/values since users might incorrectly believe the methods of the dynamic type's info struct would map to some contained data (which isn't possible statically).
4. General usefulness of this change, including missing/unnecessary parts.
5. Possible changes to the scene format? (One possible issue with changing it like in the example above might be that we'd have to be careful when handling generic or trait object types.)

## Compile Tests

I ran a few tests to compare compile times (as suggested [here](bevyengine#4042 (comment))). I toggled `Reflect` and `FromReflect` derive macros using `cfg_attr` for both this PR (aa5178e) and main (c309acd).

<details>
<summary>See More</summary>

The test project included 250 of the following structs (as well as a few other structs):

```rust
#[derive(Default)]
#[cfg_attr(feature = "reflect", derive(Reflect))]
#[cfg_attr(feature = "from_reflect", derive(FromReflect))]
pub struct Big001 {
    inventory: Inventory,
    foo: usize,
    bar: String,
    baz: ItemDescriptor,
    items: [Item; 20],
    hello: Option<String>,
    world: HashMap<i32, String>,
    okay: (isize, usize, /* wesize */),
    nope: ((String, String), (f32, f32)),
    blah: Cow<'static, str>,
}
```

> I don't know if the compiler can optimize all these duplicate structs away, but I think it's fine either way. We're comparing times, not finding the absolute worst-case time.

I only ran each build 3 times using `cargo build --timings` (thank you @devil-ira), each of which were preceeded by a `cargo clean --package bevy_reflect_compile_test`. 

Here are the times I got:

| Test                             | Test 1 | Test 2 | Test 3 | Average |
| -------------------------------- | ------ | ------ | ------ | ------- |
| Main                             | 1.7s   | 3.1s   | 1.9s   | 2.33s   |
| Main   `Reflect`                 | 8.3s   | 8.6s   | 8.1s   | 8.33s   |
| Main   `Reflect`   `FromReflect` | 11.6s  | 11.8s  | 13.8s  | 12.4s   |
| PR                               | 3.5s   | 1.8s   | 1.9s   | 2.4s    |
| PR   `Reflect`                   | 9.2s   | 8.8s   | 9.3s   | 9.1s    |
| PR   `Reflect`   `FromReflect`   | 12.9s  | 12.3s  | 12.5s  | 12.56s  |

</details>

---

## Future Work

Even though everything could probably be made `const`, we unfortunately can't. This is because `TypeId::of::<T>()` is not yet `const` (see rust-lang/rust#77125). When it does get stabilized, it would probably be worth coming back and making things `const`. 

Co-authored-by: MrGVSV <49806985 [email protected]>
@yshui
Copy link
Contributor

yshui commented Mar 3, 2023

This would generally serve the use case of "I should probably be using an enum, but there are a lot of variants and it is a pain to write it out".
but there are some use cases where an enum can be more cumbersome such as when dealing with highly generic systems and associating resources with anonymous traits.

I actually am just writing such a system, and that's how I found out about this issue. And given the prospect that structural matching of TypeId could be gone in the future, I ended up using a set of macros to generate a big enum and a set of helper function to make working with it easier.

Not arguing for keeping it, just want to provide a data point.

@benkay86
Copy link

It seems const fn type_id() is going to take a long time to stabilize due to many concerns about soundness and corner cases of people doing something with the returned TypeId other than testing equality against another TypeId. Would it be possible instead to make Any::is<T>(&self) const? This way all the internals of TypeId stay inside the standard library.

@KodrAus
Copy link
Contributor Author

KodrAus commented Apr 12, 2023

I think a const-enabled Any::is that hid the specific equality mechanism would satisfy my use-case for const type ids too.

@eddyb did we have any lingering objection to being able to do this kind of type inspection at all in CTFE, or was it just the issue of structural match on TypeId that was the blocker?

@zakarumych
Copy link
Contributor

I guess that most of the other use-cases would be covered by allowing TypeId::of::<T>() in const context, without allowing to do anything with it but assigning to a constant.

@Pr0methean
Copy link

Pr0methean commented May 27, 2023

Looks like TypeId suddenly stopped implementing StructuralEq. Why? I was previously able to use this function without a problem:

const PIXMAP_TYPE_ID: TypeId = TypeId::of::<Pixmap>();
const MASK_TYPE_ID: TypeId = TypeId::of::<Mask>();
const VEC_U8_TYPE_ID: TypeId = TypeId::of::<Vec<u8>>();

fn type_name_for_id(id: TypeId) -> &'static str {
    match id {
        PIXMAP_TYPE_ID => "Pixmap",
        MASK_TYPE_ID => "Mask",
        VEC_U8_TYPE_ID => "Vec<u8>",
        _ => "Unknown"
    }
}

@madsmtm
Copy link
Contributor

madsmtm commented May 27, 2023

Removing structural match for TypeId was done in #103291 in part to unblock exactly this feature, see the discussion above for reasoning behind the removal (shortened in #77125 (comment)).

@Pr0methean
Copy link

Would it be possible to restore backward compatibility by having the compiler convert a match with TypeId constants into an else if chain when there were only a handful of branches, and a lookup in a const std::collections::HashMap otherwise?

@oli-obk
Copy link
Contributor

oli-obk commented May 28, 2023

Yes, but that will require stabilization of const_trait_impls first, as otherwise we can't call PartialEq methods inside the const fn.

@KodrAus
Copy link
Contributor Author

KodrAus commented May 28, 2023

Most discussion about type ids in const contexts is happening here (naturally, since there's nothing to do with a type id except compare them), but there's also #101871 tracking using == in const contexts.

bmisiak added a commit to bmisiak/matrix-rust-sdk that referenced this issue May 31, 2023
Previous versions of log used an alpha version of
value-bag, which broke on the new nightly rustc:
rust-lang/rust#77125
bmisiak added a commit to bmisiak/matrix-rust-sdk that referenced this issue May 31, 2023
Previous versions of log used an alpha version of
value-bag, which broke on the new nightly rustc:
rust-lang/rust#77125
@briansmith
Copy link
Contributor

I think #10389 needs to be resolved first, becuase depending on the resolution, it may not be possible to calculate the TypeId at compile time; it might end up like C std::type_index that uses the address of the RTTI structure as the ID.

@oli-obk
Copy link
Contributor

oli-obk commented Dec 22, 2023

We can compute and equate type ids that are addresses just fine at compile time. You just can't inspect the address (so no sorting type ids, no hashing them, no printing them)

@tgross35
Copy link
Contributor

tgross35 commented Mar 4, 2024

blocking the stabilization of this on #97156

This was recently marked fixed with #118247

Edit: started a discussion https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/Const.20.60TypeId.3A.3Aof.60.2C.20.60TypeId.3A.3Amatches.60.2C.20.60type_name.60.2C

Edit2: From the discussion, this is still blocked by the possibility of collision #10389

@oli-obk oli-obk removed the S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. label Mar 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-const-fn Area: const fn foo(..) {..}. Pure functions which can be applied at compile time. B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. S-tracking-impl-incomplete Status: The implementation is incomplete. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests