Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How should const generics with references work around pointer identity and padding? #120961

Open
RalfJung opened this issue Feb 12, 2024 · 15 comments · Fixed by #127722
Open

How should const generics with references work around pointer identity and padding? #120961

RalfJung opened this issue Feb 12, 2024 · 15 comments · Fixed by #127722
Labels
A-const-eval Area: constant evaluation (mir interpretation) F-unsized_const_params `#![feature(unsized_const_params)]` T-opsem Relevant to the opsem team

Comments

@RalfJung
Copy link
Member

RalfJung commented Feb 12, 2024

Const generics currently completely deconstruct the value into a valtree and then reconstruct it, which destroys pointer identity. As long as this is just about purely immutable const values we might be able to argue that they "don't have pointer identity" (though it is entirely unclear to me what exactly that should mean). But if consts can point to statics, that doesn't really work any more.

Example:

#![feature(const_refs_to_static)]
#![feature(adt_const_params)]

static FOO: usize = 42;
const BAR: &usize = &FOO;
fn foo<const X: &'static usize>() {
    if std::ptr::eq(X, &FOO) {
        println!("activating special mode");
    }
}

fn main() {
    foo::<BAR>();
}

Arguably this should print, but currently it does not.

Cc #95174, #119618

Similarly, this example demonstrates that padding is lost when passing an &(u8, u16) as a const generic, which is not how passing a reference usually behaves:

#![feature(const_refs_to_static)]
#![feature(adt_const_params)]

static FOO: u32 = 0x01020304;
const BAR: &(u8, u16) = unsafe { std::mem::transmute(&FOO) };

fn foo<const X: &'static (u8, u16)>() {
    let val: &u32 = unsafe { std::mem::transmute(X) };
    println!("{val:#x}"); // prints 0x1020004
}

fn main() {
    foo::<BAR>();
}
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Feb 12, 2024
@RalfJung RalfJung added the F-adt_const_params `#![feature(adt_const_params)]` label Feb 12, 2024
@fmease fmease added A-const-eval Area: constant evaluation (mir interpretation) T-opsem Relevant to the opsem team and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Feb 12, 2024
@oli-obk
Copy link
Contributor

oli-obk commented Feb 13, 2024

We could encode relocations to statics as a separate valtree variant, but I'm not sure how we can handle offsets into statics properly.

I also am not sure how weird things get if we do make references to statics ignore what's behind the reference and use the identity of the static instead.

@RalfJung
Copy link
Member Author

I also am not sure how weird things get if we do make references to statics ignore what's behind the reference and use the identity of the static instead.

Depends on what you call "weird". ;)
Are the following two constants "equal" when used as generics or not?

static S1: i32 = &42;
static S2: i32 = &42;

const C1: &i32 = &S1;
const C2: &i32 = &S2;

C1 and C2 would compare equal with ==, but inequal with ptr::eq.

@cjgillot
Copy link
Contributor

We could encode relocations to statics as a separate valtree variant, but I'm not sure how we can handle offsets into statics properly.

The variant would have to be some kind of Static { def_id, offset }. I don't really see the issue.

I also am not sure how weird things get if we do make references to statics ignore what's behind the reference and use the identity of the static instead.

That would be more consistent with the distinction between const (values), and static (places).

@cjgillot
Copy link
Contributor

The variant would have to be some kind of Static { def_id, offset }. I don't really see the issue.

Actually, stuff gets a bit weirder with promoted from statics. Those would have to be handled like statics, adding a promoted field.

@RalfJung
Copy link
Member Author

That would be more consistent with the distinction between const (values), and static (places).

Sure, but it would mean that ==-equal values could still be considered "different" for const generic purposes. This is something we specifically avoided when it comes to e.g. padding in structs.

@jbms

This comment was marked as off-topic.

@RalfJung

This comment was marked as off-topic.

@jbms

This comment was marked as off-topic.

@RalfJung

This comment was marked as off-topic.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 14, 2024

FWIW I think this is not just about pointer identity, it is in general about all aspects of the const that are lost in valtree conversion. So besides pointer identity this also affects padding, where passing a const as a const generic will reset its padding -- which makes sense for the by-value part of the const (copying a value does not copy the padding), but is somewhat odd for the part of the const that lives behind a reference (where usually padding is guaranteed to be preserved).

Here is an example of that.

So I am in general not convinced that passing constants involving references as generics is coherent.

(A similar example could be constructed showing that provenance is lost in the conversion, if an &usize points to memory that stores a pointer.)

@RalfJung RalfJung changed the title How should pointer identity of references in const generics work? How should const generics with references work around pointer identity and padding? Jun 14, 2024
@BoxyUwU
Copy link
Member

BoxyUwU commented Jun 26, 2024

The fact that &usize erases provenance in the pointee seems like a pretty big point actually. For padding and refs_to_statics we could conceivably track this in ValTree and have it enter into the type system and affect monomorphization and name mangling etc. I'm not sure how we would even do that for provenance since we would have to pick an encoding with a notion of equality that will not change ever or else it could force types to become unequal...

If we were to go down the path of "const generics should not re-serialize data behind a references" then it seems like this is effectively representing references as pointing to arbitrary bytes. As far as I know this is accurate to how references are actually thought about by the opsem people (@RalfJung is that right?).

Currently we forbid MaybeUninit and unions as the types of const parameters, however if we start representing references as just "pointing to arbitrary bytes" then we ought to support unions too otherwise references are effectively just a backdoor to allow using MaybeUninit<Foo> via &'static [u8; N]?

So all in all I guess we have the options of:

  • Forbid references in const generics and continue to not support "arbitrary bytes"
  • Allow "arbitrary bytes" to be represented by const generics (and then use that for the pointee of references)
  • Error on the ConstValue -> ValTree conversion for const generics when it would lose information resulting in any code that would exercise the "weirdness" around borrows to be invalid code. This is forwards compatible with potentially deciding that we do want to do this re-serialization of the pointee. It also should not rule out any possibilities for equality of type level constants.

@RalfJung do you have thoughts on whether it is even physically possible for us to be able to track provenance in ValTree in a way that the equality of provenance of a byte is visible in the type system, without it causing backwards compatibility issues.

Regardless of that I think what I would like to do here is the following:

  • Forbid references in const parameter types
  • stabilize adt_const_params without references
  • If we really want to support unsized types in const parameter types and the best way we can think of to do this is via references, then error on the ConstValue -> ValTree conversion when encountering "tricky" references (e.g. pointees with initialized padding, refs to statics)
  • stabilize references in const parameters

@BoxyUwU
Copy link
Member

BoxyUwU commented Jun 26, 2024

It's also worth mentioning that if we support references in const parameters' types then we need some way of mangling them.

@RalfJung
Copy link
Member Author

The main concern isn't really about provenance, it's about the answer to ptr::eq calls. Provenance could also come in but so far the examples we discussed are not relying on provenance. (Things are a bit confusing because we use the "provenance" type of the interpreter to track the AllocId and in const-eval this actually affects ptr::eq answers, but conceptually that is a const-eval implementation detail.)

If we were to go down the path of "const generics should not re-serialize data behind a references" then it seems like this is effectively representing references as pointing to arbitrary bytes. As far as I know this is accurate to how references are actually thought about by the opsem people (@RalfJung is that right?).

Yeah, in the opsem a reference is the same as a raw pointer -- it is an (absolute) address and provenance.

Currently we forbid MaybeUninit and unions as the types of const parameters, however if we start representing references as just "pointing to arbitrary bytes" then we ought to support unions too otherwise references are effectively just a backdoor to allow using MaybeUninit via &'static [u8; N]?

Yeah, sounds right.

Allow "arbitrary bytes" to be represented by const generics (and then use that for the pointee of references)

Even representing the "arbitrary bytes" of the pointee of a reference does not suffice, since it can still change what ptr::eq does. The example in the issue description needs the identity of the reference to be preserved -- its AllocId. However, AllocId is not a stable identifier, it exists only within an allocation session, so it is not really suited for type system use. The actually stable property is "AllocId up to an arbitrary bijection", but it is hard to have a canonical encoding for that. For the special case of references pointing to static, we do have a stable identifier -- the DefId of the static. But for everything else, I am not sure what to do.

Regardless of that I think what I would like to do here is the following:

Yes that sounds good to me.
The one decision we have to make is whether we are okay with padding being erased in the by-value part of a const generic. In my eyes this makes sense since padding is also erased when passing an argument by-value to a function the normal way.

@BoxyUwU
Copy link
Member

BoxyUwU commented Jul 21, 2024

I didn't meant to auto close this... still an issue for feature(unsized_const_params)

@BoxyUwU BoxyUwU reopened this Jul 21, 2024
@BoxyUwU BoxyUwU added F-unsized_const_params `#![feature(unsized_const_params)]` and removed F-adt_const_params `#![feature(adt_const_params)]` labels Jul 21, 2024
@BoxyUwU
Copy link
Member

BoxyUwU commented Aug 30, 2024

Interesting quirk:

#![feature(adt_const_params, unsized_const_params, const_refs_to_static)]
#![allow(incomplete_features)]

use std::marker::{ConstParamTy_, UnsizedConstParamTy};

static FOO: &'static Foo = unsafe { const { &std::mem::transmute::<[u8; 4], Foo>([1, 0, 1, 0]) } };
static BAR: &'static Foo = unsafe { const { &std::mem::transmute::<[u8; 4], Foo>([1, 1, 1, 0]) } };

#[repr(C)]
#[derive(Debug, UnsizedConstParamTy, Eq, PartialEq)]
struct Foo(u8, u16);

impl ConstParamTy_ for Foo {}

fn erase<const S1: &'static Foo, const S2: &'static Foo>() {
    assert!(std::ptr::eq(S1, S2));
}

fn main() {
    assert!("{}", !std::ptr::eq(FOO, BAR));
    erase::<FOO, BAR>();
}

This currently runs just fine, erasing padding means that the Valtree's for S1 and S2 are equal, they then get the same value generated for them.

On a related note the following currently runs fine even though maybe it shouldn't since we aren't trying to track addresses of references:

#![feature(adt_const_params, unsized_const_params, const_refs_to_static)]
#![allow(incomplete_features)]

static FOO: &'static u32 = &10;

fn erase<const S1: &'static u32, const S2: &'static u32>() {
    assert!(std::ptr::eq(S1, S2));
}

fn main() {
    erase::<FOO, FOO>();
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-const-eval Area: constant evaluation (mir interpretation) F-unsized_const_params `#![feature(unsized_const_params)]` T-opsem Relevant to the opsem team
Projects
None yet
7 participants