Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider deprecation of UB-happy static mut #53639

Open
eddyb opened this issue Aug 23, 2018 · 197 comments
Open

Consider deprecation of UB-happy static mut #53639

eddyb opened this issue Aug 23, 2018 · 197 comments
Labels
A-maybe-future-edition Something we may consider for a future edition. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such needs-rfc This change is large or controversial enough that it should have an RFC accepted before doing it. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@eddyb
Copy link
Member

eddyb commented Aug 23, 2018

static mut is almost impossible to use correctly, see rust-lang-nursery/lazy-static.rs#117 for an example in the widely-used lazy-static.

You must be able to show that every borrow of the static mut is not reentrant (as opposed to regular interior mutability, which only requires reentrance-freedom when accessing data), which is almost entirely impossible in real-world scenarios.


We have a chance at removing it from Rust2018 and force people to use a proper synchronization abstraction (e.g. lazy_static! Mutex), or in lieu of one, thread_local! / scoped_thread_local!.

If they were using static mut with custom synchronization logic, they should do this:

pub struct CustomSynchronizingAbstraction<T> {
    /* UnsafeCell / Cell / RefCell / etc. around data, e.g. `T` */
}
// Promise that proper synchronization exists *around accesses*.
unsafe impl<T: Sync> Sync for CustomSynchronizingAbstraction<T> {}

And then use CustomSynchronizingAbstraction with regular statics, safely.

This matches the "soundness boundary" of Rust APIs, whereas static mut is more like C.

cc @RalfJung @rust-lang/compiler @rust-lang/lang


2023-08 Note from triage, many years later: there's now https://doc.rust-lang.org/1.71.0/std/cell/struct.SyncUnsafeCell.html in the library, added by #95438, which can be used instead of static mut. But static mut is not even soft-deprecated currently.

@eddyb eddyb added I-nominated T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Aug 23, 2018
@Centril
Copy link
Contributor

Centril commented Aug 23, 2018

I don't know at all whether we should do this or not atm. But to make this decision I have an...

...Idea: We should do a crater run to scrutinize how many legitimate usages there are and how many ones there are that have UB.

@nikomatsakis
Copy link
Contributor

cc @rust-lang/wg-unsafe-code-guidelines -- we have this group just for this sort of thing =)

@alercah
Copy link
Contributor

alercah commented Aug 23, 2018

I'm 100% behind getting rid of static mut. Any attempt to use it which isn't UB is replaceable with something else which is safe, likely a Mutex or something thread local. Or, per the original comment, a custom alternate type implementing Sync directly.

@joshtriplett
Copy link
Member

We should clearly document the undefined behavior, but I don't think we should remove the ability to have (unsafe) global mutable locations directly, without any layer of indirection. That would make it more difficult to build certain types of lock-free synchronization, for instance.

@eddyb
Copy link
Member Author

eddyb commented Aug 23, 2018

@joshtriplett But you can always use unsafe impl Sync static?

@alercah
Copy link
Contributor

alercah commented Aug 23, 2018

Can you give an example of what you mean? You can't do much more with static mut than you could with a static UnsafeCell.

@RalfJung
Copy link
Member

But you can always use unsafe impl Sync static?

Yeah, and that's just as unsafe. So TBH I do not see the point.

@eddyb
Copy link
Member Author

eddyb commented Aug 23, 2018

@RalfJung No, it's not, and I've explained why. With unsafe impl Sync, you only have to prove the data accesses correct, but with static mut, the references themselves can conflict.

EDIT: Now if "references exist" cannot be UB, ever, then this is not a problem, but IIUC, they do.
Also, you can't use a static mut from safe code, only a static. If the static mut is used correctly, why spread unsafety when you can wrap it in a sound API.

@alercah
Copy link
Contributor

alercah commented Aug 23, 2018

It's not actually different from an UnsafeCell if we replace each &var with unsafe {&*var.get()}, right? Then the aliasing rules would be the same; the difference is that with the UnsafeCell you can still keep references to the cell itself?

@RalfJung
Copy link
Member

@eddyb Ah okay I see -- that has nothing to with with unsafe impl tough and everything with &UnsafeCell.

I agree &UnsafeCell is safer than static mut.

@eddyb
Copy link
Member Author

eddyb commented Aug 23, 2018

@RalfJung Okay I should make it clearer that the unsafe impl is for "custom synchronization abstraction". I'll go edit the issue description.

@RalfJung
Copy link
Member

If we could weaken static mut to only give you raw pointers, that could help...

... tough people would probably still turn them into references ASAP.

@japaric
Copy link
Member

japaric commented Aug 23, 2018

If they were using static mut with custom synchronization logic, they should do this:

How would they instantiate their custom type in stable Rust? User cons fns are unstable and even if we do stabilize min_const_fn that doesn't include anything that has bounds so even with that your example won't compile on stable.

To me this sounds like it would reduce what you can do in the 2018 edition. In the 2015 edition you can create static mut variables that contain primitive types but in the 2018 edition you can not do the equivalent unless I'm missing something like adding a RacyCell type with const constructor to core. (Here I'm assuming that it's not certain whether min_const_fn will make it into the edition release).

@eddyb
Copy link
Member Author

eddyb commented Aug 23, 2018

@japaric Okay, that's a good point. I removed the bound from the struct definition, does that help? If you use it with a T: !Sync type you just can't put it in a static, but it should be fine otherwise.

Note that you can use an associated constant if you don't need arguments in your constructor (e.g. lazy-static already uses Lazy::<T>::INIT for this pattern).
Otherwise, yes, the min_const_fn requirement is a problem.

Depending on what you're doing you might be able to make your fields public, but as was in the case of lazy-static, that's not always the case.

cc @Centril @oli-obk Do we want to force people to replace their "custom synchronization abstractions" involving static muts with ones that require const fn?

In the 2015 edition you can create static mut variables that contain primitive types

What kinds of types? Atomics with relaxed ordering should probably be preferred either way even in single-threaded code, unless you have some sort of proof that interrupts are disabled.
(I'm assuming that you just can't trust anything to be single-threaded in user-space, in a library, and in kernel-space / embedded you have interrupts to worry about, wrt reentrance-safety)

like adding a RacyCell type with const constructor to core

Hmm, RacyCell<PubliclyConstructableType> would be okay, I think, you'd just have to implement methods on it (so you'd wrap it in another publicly-constructible-type, or use an extension trait).

Given pub struct Abstraction(pub RacyCell<AbstractionData>);, I think that &'static Abstraction is significantly safer than static mut but only if it gives you a *mut AbstractionData, not a &mut.

I wish the min_const_fn idea popped up sooner and we stabilized it already 😢.
Another 3 years of randomly finding unsound static mut uses doesn't sound fun.

@eddyb
Copy link
Member Author

eddyb commented Aug 23, 2018

@rfcbot poll @rust-lang/libs @rust-lang/lang Should we add RacyUnsafeCell?

#[repr(transparent)]
pub struct RacyUnsafeCell<T>(UnsafeCell<T>);

unsafe impl<T: Sync> Sync for RacyUnsafeCell<T> {}

impl<T> RacyUnsafeCell<T> {
    pub const fn new(x: T) -> Self {
        RacyUnsafeCell(UnsafeCell::new(x))
    }
    pub fn get(&self) -> *mut T {
        self.0.get()
    }
}

@rfcbot
Copy link

rfcbot commented Aug 23, 2018

Team member @eddyb has asked teams: T-lang, T-libs, for consensus on:

Should we add RacyUnsafeCell?

@eddyb
Copy link
Member Author

eddyb commented Aug 23, 2018

@RalfJung What we want is &'static Abstraction, I think, and *mut only in methods of it.

That is, the user of the abstraction should be outside of the "unsafe zone" and be able to use only safe code to interact with the abstraction, otherwise it's not really a good abstraction.

@japaric
Copy link
Member

japaric commented Aug 23, 2018

@eddyb

In embedded we use static mut variables with interrupt handlers. These
handlers are invoked by the hardware and non-reentrant by hardware design (while
executing an interrupt handler the same handler will not be invoked again if
the interrupt signal is received. Also the devices are single core). So we have a
pattern like to let users add state to their interrupt handlers:

// we generate this function using macros
#[export_name = "SOME_KNOWN_NAME_TO_HAVE_THE_LINKER_PUT_THIS_IN_THE_RIGHT_PLACE"]
pub unsafe extern "C" fn interrupt_handler_that_has_some_random_name_unknown_to_the_user() {
    static mut STATE: usize = 0; // initial value provided by the user

    on_button_pressed(&mut STATE);
}

// the user provides this function along with the initial value for the state
fn on_button_pressed(count: &mut usize) {
    *count  = 1;
    println!("Button has been pressed {} times", *count);
}

This way the end user indirectly uses static mut variables in a non-reentrant
fashion. The user can't call the interrupt handler themselves because the name
of the function is an implementation detail and if they call on_button_pressed
themselves there's still no problem because that won't operate on the hidden
static mut variable.

As there are no newtypes involved the user can use primitives and types
the define themselves (e.g. structs with pub(crate) fields) without requiring
the unstable const fn feature.

I'm not against this feature as long as the above pattern continues to work on stable Rust 2018.

@scottmcm
Copy link
Member

If there's an API that @japaric and libs are happy with, I'm happy to remove static mut.

@nikomatsakis
Copy link
Contributor

Personally, I think we should just deprecate static mut across the board, but not remove it from Rust 2018. We don't gain much by making it a hard removal, really -- we don't want to "make space" for some other meaning.

@eddyb
Copy link
Member Author

eddyb commented Aug 23, 2018

@japaric Ah, if you're doing code generation, you can have your own version of RacyUnsafeCell with a public field and no const fn constructor, would that work for you?
UnsafeCell being !Sync isn't necessary, but it does avoid forgetting to impl !Sync.


There's a more powerful pattern I prefer using in those situations:

// Lifetime-invariant ZST token. Probably doesn't even need the invariance.
#[repr(transparent)]
#[derive(Copy, Clone)]
pub struct InterruptsDisabled<'a>(PhantomData<&'a mut &'a ()>);

// Public field type to get around the lack of stable `const fn`.
pub struct InterruptLock<T: ?Sized   Send>(pub UnsafeCell<T>);

// AFAIK this should behave like Mutex.
unsafe impl<T: ?Sized   Send> Sync for InterruptLock<T> {}

impl<T: ?Sized   Send> InterruptLock<T> {
    // This gives access to the `T` that's `Send` but maybe not `!Sync`,
    // for *at most* the duration that the interrupts are disabled for.
    // Note: I wanted to implement `Index` but that can't constrain lifetimes.
    fn get<'a>(&'a self, _: InterruptsDisabled<'a>) -> &'a T {
        unsafe { &*self.0.get() }
    }
}

// Note that you bake any of these into `InterruptLock` without `const fn`,
// because you would need a private field to ensure nobody can touch it.
// OTOH, `UnsafeCell<T>` makes *only constructing* the field public.
pub type InterruptCell<T: ?Sized   Send> = InterruptLock<Cell<T>>;
pub type InterruptRefCell<T: ?Sized   Send> = InterruptLock<RefCell<T>>;
#[export_name = "SOME_KNOWN_NAME_TO_HAVE_THE_LINKER_PUT_THIS_IN_THE_RIGHT_PLACE"]
pub unsafe extern "C" fn interrupt_handler_that_has_some_random_name_unknown_to_the_user(
    token: InterruptsDisabled,
) {
    on_button_pressed(token);
}

static COUNT: InterruptCell<usize> = InterruptLock(UnsafeCell::new(
    Cell::new(0),
));
fn on_button_pressed(token: InterruptsDisabled) {
    let count = COUNT.get(token);
    count.set(count.get()   1);
    println!("Button has been pressed {} times", count.get());
}

There's another variation on this, where you make the token non-Copy and can do:

#[export_name = "SOME_KNOWN_NAME_TO_HAVE_THE_LINKER_PUT_THIS_IN_THE_RIGHT_PLACE"]
pub unsafe extern "C" fn interrupt_handler_that_has_some_random_name_unknown_to_the_user(
    token: InterruptsDisabled,
) {
    on_button_pressed(token);
}

static COUNT: InterruptLock<usize> = InterruptLock(UnsafeCell::new(
    0,
));
fn on_button_pressed(mut token: InterruptsDisabled) {
    // This mutable borrow of `token` prevents using it for
    // accessing any `InterruptLock`s, including `COUNT` itself.
    // The previous version was like `&token` from this one.
    let count = COUNT.get_mut(&mut token);
    *count  = 1;
    println!("Button has been pressed {} times", *count);
}

(I didn't write the implementation details for the second version for brevity's sake, if desired I can edit this later)

You can even allow creating a InterruptsDisabled out of a &mut InterruptsDisabled, effectively sub-borrowing it (assuming you want to keep it ZST so it stays zero-cost).
Then you can combine the two versions by allowing the creation of the first version from a &InterruptsDisabled from the second version (but, again, only to keep it ZST).

Theoretically you can even create a InterruptsDisabled using HRTB and calling a closure after disabling interrupts, but this only works for the first version, not the second, stronger, one.
Also, IIRC, on some platforms it's impossible to truly turn all interrupts off.


IMO a safe abstraction like this is a good long-term investment, since it can handle any sort of Send !Sync data that's accessed only from within interrupts.

cc @RalfJung Has anything like this been proven safe?

@Nemo157
Copy link
Member

Nemo157 commented Aug 23, 2018

@eddyb note that the non-reentrancy that @japaric describes is for that particular interrupt handler (and any lower priority, but that’s hard to do anything useful with). Interrupts are not globally disabled during an interrupt handler on architectures like ARMv6-M. It might be possible to use a more complicated token scheme where each interrupt has its own token type, but I don’t know if anyone has looked into some way to make that actually usable.

@eddyb
Copy link
Member Author

eddyb commented Aug 23, 2018

@Nemo157 Ah, that's a very interesting detail! Yeah you can just have one token type per level and use From impls to connect them up or something (and the methods in InterruptLock<Level, T> would take impl Into<InterruptsDisabled<Level, 'a>> instead).


Another idea that I came up with while discussing with @arielb1 on IRC:

We could only allow private static mut. So you can keep it e.g. in a module, or within a block inside a fn, const, or another static, etc. - but not export it out of there.

That, combined with deprecation of private static mut, could improve the situation.

@shepmaster
Copy link
Member

In embedded we use static mut variables with interrupt handlers. These
handlers are invoked by the hardware and non-reentrant by hardware design

IIRC, AVR actually allows for recursive interrupt handlers:

#define ISR_NOBLOCK
ISR runs with global interrupts initially enabled. The interrupt enable flag is activated by the compiler as early as possible within the ISR to ensure minimal processing delay for nested interrupts.

This may be used to create nested ISRs, however care should be taken to avoid stack overflows, or to avoid infinitely entering the ISR for those cases where the AVR hardware does not clear the respective interrupt flag before entering the ISR.

Use this attribute in the attributes parameter of the ISR macro.

It's up to the programmer to choose (and thus appropriately handle) this case.

@whitequark
Copy link
Member

IIRC, AVR actually allows for recursive interrupt handlers:

On every architecture I'm aware of, you can explicitly set the interrupt flag after entering a handler; this is not AVR-specific. (On architectures with interrupt prioritization, like Cortex-M, you'll also need to manipulate priorities.) But the point here is that this needs to be done explicitly, with unsafe code; the architecture guarantees that you wouldn't reenter the ISR if you don't specifically request that.

So this is perfectly in line with the usual safety rules.

@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 24, 2018

Some C APIs expose mutable globals that my Rust code right now access via:

extern "C" {
    pub static mut FOO: BAR;
}

Some C APIs require their users to define mutable globals, that the C library then accesses. Right now my Rust code interfaces with those using:

#[export(name = "foo")]
pub static mut FOO: BAR = ..initializer..;

How do I access mutable globals from C libraries, and how do I provide mutable globals to C libraries, without static mut ?

An example of a C library that uses both is jemalloc.

@alexreg
Copy link
Contributor

alexreg commented Oct 24, 2018

I don’t like this. It’s already unsafe. That’s good enough.

cblichmann added a commit to cblichmann/cloud-hypervisor that referenced this issue Sep 14, 2023
See discussion in rust-lang/rust#53639.
This came up during an internal review.
cblichmann added a commit to cblichmann/cloud-hypervisor that referenced this issue Sep 14, 2023
See discussion in rust-lang/rust#53639.
This came up during an internal review.
cblichmann added a commit to cblichmann/cloud-hypervisor that referenced this issue Sep 14, 2023
See discussion in rust-lang/rust#53639.
This came up during an internal review.

Signed-off-by: Christian Blichmann <[email protected]>
rbradford pushed a commit to cloud-hypervisor/cloud-hypervisor that referenced this issue Sep 14, 2023
See discussion in rust-lang/rust#53639.
This came up during an internal review.

Signed-off-by: Christian Blichmann <[email protected]>
@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 15, 2023
const interning: decide about mutability purely based on the kind of interning, not the types we see

r? `@oli-obk` this is what I meant on Zulip. For now I left the type visitor in the code; removing it and switching to a simple interning loop will mean we accept code that we currently reject, such as this
```rust
const CONST_RAW: *const Vec<i32> = &Vec::new() as *const _;
```
I see no reason for us to reject such code, but accepting it should go through t-lang FCP, so I want to do that in a follow-up PR.

This PR does change behavior in the following situations:
1. Shared references inside `static mut` are no longer put in read-only memory. This affects for instance `static mut FOO: &i32 = &0;`. We never *promised* that this would be read-only, and `static mut` is [an anti-pattern anyway](rust-lang#53639), so I think this is fine. If you want read-only memory, write this as `static INNER: i32 = 0; static mut FOO: &i32 = &INNER;`.
2. Potentially, mutable things in a `static` are now marked read-only. That would be a problem. But I am not sure if that can happen? The code mentions `static FOO: *const AtomicUsize = &AtomicUsize::new(42)`, but that is rejected for being non-`Sync`. [This variant](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=112e930ae1b3ef285812ab404ca296fa) also gets rejected, and same for [this one](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0dac8d173a2b3099b9c2854fdad7a87c). I think we should reject all cases where a `static` introduces mutable state, except for the outermost allocation itself which can have interior mutability (and which is the one allocation where we have fully reliable type information).

What I still want to do in this PR before it is ready for review it is ensure we detect situations where `&mut` or `&UnsafeCell` points to immutable allocations. That should detect if we have any instance of case (2). That check should be part of the regular type validity check though, not part of interning.
bors added a commit to rust-lang-ci/rust that referenced this issue Oct 17, 2023
const interning: decide about mutability purely based on the kind of interning, not the types we see

r? `@oli-obk` this is what I meant on Zulip. For now I left the type visitor in the code; removing it and switching to a simple interning loop will mean we accept code that we currently reject, such as this
```rust
const CONST_RAW: *const Vec<i32> = &Vec::new() as *const _;
```
I see no reason for us to reject such code, but accepting it should go through t-lang FCP, so I want to do that in a follow-up PR.

This PR does change behavior in the following situations:
1. Shared references inside `static mut` are no longer put in read-only memory. This affects for instance `static mut FOO: &i32 = &0;`. We never *promised* that this would be read-only, and `static mut` is [an anti-pattern anyway](rust-lang#53639), so I think this is fine. If you want read-only memory, write this as `static INNER: i32 = 0; static mut FOO: &i32 = &INNER;`.
2. Potentially, mutable things in a `static` are now marked read-only. That would be a problem. But I am not sure if that can happen? The code mentions `static FOO: *const AtomicUsize = &AtomicUsize::new(42)`, but that is rejected for being non-`Sync`. [This variant](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=112e930ae1b3ef285812ab404ca296fa) also gets rejected, and same for [this one](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0dac8d173a2b3099b9c2854fdad7a87c). I think we should reject all cases where a `static` introduces mutable state, except for the outermost allocation itself which can have interior mutability (and which is the one allocation where we have fully reliable type information).

What I still want to do in this PR before it is ready for review it is ensure we detect situations where `&mut` or `&UnsafeCell` points to immutable allocations. That should detect if we have any instance of case (2). That check should be part of the regular type validity check though, not part of interning.
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 17, 2023
const interning: decide about mutability purely based on the kind of interning, not the types we see

r? `@oli-obk` this is what I meant on Zulip. For now I left the type visitor in the code; removing it and switching to a simple interning loop will mean we accept code that we currently reject, such as this
```rust
const CONST_RAW: *const Vec<i32> = &Vec::new() as *const _;
```
I see no reason for us to reject such code, but accepting it should go through t-lang FCP, so I want to do that in a follow-up PR.

This PR does change behavior in the following situations:
1. Shared references inside `static mut` are no longer put in read-only memory. This affects for instance `static mut FOO: &i32 = &0;`. We never *promised* that this would be read-only, and `static mut` is [an anti-pattern anyway](rust-lang#53639), so I think this is fine. If you want read-only memory, write this as `static INNER: i32 = 0; static mut FOO: &i32 = &INNER;`.
2. Potentially, mutable things in a `static` are now marked read-only. That would be a problem. But I am not sure if that can happen? The code mentions `static FOO: *const AtomicUsize = &AtomicUsize::new(42)`, but that is rejected for being non-`Sync`. [This variant](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=112e930ae1b3ef285812ab404ca296fa) also gets rejected, and same for [this one](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0dac8d173a2b3099b9c2854fdad7a87c). I think we should reject all cases where a `static` introduces mutable state, except for the outermost allocation itself which can have interior mutability (and which is the one allocation where we have fully reliable type information).

What I still want to do in this PR before it is ready for review it is ensure we detect situations where `&mut` or `&UnsafeCell` points to immutable allocations. That should detect if we have any instance of case (2). That check should be part of the regular type validity check though, not part of interning.
likebreath pushed a commit to likebreath/cloud-hypervisor that referenced this issue Jan 9, 2024
See discussion in rust-lang/rust#53639.
This came up during an internal review.

Signed-off-by: Christian Blichmann <[email protected]>
(cherry picked from commit 88f3537)
Signed-off-by: Bo Chen <[email protected]>
likebreath pushed a commit to cloud-hypervisor/cloud-hypervisor that referenced this issue Jan 19, 2024
See discussion in rust-lang/rust#53639.
This came up during an internal review.

Signed-off-by: Christian Blichmann <[email protected]>
(cherry picked from commit 88f3537)
Signed-off-by: Bo Chen <[email protected]>
ArhanChaudhary added a commit to ArhanChaudhary/NAND that referenced this issue Feb 21, 2024
@CAD97
Copy link
Contributor

CAD97 commented Apr 10, 2024

Interesting minor detail I wasn't aware of prior — static mut doesn't have the Sync bound that static does (because it doesn't provide safe access to &T). If you never use the capability to take &mut references to static mut, using static mut is an interesting alternative to using SyncUnsafeCell around private statics which aren't "properly" Sync and still require external cooperation around synchronization.

However, I still agree that deprecating static mut is the preferable approach.

Although, silly (probably bad) idea: instead of a separate SyncUnsafeCell type, we could privilege specifically static _: UnsafeCell<_> to not require Sync under the knowledge that SyncUnsafeCell is sound. The alleged benefit would be avoiding the duplicate type in std API and having to reach people when it's more appropriate to use one or the other. Consider also whether it'd be appropriate to have a ptr::SyncNonNull type, since pointers are in the same camp of being !Sync as a lint rather than fundamentally.

@uazu
Copy link

uazu commented Jun 2, 2024

Yes, the only concern I have about dropping static mut in my code is that I have to lie to the compiler that my types are Sync. Actually they are non-Sync but are protected by a Thread-ID check. But static won't accept it unless it is Sync. Is this a simple check, or are there further complications that may be caused by lying to the compiler about this?

(The thread-ID check has to be separate to the static containing the non-Sync types to avoid having a simultaneous &mut and & in two different threads. Code is here.)

@Skgland
Copy link
Contributor

Skgland commented Jun 2, 2024

@⁠uazu
Would something like ThreadBound<T> here work.
I believe it is correct for ThreadBound<T> to be Sync even if T isn't, as the thread boundedness is either verified or discharged via unsafe to the caller.

Idea being that ThreadBound<T> lifts the tread-bound property from a runtime property to be encoded in the type system.

@jgarvin
Copy link

jgarvin commented Jun 2, 2024

@Skgland I think the safety of that implementation depends on whether or not thread IDs can be reused, on most platforms they can. There's discussion on #67939 about whether to have stdlib guarantee it

@Skgland
Copy link
Contributor

Skgland commented Jun 2, 2024

@Skgland I think the safety of that implementation depends on whether or not thread IDs can be reused, on most platforms they can. There's discussion on #67939 about whether to have stdlib guarantee it

Excerpt from the docs of std::thread::ThreadId

ThreadIds are guaranteed not to be reused, even when a thread terminates.

For the Sync implementation to be fine it should be sufficient that for a given ThreadId only one thread has that id at any given time, i.e. reuse only happens after the thread with that id has terminated.
Though the claim about the same thread that constructed it, would need to be weakened to a thread with the same thread id as the thread that constructed it.

@uazu
Copy link

uazu commented Jun 3, 2024

Would something like ThreadBound<T> here work.

I think there may be issues with ThreadBound. If I call get_mut the same time as another thread calls get, then you have a &mut at the same time as a &, which is not allowed. That is why I had to make the THREAD_ID a separate static to the globals, so that for the check it would be a & and a & in the two threads. Also I think the OnceLock must add a cost on each access, which I'd prefer to avoid.

I'm trying to do things as efficiently as you'd do it in C, i.e. just use globals but promising not to use them from any other thread, but formalizing that in Rust so there is a genuine guarantee that all Rust's rules are kept. The cost is one small check once, and then you can keep cloning some zero-size zero-cost type, so effectively the result of that check is only known to the compiler and doesn't appear in the final executable. In the machine code it looks just like it would in C.

@bjorn3
Copy link
Member

bjorn3 commented Jun 3, 2024

If I call get_mut the same time as another thread calls get

That is already UB at the moment you create the &mut ThreadBound<T> reference, nothing to do with the get_mut impl. This is an issue for single-threaded contexts too, so putting the ThreadId in a separate static doesn't help anything. Putting a RefCell inside the ThreadBound if you want to be able to mutate the value and then exclusively using ThreadBound::get does solve this issue however at the cost of a runtime check, but a runtime check is unavoidable anyway unless you encapsulate the accesses to the ThreadBound and ensure the access methods are not reentrant.

@uazu
Copy link

uazu commented Jun 3, 2024

This is an issue for single-threaded contexts too, so putting the ThreadId in a separate static doesn't help anything.

The ThreadBound code provided in the example requires you to get a &mut before you do the check if you want mutable access. So using that API is never going to be sound. The check has to be done with & references or else you can't do the check in two threads at once. A RefCell has runtime cost, which I'd prefer to avoid.

In my code I am not using ThreadBound. I have a Once to make sure that the &mut is only created once and nothing else can have access at that time. After that point I only use & to access THREAD_ID. The THREAD_ID check appears after the Once::call_once is accessed. So this is all sound. It helps because after that check I want ZERO-COST access to the globals, permanently. No RefCell, no OnceLock, nothing. Reentrancy is avoided by the wrapper only allowing a few simple operations.

The problem is that all these globals are effectively private globals for a wrapper type (DeferrerAux). That is the type that makes everything sound. Each individual piece if viewed in isolation is not necessarily sound. So I can't put Sync on it honestly. Actually Sync is not even relevant. My question is whether adding Sync untruthfully is going to have implications on the complicated and subtle reasoning going on inside the compiler.

@bjorn3
Copy link
Member

bjorn3 commented Jun 3, 2024

How about putting an UnsafeCell in the ThreadBound? The ThreadBound ensures that only a single thread accesses it and the UnsafeCell allows getting mutable access again despite ThreadBound::get() only returning a shared reference. A lot of different types like RefCell, Cell, Mutex, ... wrap UnsafeCell to provide safe interior mutability through different means. Your use case seems to be almost like Cell, except that Cell is not flexible enough to allow pushing to the queue. You can store the &'static UnsafeCell<Globals> inside DeferrerAux to avoid duplicating the thread id check on every call site.

@Skgland
Copy link
Contributor

Skgland commented Jun 3, 2024

Keeping DeferrerAux a zero-sized type https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=17f2b339af5798c37c327af4d95efb9d is the best I could come up with, though I am not sure if its better or worse.

@uazu
Copy link

uazu commented Jun 6, 2024

Keeping DeferrerAux a zero-sized type ...

It seems an awkward solution, i.e. an UnsafeCell containing the thread-ID and another UnsafeCell, but with a backdoor that allows the check to be bypassed if the wrapper "knows" that the check has already been done. I agree that it meets the requirements, though, and looks like it won't break any rules. If the decision is made to go ahead and deprecate static mut, then that's what I'll implement. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-maybe-future-edition Something we may consider for a future edition. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such needs-rfc This change is large or controversial enough that it should have an RFC accepted before doing it. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
Status: Rejected/Not lang
Development

No branches or pull requests