RFC: Allow changing the default allocator #1183
Add support to the compiler to override the default allocator, allowing a different allocator to be used by default in Rust programs. Additionally, switch the default allocator for dynamic libraries and static libraries to the system malloc instead of jemalloc.
```rust
extern {
    fn __rust_allocate(size: usize, align: usize) -> *mut u8;
```
Why are we using magic symbol names instead of annotation-tagged functions a la `#[lang_item="foo"]` or `#[plugin_registrar]`?
Implementation-wise, this is what everything will boil down to (pre-defined symbols), and this is currently the path of least resistance forward. This is all unstable, however, so we'll definitely be able to change it in the future to perhaps using lang items or more official attributes. The current downsides of attributes are:
- During a compilation, there may actually be two loaded allocators in the crate store (but we won't link one of them), so the compiler would detect duplicate lang items and yield an error. Extra logic would have to be added to "not worry about" the allocator lang items.
- None of the signatures are currently typechecked, and having an official attribute makes it feel like it should be typechecked.
Basically I'd love to move to using attributes and such, but I don't see much immediate benefit over just defining some symbols in the short-term. I also don't mind adding some words to this effect in the RFC, though, and we could perhaps spec the "ideal implementation" here where the actual implementation just has some TODOs.
My ideal situation would be to have an attribute-per-function which defines the symbol, visibility, and typechecks the signature. We'd then also have a check that an `#![allocator]` crate contains the necessary functions (tagged with attributes). That's a good deal of attribute-surface-area to start stabilizing right off the bat though.
Are any stable interfaces proposed here? Or are we just changing the way the allocator is automatically picked as far as stable Rust is concerned? I find it hard to tell. I like the general goal, but as I said in the other Core, alloc, and log all have a need to use functionality defined elsewhere, and traits won't cut it, so it would be nice to really think through a language-level way to solve this problem once and for all (something like ML functors on the crate level, probably). If nothing is being stabilized here, great! This is definitely a better situation than what we have currently. If interfaces are being stabilized, then I'd rather wait for a general solution for all three crates.
I'm pretty sure what everyone has wanted in this area for a very long time is trait-based allocator selection a la RFC #39 (which we sadly never got due to some GC-related concerns, and the GC-aware version was such an abomination everyone pretends it never happened). This would allow you to adjust the allocator for not just an entire crate, but for individual objects and in a much cleaner fashion.
allocation functions used by Rust, defined as:

```rust
extern {
```
Must it be C ABI?
I’d rather have something `#[lang]`-ish here as well.
The C ABI is not required, but leaves the door open to allowing external implementations of an allocator in the future (e.g. implementing one in C instead of Rust).
I discussed `#[lang]` above, which may be of interest as well.
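To make the C ABI point concrete, here is a minimal sketch of what an allocation entry point with a C-compatible signature could look like. The names `example_allocate` and `example_deallocate` are hypothetical stand-ins, not the RFC's unstable symbols, and the bodies simply forward to the stable `std::alloc` functions rather than to a real allocator implementation. The `(size, align)` parameters mirror the `__rust_allocate` signature quoted in the diff; the deallocation signature is an assumption made for illustration.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr;

/// Hypothetical allocation entry point using the C calling convention.
/// Returns null for a zero size, an invalid size/align pair, or on failure.
pub extern "C" fn example_allocate(size: usize, align: usize) -> *mut u8 {
    if size == 0 {
        // `std::alloc::alloc` must not be called with a zero-sized layout.
        return ptr::null_mut();
    }
    match Layout::from_size_align(size, align) {
        // Safety: the layout has a non-zero size, as `alloc` requires.
        Ok(layout) => unsafe { alloc(layout) },
        // For example, an alignment that is not a power of two.
        Err(_) => ptr::null_mut(),
    }
}

/// Hypothetical counterpart (assumed signature). The caller must pass the
/// same size and alignment that were used for the allocation.
pub unsafe extern "C" fn example_deallocate(p: *mut u8, size: usize, align: usize) {
    if p.is_null() {
        return;
    }
    if let Ok(layout) = Layout::from_size_align(size, align) {
        // Safety: the caller guarantees `p` came from `example_allocate`
        // with this exact size and alignment.
        unsafe { dealloc(p, layout) };
    }
}
```

Because only integers and raw pointers cross the boundary, a function with the same shape could in principle be written in C and linked in, which is the door the comment above wants to leave open.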
I'm more sympathetic to not stabilizing an allocators interface until we have GC, but it seems pretty harmless to implement something like #39 without stabilizing it, and just use it behind
Currently, no
I see the concept of collection-specific allocators as orthogonal to this RFC, and implementation-wise there basically must be some global symbols which represent the "allocator interface". This RFC is just connecting the dots to allow programs to switch the global allocator, not have a full-blown allocation API (hence the instability of all items proposed here).
From my point of view this all looks quite plausible. Thank you, @alexcrichton.
* `alloc_system` is a crate that will be tagged with `#![allocator]` and will
  redirect allocation requests to the system allocator.
* `alloc_jemalloc` is another allocator crate that will bundle a static copy of
`#![allocator]` instead of allocator would be less confusing (I wasn't sure if it was implied that it would not have the tag).
It might be worth discussing how this RFC solves or improves on the situation described in Reenix: Implementing a Unix-Like Operating System in Rust 3.3 Critical Problem: Allocation.
@gnzlbg this is somewhat orthogonal in the sense that it's not stabilizing an allocator API, nor is it altering the semantics of what to do on a failed allocation. It would only help in terms of switching out which allocator is used by default.
It is possible if an allocator trait is created to only introduce the system allocator (as per this RFC) in std. That would force libcollections to be allocator agnostic :D.
@alexcrichton would it be possible to modify this API to return
I'm writing Rust libraries that I expect to be linked statically with both C and Rust programs. Would there be a way to say "Use malloc if linked with C, and whatever Rust program wants when linked with Rust"? i.e. my library doesn't care about which allocator is used, but doesn't want to impose any allocator on the client.
@pornel
@gnzlbg Sure it could possibly use one of those types eventually, but this RFC isn't stabilizing the signatures of these functions currently, just adding infrastructure to swap them out. @pornel You could manually link to
@alexcrichton Great! 👍
I think that is an unfair characterization on multiple levels. In the second RFC you are referencing (#244), the handling of GC issues certainly had problems, but feeding more type-metadata into a high-level allocator is not an inherently bad idea, IMO. Anyway, trait-based allocator selection is a distinct issue that we are planning to address independently of this RFC. Having a high-level / low-level split in the trait definitions may or may not be necessary, but I suspect it will be the only way to actually placate all of the parties involved.
@Tobba: I suspect you were making a joke, but please keep from describing other people's work in that way.
👍 from me. I'm still in favor of this plan. It does have this "complex" feeling -- but all the "simple" alternatives seem to have real downsides.

That said, I think it is imperative that we be able to link rustc such that it and LLVM use the same allocator. I'm intrigued by your question about whether that is possible -- if it is not possible, why not? What would it take to make it work? It feels like precisely the kind of scenario other people will hit and that we are trying to make seamless, no?

I guess another way to put it is: this unresolved question suggests that there is one rather obvious case we didn't analyze as thoroughly as the others. We know that calling Rust from C makes Rust use the allocator. We know that pure Rust gets to use the builtin jemalloc. But we really want to make sure that C used by Rust will use jemalloc too! And naturally this gets into the static/dynamic linking question, and (for dynamic linking in particular) the differences between platforms, right?

It feels like there ought to be some obvious precedent to follow here! Why don't other big C frameworks have this sort of problem? I guess nobody is in quite our position of wanting to simultaneously function as
@nikomatsakis I expect the more general approach is to just use malloc/free, and let the executable link an unprefixed jemalloc implementation if desired, or let the user set Of course, you don't get any advanced jemalloc functionality this way, unless perhaps you create weak fallbacks for those extra functions.
Ah, I should clarify that I'm not sure how to do this on all platforms. On Linux I believe if we just don't prefix jemalloc then "everything should work out", but I'm less certain how to override the system allocator on OSX and Windows. I think we can coerce the system allocator on OSX to be overridden (and jemalloc may already do this), but I haven't tested any of these use cases.
I agree! This is a very good point. I think one of the problems here is that it's a very platform-specific issue. For example on many unixes you can just use Otherwise some C libraries provide the ability to define an allocator (e.g. via a virtual function call), but that's definitely a library-specific concern.
Right. The goal of the current design was to give us the full advantage of jemalloc when Rust was in charge, and fall back to the system allocator otherwise.
Niko
funnel Rust allocations to the same source as the host application's allocations
then a crate can be written and linked in.

Finally, providers of allocators will simply provide a crate to do so, and then
Can you add text to this section (either in this paragraph or in a separate one) spelling out how a client who wants to provide a wrapper around Rust's default allocator (or otherwise instrument it) would do so?
This use case was alluded to at the end of the motivation section, but I am not 100% clear on how arduous the process will be, in particular whether one will be confident that the allocator one is injecting is truly a wrapper around the allocator that Rust would have selected otherwise (that is, without the injection).
(if the answer is "It is indeed a bit arduous to write such a wrapper robustly, e.g. involving `cfg` switches to select properly between `alloc_system` and `alloc_jemalloc` in the alloc crate one is injecting", that is acceptable. I just want to know up front if that is the expectation.)
(it's also possible that the answer involves somehow observing the values of `lib_allocation_crate` and `exe_allocation_crate` during the compilation of the crate I want to inject, and just assume they will stay the same at the time of the final link where I am being injected? Still wondering out loud; probably should just wait for @alexcrichton to answer...)
Unfortunately this RFC doesn't currently easily allow this sort of instrumentation to happen. If we wanted to support this right out of the gate, this RFC would necessitate four crates:
- Two crates for implementing the allocation API, but not tagged with `#![allocator]`. There'd be one crate for jemalloc and one for the system.
- Two crates for linking to the previous crates, but are tagged with `#![allocator]` and redirect the formal allocation API into the desired crate.

In a nutshell, if you want to write an allocator which can be instrumented or shimmed, then you need to write a crate which is not tagged `#![allocator]` but probably still exposes the allocation API via normal Rust functions. The provider of the allocator would then write their own shims that redirect to the allocator desired after the instrumentation has happened.
Does that make sense? If so I'll add some words.
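The layering described above could be sketched roughly as follows. This is not the RFC's mechanism: `#![allocator]` is unstable, so the sketch uses the stable `std::alloc::System` as a stand-in for an untagged system-allocator crate, and every name in it (`system_allocate`, `counting_allocate`, `LIVE_BYTES`, and so on) is hypothetical. A real setup would add a third, tagged crate that forwards the formal allocation symbols to the `counting_*` functions.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Layer 1 (hypothetical): what an untagged allocator crate might expose,
// i.e. the allocation API as normal Rust functions.
// Safety: `layout` must have a non-zero size.
pub unsafe fn system_allocate(layout: Layout) -> *mut u8 {
    unsafe { System.alloc(layout) }
}

// Safety: `ptr` must have been returned by `system_allocate` with `layout`.
pub unsafe fn system_deallocate(ptr: *mut u8, layout: Layout) {
    unsafe { System.dealloc(ptr, layout) }
}

// Layer 2 (hypothetical): instrumentation wrapped around layer 1.
// Tracks the number of bytes currently allocated through this shim.
pub static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);

// Safety: same contract as `system_allocate`.
pub unsafe fn counting_allocate(layout: Layout) -> *mut u8 {
    let ptr = unsafe { system_allocate(layout) };
    if !ptr.is_null() {
        // Only count allocations that actually succeeded.
        LIVE_BYTES.fetch_add(layout.size(), Ordering::SeqCst);
    }
    ptr
}

// Safety: same contract as `system_deallocate`.
pub unsafe fn counting_deallocate(ptr: *mut u8, layout: Layout) {
    unsafe { system_deallocate(ptr, layout) };
    LIVE_BYTES.fetch_sub(layout.size(), Ordering::SeqCst);
}
```

Keeping the real allocator untagged is what makes the wrapper trustworthy: the instrumentation calls a known implementation directly instead of guessing which allocator the compiler would have injected.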
hmm I missed this response back when it was written.
I guess I would have liked for some more concrete details in the RFC regarding use cases like this, i.e. spelling out what the steps are for the expected uses of this RFC, and then also including little sketches like the one in your comment for unexpected use cases.
Anyway I plan to have a shot at playing around with the PR rust-lang/rust#27400 since I am finding myself needing to do some allocation debugging. Perhaps it will inspire me to write an amendment for the RFC with such notes.
This commit is an implementation of [RFC 1183][rfc] which allows swapping out the default allocator on nightly Rust. No new stable surface area should be added as a part of this commit.

[rfc]: rust-lang/rfcs#1183

Two new attributes have been added to the compiler:

* `#![needs_allocator]` - this is used by liballoc (and likely only liballoc) to indicate that it requires an allocator crate to be in scope.
* `#![allocator]` - this is an indicator that the crate is an allocator which can satisfy the `needs_allocator` attribute above.

The ABI of the allocator crate is defined to be a set of symbols that implement the standard Rust allocation/deallocation functions. The symbols are not currently checked for exhaustiveness or typechecked. There are also a number of restrictions on these crates:

* An allocator crate cannot transitively depend on a crate that is flagged as needing an allocator (e.g. allocator crates can't depend on liballoc).
* There can only be one explicitly linked allocator in a final image.
* If no allocator is explicitly requested one will be injected on behalf of the compiler. Binaries and Rust dylibs will use jemalloc by default where available and staticlibs/other dylibs will use the system allocator by default.

Two allocators are provided by the distribution by default, `alloc_system` and `alloc_jemalloc`, which operate as advertised.

Closes rust-lang#27389
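The default-selection rule in the commit message can be restated as a small decision function. This is only an illustrative model of the prose above, not compiler code: the `Output` and `Allocator` enums, the `select_allocator` function, and the `jemalloc_available` flag are all hypothetical names introduced for the sketch.

```rust
/// Hypothetical model of the kind of artifact being produced.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Output {
    Executable,
    RustDylib,
    StaticLib,
    OtherDylib,
}

/// Hypothetical model of the two allocators shipped with the distribution.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Allocator {
    Jemalloc,
    System,
}

/// Models the rule from the commit message: an explicitly requested
/// allocator always wins; otherwise one is injected based on the output.
pub fn select_allocator(
    explicit: Option<Allocator>,
    output: Output,
    jemalloc_available: bool,
) -> Allocator {
    if let Some(requested) = explicit {
        return requested;
    }
    match output {
        // Binaries and Rust dylibs default to jemalloc where available.
        Output::Executable | Output::RustDylib if jemalloc_available => Allocator::Jemalloc,
        // Staticlibs, other dylibs, and targets without jemalloc use the system allocator.
        _ => Allocator::System,
    }
}
```

For instance, `select_allocator(None, Output::StaticLib, true)` yields `Allocator::System`, matching the case where Rust code is embedded in a host application that already uses the system malloc.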
It would be splendid if the RFC described the semantics of the various Some of these things can be derived from exploring the built-in crates of Rust, but it'd be much nicer for people who have to implement custom allocators to have the function semantics written down somewhere.
@froydnj this RFC actually intentionally left out the specifications for each symbol (because they're all unstable), and the exact semantics/requirements may change over time (depending on how allocators shake out). So in that sense I don't believe these have been highly scrutinized in terms of solidifying what the semantics should be vs what they do now. Essentially the only "stable implementations" of a custom allocator are You can learn more about what we currently require, however, from reading the
@alexcrichton thanks for the explanation! It seems quite odd to introduce an interface that's stable (that's my understanding of the Rust RFC process, anyway), but then to not define interface semantics because the interface is subject to change over time. I see after a more careful reading that the RFC does call this out, though. I guess at some point these interfaces will be stabilized and then their API will be documented? The
It's not stable.
The acceptance of an RFC is only the first step on the road to stability. The implementation of an RFC will almost always land unstable, and can still change after that point, until it is formally stabilized.
@froydnj yeah as mentioned by @Ericson2314 and @sfackler most of this RFC isn't actually stable. The only stable feature is that dylibs/staticlibs use the system allocator whereas executables use jemalloc. Beyond that everything is unstable and feature gated. Now that being said, if you guys need any help about clarifications of the current implementation or find it falls short, please let me know as I'd love to help out or help tweak the design :)
Thanks for the clarifications! I feel enlightened. :)