RFC: Allow changing the default allocator #1183

alexcrichton · 2015-06-30T18:37:17Z

Add support to the compiler to override the default allocator, allowing a
different allocator to be used by default in Rust programs. Additionally, also
switch the default allocator for dynamic libraries and static libraries to using
the system malloc instead of jemalloc.

rendered


sfackler · 2015-06-30T18:39:54Z

text/0000-swap-out-jemalloc.md

+
+```rust
+extern {
+    fn __rust_allocate(size: usize, align: usize) -> *mut u8;


Why are we using magic symbol names instead of annotation-tagged functions a la #[lang_item="foo"] or #[plugin_registrar]?

Implementation-wise, this is what everything will boil down to (pre-defined symbols), and this is currently the path of least resistance forward. This is all unstable, however, so we'll definitely be able to change it in the future to perhaps using lang items or more official attributes. The current downsides of attributes are:

During a compilation, there may actually be two loaded allocators in the crate store (but we won't link one of them), so the compiler would detect duplicate lang items and yield an error. Extra logic would have to be added to "not worry about" the allocator lang items.

None of the signatures are currently typechecked, and having an official attribute makes it feel like it should be typechecked.

Basically I'd love to move to using attributes and such, but I don't see much immediate benefit over just defining some symbols in the short-term. I also don't mind adding some words to this effect in the RFC, though, and we could perhaps spec the "ideal implementation" here where the actual implementation just has some TODOs.

My ideal situation would be to have an attribute-per-function which defines the symbol, visibility, and typechecks the signature. We'd then also have a check that an #![allocator] crate contains the necessary functions (tagged with attributes). That's a good deal of attribute-surface-area to start stabilizing right off the bat though.

Ericson2314 · 2015-07-01T07:35:47Z

Are any stable interfaces proposed here? Or are we just changing the way the allocator is automatically picked as far as stable rust is concerned? I find it hard to tell.

I like the general goal, but as I said in the other Core, alloc, and log all have a need to use functionality defined elsewhere, and traits won't cut it, so it would be nice to really think through a language-level way to solve this problem once and for all (something like ML functors on the crate level, probably).

If nothing is being stabilized here, great! This is definitely a better situation than what we have currently. If interfaces are being stabilized, then I would rather wait for a general solution for all three crates.

Tobba · 2015-07-01T08:14:14Z

I'm pretty sure what everyone has wanted in this area for a very long time is trait-based allocator selection a la RFC #39 (which we sadly never got due to some GC-related concerns, and the GC-aware version was such an abomination everyone pretends it never happened). This would allow you to adjust the allocator for not just an entire crate, but for individual objects and in a much cleaner fashion.

nagisa · 2015-07-01T13:07:16Z

text/0000-swap-out-jemalloc.md

+allocation functions used by Rust, defined as:
+
+```rust
+extern {


Must it be C ABI?

I’d rather have something #[lang]-ish here as well.

The C ABI is not required, but leaves the door open to allowing external implementations of an allocator in the future (e.g. implementing one in C instead of Rust).
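To make the C ABI point concrete, here is a minimal sketch of what a pair of allocation entry points with the RFC's `(size, align)` shape could look like when written in Rust. The names `demo_allocate` and `demo_deallocate` are hypothetical stand-ins, not the RFC's actual symbols, and forwarding to `std::alloc::System` as well as returning null for zero-sized or invalid requests are assumptions of this sketch, not semantics the RFC specifies.

```rust
use std::alloc::{GlobalAlloc, Layout, System};

/// Hypothetical C-ABI allocation entry point (name is illustrative only).
/// Returns a null pointer for zero-sized or invalid requests; that policy
/// is a choice made by this sketch.
pub extern "C" fn demo_allocate(size: usize, align: usize) -> *mut u8 {
    match Layout::from_size_align(size, align) {
        Ok(layout) if layout.size() > 0 => unsafe { System.alloc(layout) },
        _ => std::ptr::null_mut(),
    }
}

/// Hypothetical counterpart. `size` and `align` must match the values that
/// were passed to `demo_allocate` for this pointer.
pub unsafe extern "C" fn demo_deallocate(ptr: *mut u8, size: usize, align: usize) {
    if ptr.is_null() {
        return;
    }
    unsafe {
        let layout = Layout::from_size_align_unchecked(size, align);
        System.dealloc(ptr, layout);
    }
}

fn main() {
    let p = demo_allocate(64, 8);
    assert!(!p.is_null());
    // The returned block honours the requested alignment.
    assert_eq!(p as usize % 8, 0);
    unsafe {
        p.write(42);
        assert_eq!(p.read(), 42);
        demo_deallocate(p, 64, 8);
    }
    // An alignment that is not a power of two is an invalid layout.
    assert!(demo_allocate(16, 3).is_null());
}
```

Because the functions use the C calling convention and only primitive types, the same signatures could in principle be provided by code written in C, which is the door the reply above wants to leave open.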

I discussed #[lang] above which may be of interest as well.

Ericson2314 · 2015-07-01T15:16:52Z

I'm more sympathetic to not stabilizing an allocators interface until we have GC, but it seems pretty harmless to implement something like #39 without stabilizing it, and just use it behind std.

alexcrichton · 2015-07-01T18:59:53Z

@Ericson2314

> Are any stable interfaces proposed here?

Currently, no

@Tobba

> I'm pretty sure what everyone has wanted in this area for a very long time is trait-based allocator selection a la RFC #39

I see the concept of collection-specific allocators as orthogonal to this RFC, and implementation-wise there basically must be some global symbols which represent the "allocator interface". This RFC is just connecting the dots to allow programs to switch the global allocator, not have a full-blown allocation API (hence the instability of all items proposed here)

nnethercote · 2015-07-02T10:02:39Z

From my point of view this all looks quite plausible. Thank you, @alexcrichton.

emberian · 2015-07-06T00:49:28Z

text/0000-swap-out-jemalloc.md

+
+* `alloc_system` is a crate that will be tagged with `#![allocator]` and will
+  redirect allocation requests to the system allocator.
+* `alloc_jemalloc` is another allocator crate that will bundle a static copy of


#![allocator] instead of allocator would be less confusing (I wasn't sure if it was implied that it would not have the tag)

gnzlbg · 2015-07-06T22:49:23Z

It might be worth discussing how this RFC solves or improves on the situation described in Reenix: Implementing a Unix-Like Operating System in Rust 3.3 Critical Problem: Allocation.

alexcrichton · 2015-07-07T00:07:30Z

@gnzlbg this is somewhat orthogonal in the sense that it's not stabilizing an allocator API, nor is it altering the semantics of what to do on a failed allocation. It would only help in terms of switching out which allocator is used by default.

Ericson2314 · 2015-07-07T01:56:56Z

It is possible if an allocator trait is created to only introduce the system allocator (as per this RFC) in std. That would force libcollections to be allocator agnostic :D.

gnzlbg · 2015-07-07T08:10:07Z

> @gnzlbg this is somewhat orthogonal in the sense that it's not stabilizing an allocator API, nor is it altering the semantics of what to do on a failed allocation.

@alexcrichton would it be possible to modify this API to return Result or Option ?
A particular allocator can still then panic, but this might allow writing a wrapper over the system's allocator that does not panic but just returns None in case allocation failed. Of course this would be the subject of a different RFC, but it would be nice to know if this can be added without too much trouble in the future.
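As a rough illustration of the kind of wrapper described here, the following sketch reports allocation failure as `None` instead of panicking or aborting. It is built on today's stable `std::alloc::System`, not on the unstable symbols this RFC proposes, and the names `try_allocate` and `release` are hypothetical.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::ptr::NonNull;

/// Hypothetical fallible wrapper over the system allocator.
/// Returns `None` for invalid layouts, zero-sized requests, and
/// allocation failure, leaving the policy decision to the caller.
pub fn try_allocate(size: usize, align: usize) -> Option<NonNull<u8>> {
    let layout = Layout::from_size_align(size, align).ok()?;
    if layout.size() == 0 {
        return None;
    }
    // `System.alloc` returns null on failure; `NonNull::new` maps that to `None`.
    NonNull::new(unsafe { System.alloc(layout) })
}

/// Releases a block obtained from `try_allocate` with the same size and align.
pub unsafe fn release(ptr: NonNull<u8>, size: usize, align: usize) {
    unsafe {
        let layout = Layout::from_size_align_unchecked(size, align);
        System.dealloc(ptr.as_ptr(), layout);
    }
}

fn main() {
    match try_allocate(32, 8) {
        Some(ptr) => unsafe { release(ptr, 32, 8) },
        None => eprintln!("allocation failed, caller decides what to do"),
    }
    // A request too large to describe as a layout is reported, not panicked on.
    assert!(try_allocate(usize::MAX, 8).is_none());
}
```

A caller that prefers the panicking behaviour can still get it by calling `expect` on the returned `Option`, so the fallible form does not take anything away.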

kornelski · 2015-07-07T13:36:58Z

I'm writing Rust libraries that I expect to be linked statically with both C and Rust programs.

Would there be a way to say "Use malloc if linked with C, and whatever Rust program wants when linked with Rust"?

i.e. my library doesn't care about which allocator is used, but doesn't want to impose any allocator on the client.

retep998 · 2015-07-07T22:08:02Z

@pornel malloc might not be the allocator that the C code is using. If your library needs to be compatible with allocations coming from an external location, then it would probably be best to provide your own API to consumers of your library to set allocator callbacks.
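A minimal sketch of the callback approach suggested here, assuming a library that accepts a pair of C-ABI function pointers from its consumer. Every name below (`AllocCallbacks`, `ByteBuffer`, `system_callbacks`) is hypothetical, and the default callbacks forward to `std::alloc::System` purely for demonstration; a C host would pass its own functions instead.

```rust
use std::alloc::{GlobalAlloc, Layout, System};

/// Hypothetical table of allocator callbacks supplied by the library's consumer.
#[derive(Clone, Copy)]
pub struct AllocCallbacks {
    pub alloc: unsafe extern "C" fn(usize, usize) -> *mut u8,
    pub free: unsafe extern "C" fn(*mut u8, usize, usize),
}

unsafe extern "C" fn system_alloc(size: usize, align: usize) -> *mut u8 {
    match Layout::from_size_align(size, align) {
        Ok(layout) if layout.size() > 0 => unsafe { System.alloc(layout) },
        _ => std::ptr::null_mut(),
    }
}

unsafe extern "C" fn system_free(ptr: *mut u8, size: usize, align: usize) {
    if ptr.is_null() {
        return;
    }
    if let Ok(layout) = Layout::from_size_align(size, align) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

/// Default callbacks for this sketch, backed by the system allocator.
pub fn system_callbacks() -> AllocCallbacks {
    AllocCallbacks { alloc: system_alloc, free: system_free }
}

/// A library type whose memory comes only from the supplied callbacks.
pub struct ByteBuffer {
    ptr: *mut u8,
    len: usize,
    callbacks: AllocCallbacks,
}

impl ByteBuffer {
    /// Allocates `len` zeroed bytes, or returns `None` if the callback fails.
    pub fn new(len: usize, callbacks: AllocCallbacks) -> Option<ByteBuffer> {
        let ptr = unsafe { (callbacks.alloc)(len, 1) };
        if ptr.is_null() {
            return None;
        }
        unsafe { std::ptr::write_bytes(ptr, 0, len) };
        Some(ByteBuffer { ptr, len, callbacks })
    }

    pub fn as_slice(&self) -> &[u8] {
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }
}

impl Drop for ByteBuffer {
    fn drop(&mut self) {
        // Memory goes back through the same callbacks that produced it.
        unsafe { (self.callbacks.free)(self.ptr, self.len, 1) }
    }
}

fn main() {
    let buf = ByteBuffer::new(16, system_callbacks()).expect("allocation failed");
    assert_eq!(buf.as_slice().len(), 16);
    assert!(buf.as_slice().iter().all(|&b| b == 0));
}
```

Storing the callbacks next to each allocation keeps allocation and deallocation paired even if a host swaps callbacks later, at the cost of two extra pointers per buffer.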

alexcrichton · 2015-07-07T23:29:35Z

@gnzlbg Sure it could possibly use one of those types eventually, but this RFC isn't stabilizing the signatures of these functions currently, just adding infrastructure to swap them out.

@pornel You could manually link to alloc_jemalloc or alloc_system and then toggle between the two with a --cfg, but you probably wouldn't actually need to do anything in practice. To link into C, you need to build a staticlib at some point with the Rust code at which point the system allocator will be linked in. To link into Rust you follow all the normal standard paths and get the default allocator.

kornelski · 2015-07-08T00:11:21Z

@alexcrichton Great! 👍

pnkfelix · 2015-07-08T16:58:00Z

@Tobba

> trait-based allocator selection a la RFC #39 (which we sadly never got due to some GC-related concerns, and the GC-aware version was such an abomination everyone pretends it never happened)

I think that is an unfair characterization on multiple levels.

In the second RFC you are referencing (#244), the handling of GC issues certainly had problems, but feeding more type-metadata into a high-level allocator is not an inherently bad idea, IMO.

Anyway, trait-based allocator selection is a distinct issue that we are planning to address independently of this RFC.

Having a high-level / low-level split in the trait definitions may or may not be necessary, but I suspect it will be the only way to actually placate all of the parties involved.

erickt · 2015-07-08T17:22:11Z

@Tobba: I suspect you were making a joke, but please keep from describing other people's work in that way.

nikomatsakis · 2015-07-08T21:51:12Z

👍 from me. I'm still in favor of this plan. It does have this "complex" feeling -- but all the "simple" alternatives seem to have real downsides. That said, I think it is imperative that we be able to link rustc such that it and LLVM use the same allocator. I'm intrigued by your question about whether that is possible -- if it is not possible, why not? What would it take to make it work? It feels like precisely the kind of scenario other people will hit and that we are trying to make seamless, no?

I guess another way to put it is: this unresolved question suggests that there is one rather obvious case we didn't analyze as thoroughly as the others. We know that calling Rust from C makes Rust use the system allocator. We know that pure Rust gets to use the builtin jemalloc. But we really want to make sure that C used by Rust will use jemalloc too! And naturally this gets into the static/dynamic linking question, and (for dynamic linking in particular) the differences between platforms, right?

It feels like there ought to be some obvious precedent to follow here! Why don't other big C frameworks have this sort of problem? I guess nobody is in quite our position of wanting to simultaneously function as main and as a callee, and do the best thing in both cases?

cuviper · 2015-07-08T22:13:30Z

@nikomatsakis I expect the more general approach is to just use malloc/free, and let the executable link an unprefixed jemalloc implementation if desired, or let the user set LD_PRELOAD=libjemalloc.so.

Of course, you don't get any advanced jemalloc functionality this way, unless perhaps you create weak fallbacks for those extra functions.

alexcrichton · 2015-07-08T22:36:50Z

> That said, I think it is imperative that we be able to link rustc such that it and LLVM use the same allocator. I'm intrigued by your question about whether that is possible -- if it is not possible, why not?

Ah I should clarify in that I'm not sure how to do this on all platforms. On linux I believe if we just don't prefix jemalloc then "everything should work out", but I'm less certain how to override the system allocator on OSX and Windows. I think we can coerce the system allocator on OSX to be overridden (and jemalloc may already do this), but I haven't tested any of these use cases.

> But we really want to make sure that C used by Rust will use jemalloc too!

I agree! This is a very good point. I think one of the problems here is that it's a very platform-specific issue. For example on many unixes you can just use LD_PRELOAD to load in something or perhaps even just override the default allocator via malloc and free. On OSX and Windows, however, I'm less sure that it's possible to do this in a robust fashion.

Otherwise some C libraries provide the ability to define an allocator (e.g. via a virtual function call), but that's definitely a library-specific concern.

nikomatsakis · 2015-07-08T23:13:43Z

Right. The goal of the current design was to give us the full advantage of jemalloc when rust was in charge, and fallback to system allocator otherwise.


pnkfelix · 2015-07-09T09:44:59Z

text/0000-swap-out-jemalloc.md

+funnel Rust allocations to the same source as the host application's allocations
+then a crate can be written and linked in.
+
+Finally, providers of allocators will simply provide a crate to do so, and then


Can you add text to this section (either in this paragraph or in a separate one) spelling out how a client who wants to provide a wrapper around Rust's default allocator (or otherwise instrument it) would do so?

This use case was alluded to, at the end of the motivation section, but I am not 100% clear on how arduous the process will be, in particular whether one will be confident that the allocator one is injecting is truly a wrapper around the allocator that Rust would have selected otherwise (that is, without the injection)

(If the answer is "It is indeed a bit arduous to write such a wrapper robustly, e.g. involving cfg switches to select properly between alloc_system and alloc_jemalloc in the alloc crate one is injecting", that is acceptable. I just want to know up front if that is the expectation.)

(It's also possible that the answer involves somehow observing the values of lib_allocation_crate and exe_allocation_crate during the compilation of the crate I want to inject, and just assuming they will stay the same at the time of the final link where I am being injected? Still wondering out loud; probably should just wait for @alexcrichton to answer...)

Unfortunately this RFC doesn't currently easily allow this sort of instrumentation to happen. If we wanted to support this right out of the gate, this RFC would necessitate four crates:

Two crates for implementing the allocation API, but not tagged with #![allocator]. There'd be one crate for jemalloc and one for the system.

Two crates for linking to the previous crates, but are tagged with #![allocator] and redirect the formal allocation API into the desired crate.

In a nutshell, if you want to write an allocator which can be instrumented, or shimmed then you need to write a crate which is not tagged #![allocator] but probably still exposes the allocation API via normal Rust functions. The provider of the allocator would then write their own shims that redirect to the allocator desired after the instrumentation has happened.

Does that make sense? If so I'll add some words.
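The layering described above can be sketched in ordinary Rust. In this illustration, `inner_allocate` and `inner_deallocate` stand in for the un-tagged crate that exposes the allocation API as normal functions, and `counting_allocate` and `counting_deallocate` stand in for the instrumenting shim that a tagged crate would forward to. All of these names are hypothetical, the counters are just one example of instrumentation, and the backing store is `std::alloc::System` by assumption.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Layer 1: stand-in for an allocator crate that is NOT tagged `#![allocator]`
// and simply exposes the allocation API as plain Rust functions.
pub fn inner_allocate(size: usize, align: usize) -> *mut u8 {
    match Layout::from_size_align(size, align) {
        Ok(layout) if layout.size() > 0 => unsafe { System.alloc(layout) },
        _ => std::ptr::null_mut(),
    }
}

pub unsafe fn inner_deallocate(ptr: *mut u8, size: usize, align: usize) {
    if !ptr.is_null() {
        unsafe { System.dealloc(ptr, Layout::from_size_align_unchecked(size, align)) }
    }
}

// Layer 2: stand-in for the instrumenting shim that wraps layer 1.
static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);
static TOTAL_CALLS: AtomicUsize = AtomicUsize::new(0);

pub fn counting_allocate(size: usize, align: usize) -> *mut u8 {
    TOTAL_CALLS.fetch_add(1, Ordering::Relaxed);
    let ptr = inner_allocate(size, align);
    if !ptr.is_null() {
        LIVE_BYTES.fetch_add(size, Ordering::Relaxed);
    }
    ptr
}

pub unsafe fn counting_deallocate(ptr: *mut u8, size: usize, align: usize) {
    if !ptr.is_null() {
        LIVE_BYTES.fetch_sub(size, Ordering::Relaxed);
    }
    unsafe { inner_deallocate(ptr, size, align) }
}

pub fn live_bytes() -> usize {
    LIVE_BYTES.load(Ordering::Relaxed)
}

pub fn total_calls() -> usize {
    TOTAL_CALLS.load(Ordering::Relaxed)
}

fn main() {
    let before = live_bytes();
    let p = counting_allocate(128, 16);
    assert!(!p.is_null());
    assert_eq!(live_bytes(), before + 128);
    unsafe { counting_deallocate(p, 128, 16) };
    assert_eq!(live_bytes(), before);
    println!("allocation calls observed: {}", total_calls());
}
```

The point of the split is that the shim never needs to know which allocator sits underneath; swapping `inner_allocate` for a different backend leaves the instrumentation untouched.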

hmm I missed this response back when it was written.

I guess I would have liked for some more concrete details in the RFC regarding use cases like this, i.e. spelling out what the steps are for the expected uses of this RFC, and then also including little sketches like the one in your comment for unexpected use cases.

Anyway I plan to have a shot at playing around with the PR rust-lang/rust#27400 since I am finding myself needing to do some allocation debugging. Perhaps it will inspire me to write an amendment for the RFC with such notes.

This commit is an implementation of [RFC 1183][rfc] which allows swapping out the default allocator on nightly Rust. No new stable surface area should be added as a part of this commit.

[rfc]: rust-lang/rfcs#1183

Two new attributes have been added to the compiler:

* `#![needs_allocator]` - this is used by liballoc (and likely only liballoc) to indicate that it requires an allocator crate to be in scope.
* `#![allocator]` - this is an indicator that the crate is an allocator which can satisfy the `needs_allocator` attribute above.

The ABI of the allocator crate is defined to be a set of symbols that implement the standard Rust allocation/deallocation functions. The symbols are not currently checked for exhaustiveness or typechecked.

There are also a number of restrictions on these crates:

* An allocator crate cannot transitively depend on a crate that is flagged as needing an allocator (e.g. allocator crates can't depend on liballoc).
* There can only be one explicitly linked allocator in a final image.
* If no allocator is explicitly requested one will be injected on behalf of the compiler. Binaries and Rust dylibs will use jemalloc by default where available and staticlibs/other dylibs will use the system allocator by default.

Two allocators are provided by the distribution by default, `alloc_system` and `alloc_jemalloc` which operate as advertised.

Closes rust-lang#27389


froydnj · 2016-04-15T16:27:49Z

It would be splendid if the RFC described the semantics of the various __rust_* functions that an allocator crate must implement. While the functions straightforwardly map onto malloc et al, the failure modes could be quite different. For instance, does __rust_allocate panic on failure to allocate, or does it simply return a null pointer? Can any of these functions be called with a zero size? What does __rust_reallocate do around the edge cases of realloc (see this comment in Firefox, for instance)?

Some of these things can be derived from exploring the built-in crates of Rust, but it'd be much nicer for people who have to implement custom allocators to have the function semantics written down somewhere.

alexcrichton · 2016-04-15T17:08:03Z

@froydnj this RFC actually intentionally left out the specifications for each symbol (because they're all unstable), and the exact semantics/requirements may change over time (depending on how allocators shake out). So in that sense I don't believe these have been highly scrutinized in terms of solidifying what the semantics should be vs what they do now. Essentially the only "stable implementations" of a custom allocator are alloc_jemalloc and alloc_system as they're what we maintain.

You can learn more about what we currently require, however, from reading the heap.rs documentation for each wrapper function. Does that help out for now?

froydnj · 2016-04-15T17:53:52Z

@alexcrichton thanks for the explanation! It seems quite odd to introduce an interface that's stable (that's my understanding of the Rust RFC process, anyway), but then to not define interface semantics because the interface is subject to change over time. I see after a more careful reading that the RFC does call this out, though. I guess at some point these interfaces will be stabilized and then their API will be documented?

The heap.rs comments are helpful, thanks for pointing them out!

Ericson2314 · 2016-04-15T18:01:29Z

It's not stable.

sfackler · 2016-04-15T18:01:48Z

The acceptance of an RFC is only the first step on the road to stability. The implementation of an RFC will almost always land unstable, and can still change after that point, until it is formally stabilized.

alexcrichton · 2016-04-15T18:23:38Z

@froydnj yeah as mentioned by @Ericson2314 and @sfackler most of this RFC isn't actually stable. The only stable feature is that dylibs/staticlibs use the system allocator whereas executables use jemalloc. Beyond that everything is unstable and feature gated.

Now that being said, if you guys need any help about clarifications of the current implementation or find it falls short, please let me know as I'd love to help out or help tweak the design :)

froydnj · 2016-04-18T15:03:04Z

Thanks for the clarifications! I feel enlightened. :)

sfackler reviewed Jun 30, 2015

alexcrichton added T-libs-api Relevant to the library API team, which will review and decide on the RFC. T-lang Relevant to the language team, which will review and decide on the RFC. labels Jul 1, 2015

nagisa reviewed Jul 1, 2015

alexcrichton self-assigned this Jul 2, 2015

emberian reviewed Jul 6, 2015

pnkfelix reviewed Jul 9, 2015

aturon mentioned this pull request May 10, 2016

Tracking issue for alloc_system/alloc_jemalloc rust-lang/rust#33082

Closed

Centril added A-allocation Proposals relating to allocation. A-attributes Proposals relating to attributes labels Nov 23, 2018
