move malloc from Solo5 to libOS layer #223

djwillia · 2017-12-07T14:50:52Z

Solo5 currently contains a malloc implementation. In the MirageOS case, it is used by the libOS via ocaml-freestanding. In the IncludeOS case, they have their own malloc, which gets memory from the Solo5 malloc.

I think malloc should be the responsibility of the libOS layer. This has the advantage of removing duplicate functionality (in the IncludeOS case). Also, it removes the malloc family of calls from the solo5 interface (solo5.h) and bringing it closer to alignment with the underlying isolation interface (e.g., the set of ukvm hypercalls).

I've prepared a PR for Solo5 and for ocaml-freestanding, which applies to the MirageOS case. @ricarkol I would like your opinion on whether this will be suitable for IncludeOS as well.

ricarkol · 2017-12-07T16:09:08Z

Thanks for the change @djwillia . It makes things easier on the IncludeOS side. We are currently using the IncludeOS malloc() which uses all of the solo5 heap.

mato · 2017-12-08T14:31:54Z

@ricarkol Can you point me to the relevant code in the IncludeOS repository that initialises the heap when running on Solo5?

ricarkol · 2017-12-08T14:44:09Z

@mato here it is:

https://github.com/hioa-cs/IncludeOS/blob/master/src/platform/x86_solo5/kernel_start.cpp#L72

mem_size = (uintptr_t)get_cpu_ebp();
...
kernel_start()
    OS::start(cmdline, mem_size)
        OS::memory_end_ = mem_size;
        OS::heap_max_ = OS::memory_end_;
        // OS::heap_begin is really reinterpret_cast<uintptr_t>(&_end);

ricarkol · 2017-12-08T14:59:12Z

As you can see, using ebp as the heap_end won't work on the seccomp case.

mato · 2018-02-14T15:36:47Z

I've finally found some time to think about the implications of this change in depth, both from a design and implementation (WIP in #224, mirage/ocaml-solo5#29) point of view.

To set the stage: I agree that a high-level memory allocator (as defined by the malloc() family of functions) need not be part of the Solo5 API, and we gain the following benefits by removing it:

the application is free to manage memory as it sees fit in a way that is tailored to its requirements. This will be an advantage especially in the case of managed language runtimes,
the Solo5 API and implementation both shrink in size, which is good.

However: I think that Solo5 should retain some control over the low-level memory layout and allocation semantics (i.e. how malloc() gets its memory at runtime). Rationale:

we should keep the ability to implement a stack guard, and abstract away the low-level details of the stack/heap split from the application,
we should preserve the ability to hand back memory freed by the application to the host, even if we don't do that today,
if we end up implementing ASLR, or doing other tricks with page tables in Solo5 or a monitor/tender (see Renaming ukvm to a more generic name #172), we will need to keep control over the memory layout,
the same applies if we wish to implement shared memory communication channels, hardware device pass-through or other techniques, where the low-level abstractions to set this up need to be done at the Solo5 API layer.

To summarize: By all means, let's remove malloc(), but let's keep the "cut point" at the low-level runtime interface for dynamic memory allocation, which should stay in Solo5. If we don't do that then we will shoot ourselves in the foot and be sorry later.

Now, the question is, what should this low-level interface look like? I can think of two options:

something resembling UNIX mmap() or Win32 VirtualAlloc(),
something resembling UNIX sbrk().

In the case of (1), given that a) we don't directly expose the notion of "virtual memory" to the application and don't plan to do so in the future, b) we are presenting the abstract concept of a single heap to the application, an mmap()-style interface seems far too over-engineered and would include a bunch of complex housekeeping to deal with multiple memory regions in Solo5.

That leaves the 1970s-style sbrk(), or a more modern version of the same concept, i.e. "extend the heap" and "shrink the heap".

The main limitation of a sbrk() style interface is that it requires the heap to be contiguous in memory. I don't think this is an issue for us since the current direction with Solo5 is to perform all external resource allocation (which may include mapping "other" memory regions) before passing control to the guest, leaving it with an entirely static system.

Unless anyone has strong objections to this, I'm going to re-work #224 and mirage/ocaml-solo5#29 along these lines and see what I get. /cc @djwillia @ricarkol

djwillia · 2018-02-14T19:02:00Z

I think we are in agreement that Solo5 should retain some control over
the low-level memory layout. But I believe we already have control to
do most of your points, without introducing an additional allocation
interface to the Solo5 API.

The only of your points that seems to need an allocation routine seem
like they are associated with dynamic behavior, mostly your second
point: handing back memory to the host.

Up to this point, we've been doing most resource allocation tasks
statically (e.g., a static amount of memory for the unikernel). I
think this is a simple way to do it that I'm not sure we should change
unless we have an application need to do so.

Do you have any unikernels in mind that use a malloc that tries to
give memory back to the host or that a static allocation is
insufficient for?

mato · 2018-02-15T12:08:40Z

The only of your points that seems to need an allocation routine seem like they are associated with dynamic behavior, mostly your second point: handing back memory to the host.

There's also my first point, which is enforcing a stack/heap split. Granted, this is mainly a convenience feature, but if we want to implement proper stack guards (to detect stack overflow correctly at runtime) then we can't just hand off "all the memory" to the libos. It might be sufficient to resolve this by returning the heap boundaries in a solo5_mem_info() call instead of "all the memory".

Up to this point, we've been doing most resource allocation tasks statically (e.g., a static amount of memory for the unikernel). I think this is a simple way to do it that I'm not sure we should change unless we have an application need to do so.

I'm more concerned about the real-world operational aspects in the long run. While static memory allocation works, we've been glossing over the fact that (excepting platforms which themselves use static allocation such as virtio or muen) under the hood the guest memory pages have to come from somewhere. Memory is still a scarce resource, especially if you look at scenarios where you want to run lots of unikernels at high density, potentially overcommitting your host resources.

With the "hands off" approach of "here's a chunk of memory, go use any/all/none of it", we are effectively committing ourselves to pretending that memory is cheap, and ignoring any memory pressure or hidden costs (page tables, kernel memory) on the host.

With sbrk() we have at least some control point where we keep the option of changing our memory allocation strategy under the hood in the future without breaking applications.

Do you have any unikernels in mind that use a malloc that tries to give memory back to the host or that a static allocation is insufficient for?

Most (all?) modern memory allocators will try to give back memory to the host (or "next level below") at some level of granularity. Even dlmalloc (and thus Mirage/Solo5) does so. We're just ignoring it and not communicating that back to the host. Granted, in the vt (ukvm) case we probably don't want to do that as it has other implications for the implementation, but again, I'm not sure, and why rob ourselves of the option to do so in the future?

ricarkol · 2018-02-15T15:01:40Z

Could we have "dynamic memory allocation" implemented in an optional module? What about the following. A unikernel could start with a static heap size, and if the dynamic memory allocation module is present, it could ask for some pages (not necessarily sequential, and not necessarily after the original static heap). And specifically for this PR, we could start by having a solo5_mem_info() call.

djwillia · 2018-02-15T15:55:41Z

There's also my first point, which is enforcing a stack/heap split.

I think stack/heap splits is a decision that the (upper layer) libOS
will want to make; for example, it may be allocating many stacks in
what Solo5 thinks of as the heap for a threading implementation. I
think that would push towards an mprotect-like interface for stack
guards, but even then I would favor a static approach where the libOS
specifies a memory protection map at boot time. It may know, for
example, that it wants to implement N stack guards in its threading
strategy.

especially if you look at scenarios where you want to run lots of
unikernels at high density, potentially overcommitting your host
resources.

I think that overcommitting host memory resources is often a bad idea
because it is so expensive if you get it wrong and it often adds a lot
of complexity into the system. I agree that we can't completely
ignore dynamic behavior forever, but I'm just a bit wary of adding in
that stuff unless we really have to - I believe there are use cases
that benefit from the lack of complexity that comes from doing things
statically.

With sbrk() we have at least some control point where we keep the
option of changing our memory allocation strategy under the hood in
the future without breaking applications.

I'm not sure I understand; if we leave out sbrk and add it in later,
no applications will break, but if we add in sbrk and decide it is
not what we want later, then things that may depend on it will break,
right?

Is there a way to do what you need in an optional way, e.g., as a
module, as @ricarkol suggests?

mato · 2018-02-15T18:33:29Z

I think stack/heap splits is a decision that the (upper layer) libOS will want to make; for example, it may be allocating many stacks in what Solo5 thinks of as the heap for a threading implementation. I
think that would push towards an mprotect-like interface for stack guards,

An mprotect-like interface might work for this, though it'd still be up to the libOS to ensure that it lays out whatever regions it's trying to protect with enough granularity to support any underlying architectural limits (e.g. huge pages). Also, if such an interface existed I'd be keen on it only ever being able to increase protection, eg. RW -> R -> (none) or RX -> X -> (none) or (if possible and allowed via a separate policy) (R)W -> (R)X -> (none).

I'm not sure I understand; if we leave out sbrk and add it in later, no applications will break, but if we add in sbrk and decide it is not what we want later, then things that may depend on it will break, right?

This is not specifically about sbrk (in fact that specific interface from the 1970s is pretty horrible). But taking "some interface to grow/expand a single heap" as an example, we can't have both that and the hands off approach of "here is all your memory" since they are mutually incompatible.

Is there a way to do what you need in an optional way, e.g., as a module, as @ricarkol suggests?

You misunderstand me -- I don't need anything. The current arrangement with malloc works just fine for Mirage/Solo5, so from that point of view this discussion is largely academic as it doesn't change "what can you do with today's code after this change that you couldn't do before".

However, given that we are changing things I'm trying to understand the implications for the wider "contract" (formalised or not) between Solo5 and the libOS, and whether or not any of them get us into a situation where "we can't do something with tomorrow's code after this change that we could do before".

Now, I think we've covered most of the points and you've managed to convince me that the "hand's off" approach is the best we have at the moment. Let me try and document the new contract:

The guest receives the (start, size) of a single, contiguous, read/write memory region it may use for heap or stack(s) as it sees fit.
At entry, the guest is provided with an initial stack growing down from start size.
The guest must not make any further assumptions about memory layout, including where executable code or static data are located in memory.

With that, I think we should dispense with the proposed solo5_mem_info() call entirely and instead change the signature of solo5_app_main() to:

int solo5_app_main(const char *cmdline, uintptr_t heap_start, size_t heap_size);

...since any non-trivial guest will need this information at start time. Note that unlike the proposed solo5_mem_info() the heap_size passed here is the size of the of the "data region" only.

For the Mirage/Solo5 and ocaml-freestanding split I can adapt to this scheme by adding a nolibc_init(heap_start, heap_size) function which will be called by the mirage-solo5 package before handing off control to the OCaml runtime via caml_startup().

What do you think? Does this make sense?

Clean up the API post malloc removal (see discussion in Solo5#223). - Introduce a struct solo5_boot_info, pass a const pointer to a statically allocated copy of it to solo5_app_main(). - Move cmdline to this struct, make it const. - Use arch-independent type uintptr_t for heap_start. - mem_lock_heap() now returns the values to be used for heap_start, heap_size when setting up solo5_boot_info in _start(). - Remove remains of stack_guard_size from mem.c, this is to be handled by the application. - Remove solo5_get_info() / solo5_mem_info() entirely as it's now redundant. Additionally, while cleaning up these code paths make it clear that the semantics of returning from solo5_app_main() are equivalent to calling solo5_exit().

mato · 2018-02-20T12:46:09Z

@djwillia I now have the Solo5 side of the above ready in #244. This is based on your work with the additional changes discussed here and rebased to current master.

I've also done one more thing, which is changing solo5_app_main() to take a const struct solo5_boot_info * instead of individual parameters; this will allow some ABI flexibility for future extension points without changing the signature, plus it means you don't have to specify tons of __attribute__((unused)) for parameters you don't care about.

mato · 2018-02-20T13:54:21Z

Corresponding ocaml-freestanding changes are here: mirage/ocaml-solo5@master...mato:add-malloc

And the minimal mirage-solo5 changes: mirage/mirage-solo5@master...mato:malloc-change

I'll open PRs for these in the right order (for CI purposes) once this change lands in Solo5.

mato · 2018-02-20T17:09:32Z

Done in mirage/ocaml-solo5#32, mirage/mirage-solo5#26. This is now completed, yay!

This was referenced Dec 7, 2017

remove malloc from solo5 #224

Closed

add dlmalloc to ocaml-freestanding (formerly in solo5) mirage/ocaml-solo5#29

Closed

mato mentioned this issue Feb 20, 2018

Remove malloc from Solo5 #244

Merged

mato closed this as completed Feb 20, 2018

mato mentioned this issue Feb 23, 2018

Solo5 public API refresh, take 2 #245

Merged

hannesm mentioned this issue Mar 16, 2019

SSP and the Solo5 ABI (was Multiple definition error of __stack_chk_fail ...) #335

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

move malloc from Solo5 to libOS layer #223

move malloc from Solo5 to libOS layer #223

djwillia commented Dec 7, 2017

ricarkol commented Dec 7, 2017

mato commented Dec 8, 2017

ricarkol commented Dec 8, 2017

ricarkol commented Dec 8, 2017 •

edited

Loading

mato commented Feb 14, 2018

djwillia commented Feb 14, 2018

mato commented Feb 15, 2018

ricarkol commented Feb 15, 2018

djwillia commented Feb 15, 2018

mato commented Feb 15, 2018

mato commented Feb 20, 2018

mato commented Feb 20, 2018

mato commented Feb 20, 2018

move malloc from Solo5 to libOS layer #223

move malloc from Solo5 to libOS layer #223

Comments

djwillia commented Dec 7, 2017

ricarkol commented Dec 7, 2017

mato commented Dec 8, 2017

ricarkol commented Dec 8, 2017

ricarkol commented Dec 8, 2017 • edited Loading

mato commented Feb 14, 2018

djwillia commented Feb 14, 2018

mato commented Feb 15, 2018

ricarkol commented Feb 15, 2018

djwillia commented Feb 15, 2018

mato commented Feb 15, 2018

mato commented Feb 20, 2018

mato commented Feb 20, 2018

mato commented Feb 20, 2018

ricarkol commented Dec 8, 2017 •

edited

Loading