Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flexible Untyped Memory Regions #1363

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from
Draft

Conversation

Indanz
Copy link
Contributor

@Indanz Indanz commented Dec 8, 2024

Pre-RFC proof-of-concept for Aarch64.

Support for non-power-of-two and unaligned untyped memory regions.

It's fine that seL4 delegates memory management to user space, it's not fine that it makes doing that very hard because of arcane limitations. With arbitrary untyped memory regions there is no need to subdivide non-power-of-two memory regions into multiple, smaller UTs any more. Allocating physical memory with a specific address will also be much easier now, as it needs to allocate at most one UT to get to the right address (would be even easier if GetAddress would work on UT).

Instead of keeping track of free memory with an explicit index stored in the cap, use the address of the last child object its size to find the beginning of unallocated memory. This works because newly created objects are inserted just after the UT node in the derivation list.

Now that we don't need storage to keep track of the free index, we can store the full length in bytes instead of bits.

Also lifts the limitation that caps to UTs with children can't be copied: After copying the original cap will look "full" and only the derived cap can keep allocating further.

This doesn't actually change the user space API yet. Untyped_Retype has been prepared to use the size argument as a byte value, which can easily be made backward compatible with a simple wrapper. But for a real API change the bootinfo also needs to change to pass UTs regions with byte sizes instead of bit sizes, and that would break sel4test, which I didn't want to do yet.

Because UTs now act like a stack and can reallocate over previously deleted objects, it's not possible to only zero memory at UT reset time, so to limit WCET zeroing an alternative solution needed to be found, which I implemented for frames and CNodes (SCs can be ridiculously big too in theory). The current code probably needs to be made more generic so that there is less code duplication when adding more architectures.

Clearing memory just before use should be more cache friendly than zeroing it at UT reset time, and also fairer in CPU time accounting. A big problem of the current code is that the cache flushes do happen in one go without preemption. In practise that means the WCET is equal to the time it takes to more or less flush the whole last level cache. Another problem with zeroing memory at UT reset time is that if UT are allocated, the same memory gets zeroed multiple times unnecessarily.

The UT changes passed sel4test before I added the zeroing changes, I haven't tested those yet.

As this is a major API break, an RFC is needed, but before proceeding with that I wanted to share my implementation and gather verification impact feedback. Considering how easy the code changes were, I suspect verification won't be that hard either, but it will cost time. Except if there is some deep verification reliance on UTs being power-of-two in size anywhere in the proofs, then this won't work.

@Indanz Indanz added enhancement verification Needs formal verification input/change, or is motivated by verification labels Dec 8, 2024
@Indanz
Copy link
Contributor Author

Indanz commented Dec 8, 2024

Relevant discussion about memory zeroing at #1289.

Support for non-power-of-two and unaligned untyped memory regions.

Instead of keeping track of free memory with an explicit index stored in
the cap, use the address of the last child object   its size to find the
beginning of unallocated memory. This works because newly created objects
are inserted just after the UT node in the derivation list.

Now that we don't need storage to keep track of the free index, we can
store the full length in bytes instead of bits.

Also lifts the limitation that caps to UTs with children can't be copied:
After copying the original cap will look "full" and only the derived cap
can keep allocating further.

Proof-of-concept which doesn't change the user space API yet.
For that bootinfo also needs to change.

Signed-off-by: Indan Zupancic <[email protected]>
Zero memory before use instead of at UT reset time. This is fairer
because the actual user pays the CPU cost and not whoever happens
to be the first to allocate from an UT.

All memory must have been zeroed before the CNode can be used.
However, as zeroing state is kept in the cap, we also need to
guarantee that all memory is zeroed before the cap gets copied.
This is done by checking for pending zeroing in deriveCap().

Signed-off-by: Indan Zupancic <[email protected]>
Proof-of-concept Aarch64 implementation.

Zero memory before use instead of at UT reset time. This is fairer
because the actual user pays the CPU cost and not whoever happens
to be the first to allocate from an UT.

All memory must have been zeroed before the page can be mapped.
However, as zeroing state is kept in the cap, we also need to
guarantee that all memory is zeroed before the cap gets copied.
This is done by checking for pending zeroing in Arch_deriveCap().

The old code zeroed memory in CONFIG_RESET_CHUNK_BITS many times
before checking for preemption.

CONFIG_MAX_NUM_WORK_UNITS_PER_PREEMPTION << CONFIG_RESET_CHUNK_BITS
amounts to 25 kB. Now we zero 16 kB before checking for preemption,
which should both give much better performance and lower WCET.

Signed-off-by: Indan Zupancic <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement verification Needs formal verification input/change, or is motivated by verification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant