Feature request: global t8_finalize for freeing all allocated t8code objects in one go. #1295

Open
jmark opened this issue Nov 6, 2024 · 8 comments
Labels
discussion New feature Adds a new feature to the code question

Comments

@jmark
Collaborator

jmark commented Nov 6, 2024

Feature request

Is your feature request related to a problem? Please describe.

It would be helpful if t8code provided a global t8_finalize routine which frees all objects allocated by t8code in a proper and clean manner. Note that this is in contrast to sc_finalize, which merely checks whether all allocated objects have been freed and prints a warning and/or aborts the program.

What is the problem there?

A strong use case is interoperability with the programming language Julia, specifically with Trixi.jl and, in particular, MPI.jl.

Julia's garbage collector finalizes objects non-deterministically. That means that MPI usually gets finalized before t8code related objects get finalized when Trixi.jl shuts down. This leads to nasty crashes/segfaults since t8code allocates MPI related objects, e.g. shared memory arrays.

Describe the solution or feature you'd like

MPI.jl provides MPI.add_finalize_hook!() for exactly the scenario described above. It would be very useful to have a t8_finalize routine which could be called by this hook.

In order to have such a feature, t8code needs proper allocation/deallocation tracking (like a managed memory pool). When designing the C++ code base, maybe the C++ runtime already provides such a feature.
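As a rough illustration of what such tracking could look like, here is a minimal sketch of a registry that remembers every live object together with its destroy routine. The class `object_registry` and its methods are hypothetical and not part of t8code; a real t8_finalize would have to call the actual destroy functions. Entries are removed on a regular destroy, so the registry does not extend any lifetime, and a finalizer that runs after the global clean-up becomes a no-op.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical registry; nothing here is part of the real t8code API.
class object_registry {
 public:
  using destroy_fn = std::function<void (void *)>;

  // Record an object together with the routine that frees it.
  void track (void *obj, destroy_fn destroy) {
    order_.push_back (obj);
    live_[obj] = std::move (destroy);
  }

  // Regular destroy path: free the object now and forget the entry.
  void release (void *obj) {
    auto it = live_.find (obj);
    if (it == live_.end ()) return;   // already freed, nothing to do
    destroy_fn destroy = std::move (it->second);
    live_.erase (it);
    destroy (obj);
  }

  // Global clean-up: free everything still alive, newest first.
  void finalize_all () {
    for (auto it = order_.rbegin (); it != order_.rend (); ++it) release (*it);
    order_.clear ();
  }

  std::size_t live_count () const { return live_.size (); }

 private:
  std::unordered_map<void *, destroy_fn> live_;
  std::vector<void *> order_;   // creation order, used for reverse teardown
};

// Usage: one object is destroyed explicitly, the other by the global hook.
int demo_registry () {
  object_registry registry;
  int freed = 0;
  auto destroy = [&freed] (void *p) { delete static_cast<int *> (p); ++freed; };
  int *a = new int (1);
  int *b = new int (2);
  registry.track (a, destroy);
  registry.track (b, destroy);
  registry.release (a);       // explicit destroy, as a finalizer would do
  registry.finalize_all ();   // shutdown hook frees what is left (b)
  registry.release (b);       // late finalizer: harmless no-op
  assert (registry.live_count () == 0);
  return freed;
}
```

The bookkeeping on every allocation is the cost of this approach, which is one reason why tracking only a subset of resources may be more attractive.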

Describe alternatives you've considered

As of now, t8code related objects in Trixi.jl are finalized explicitly before shutting down. This, however, is not how Julia is supposed to be used.

Estimated priority

"Priority: low" Should be solved eventually

I see an increasing demand for such a feature with growing user base and when coupling t8code with more and more languages and frameworks.

Additional context

Trixi.jl issues a warning here when there are still un-freed t8code objects at shutdown: https://github.com/trixi-framework/Trixi.jl/blob/91eaaf68e95cdba8062a1e607172c8505a0a2503/src/auxiliary/t8code.jl#L35

Here is an example of how Trixi.jl finalizes t8code objects explicitly: https://github.com/trixi-framework/Trixi.jl/blob/91eaaf68e95cdba8062a1e607172c8505a0a2503/examples/t8code_3d_dgsem/elixir_euler_ec.jl#L93

@jmark jmark added question New feature Adds a new feature to the code discussion labels Nov 6, 2024
@Davknapp
Collaborator

Davknapp commented Nov 8, 2024

Making use of smart pointers in the future would very likely solve your problem. But it would require using smart pointers for every resource that t8code manages, which might not always be the best choice.
Maybe it is possible to create such a function only for the interface? I think most of the t8code objects such as forest, cmesh, etc. provide a t8_*object_name*_destroy function. Is there an option in Julia to somehow keep track of allocated resources and then call the destroy functions for those?

@jmark
Collaborator Author

jmark commented Nov 8, 2024

I think most of the t8code-objects such as forest, cmesh, etc. provide a t8_*object_name*_destroy function. Is there an option in julia to somehow keep track of allocated resources and then call the destroy-functions for those?

In general, this is possible, and I have already thought of such a solution. However, such an approach would interfere with the lifetime of these objects, basically keeping them alive until the application shuts down. This is not desirable since Julia's garbage collector should be able to destroy t8code mesh/forest objects at its own discretion.

@jmark
Collaborator Author

jmark commented Nov 8, 2024

I think most of the t8code-objects such as forest, cmesh, etc. provide a t8_object_name_destroy function.

Exactly these destroy functions are called in the finalizers on the Julia side. However, as already pointed out, the decision of when these finalizers are called is up to the garbage collector.

@jmark
Collaborator Author

jmark commented Nov 11, 2024

We just had a lengthy discussion about this in the t8code developers' meeting.

Globally tracking memory allocations throughout the whole t8code code base would be a laborious, error-prone task with potentially accompanying performance degradation.

Fortunately, our suspicion is that just keeping track of allocated MPI shared memory covers a lot of use cases already. Providing a global clean-up routine for this is a feasible task.
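To make the narrower idea concrete, here is a minimal sketch in which only one resource class registers itself, while everything else stays untracked. The class `shmem_array` is a hypothetical stand-in for an array backed by MPI shared memory, and `std::malloc`/`std::free` stand in for the MPI allocation calls, which are an assumption not shown here. The global clean-up releases the backing memory but leaves the handle objects valid, so a destroy call arriving later from a garbage collector does nothing harmful.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <set>

// Hypothetical stand-in for an array backed by MPI shared memory.
class shmem_array {
 public:
  explicit shmem_array (std::size_t bytes) : data_ (std::malloc (bytes)) {
    live ().insert (this);   // self-registration on creation
  }
  ~shmem_array () {
    release ();
    live ().erase (this);
  }
  shmem_array (const shmem_array &) = delete;
  shmem_array &operator= (const shmem_array &) = delete;

  // Free the backing memory; safe to call more than once.
  void release () {
    std::free (data_);   // real code would free the MPI resource here
    data_ = nullptr;
  }
  bool released () const { return data_ == nullptr; }

  // Global clean-up: release the memory of every array still alive.
  // Returns how many arrays actually had to be released.
  static std::size_t cleanup_all () {
    std::size_t count = 0;
    for (shmem_array *array : live ()) {
      if (!array->released ()) {
        array->release ();
        ++count;
      }
    }
    return count;
  }

 private:
  static std::set<shmem_array *> &live () {
    static std::set<shmem_array *> instances;
    return instances;
  }
  void *data_;
};

// Usage: `a` is released early by its owner, `b` by the global clean-up.
std::size_t demo_shmem_cleanup () {
  shmem_array a (64);
  shmem_array b (128);
  a.release ();
  std::size_t freed = shmem_array::cleanup_all ();
  assert (a.released () && b.released ());
  return freed;   // the destructors of a and b run afterwards as no-ops
}
```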

@cburstedde
Collaborator

cburstedde commented Nov 11, 2024 via email

@dutkalex
Contributor

dutkalex commented Nov 12, 2024

Hi guys! This sounds like a tricky-to-get-right feature @jmark, and I really don't know if it is possible to ship something that will make everyone happy...
In my view, the root problem here is that C, (modern) C++, and Julia rely on 3 different approaches to resource management:

  • C mandates destruction/deallocation functions which must be called by the user. This is a static and explicit approach.
  • C++ is technically built on the same principles, but idiomatic C++ introduces RAII wrapper classes to abstract these concerns, so that destruction is done automatically based on scope. This is a static but implicit approach, which can easily coexist with the C approach, except maybe in a few pathological cases. However, supporting both APIs implies strict restrictions on the lifetime semantics that can actually be implemented.
  • Julia (like Python and other interpreted/JIT languages) is a garbage-collected language. It does not provide any guarantee on when it deallocates stuff, other than "when it is not needed anymore". This approach is purely dynamic, and is in stark contrast with the C and C++ approaches.
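The difference between the first two approaches can be sketched as follows. The `mesh_create`/`mesh_destroy` pair is a hypothetical C-style API invented for this illustration, not a t8code function; the RAII variant simply wraps the same pair in a `std::unique_ptr` with a custom deleter.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style API: the user must pair create with destroy.
struct mesh {
  int num_elements;
};
int live_meshes = 0;   // counts objects that have not been destroyed yet

mesh *mesh_create (int n) {
  ++live_meshes;
  return new mesh { n };
}
void mesh_destroy (mesh *m) {
  delete m;
  --live_meshes;
}

// RAII wrapper: the deleter calls the C-style destroy function.
struct mesh_deleter {
  void operator() (mesh *m) const { mesh_destroy (m); }
};
using mesh_ptr = std::unique_ptr<mesh, mesh_deleter>;

// Static and explicit: forgetting mesh_destroy would leak the mesh.
int demo_c_style () {
  mesh *m = mesh_create (8);
  int n = m->num_elements;
  mesh_destroy (m);
  return n;
}

// Static but implicit: mesh_destroy runs automatically at scope exit.
int demo_raii () {
  mesh_ptr m (mesh_create (8));
  return m->num_elements;
}
```

Both variants use the same underlying functions, which is why they can coexist as long as ownership of a given object is never shared between them.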

I don't claim to know the right way to make this work, but here are some of the lessons I have learned (sometimes the hard way) when designing Python APIs for the C++ code I work with (which uses t8code and indirectly has to answer the same kind of questions):

  • Embracing RAII, and more generally speaking value semantics, goes a long way towards taming the garbage collector. The ideal case is to have very clear whole-part relationships between all the components available through the API, so that all the destruction/deallocation logic can be handled deterministically in the C/C++ layer and without user intervention, because the guarantees are much stronger there. The goal is that, no matter the order in which the garbage collector tries to delete the different objects, each object is effectively destroyed in the right order, without loose ends or circular ownership patterns left in the end. In other words, when there is a coupling between multiple components (deallocation-wise), the coupling must be addressed in C/C++, and one of the components should be responsible for deallocating both itself and the other (in the right order), while the second should behave like a non-owning handle. I would expect that making this happen would require a lot of refactoring and redesigning work in t8code, and would lead to lots of breaking changes, because the current semantics are quite far from that.
  • A functional design based on immutable objects is also very helpful for such purposes. In some cases this is neither easy to implement nor intuitive to use, and there are tradeoffs to be made. I believe however that this is already well-aligned with the current design of t8code in the sense that, if you want to adapt a forest for example, you effectively construct a new forest (at least logically speaking, I reckon that there might be some COW implemented under the hood to speed things up).
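The whole-part idea from the first bullet can be sketched like this. All type names (`coarse_mesh`, `forest`, `mesh_bundle`) are hypothetical and only loosely modeled on the objects discussed in this thread. A single owner holds both parts, the dependent part only borrows the other one, and C++ destroys members in reverse declaration order, so the teardown order is fixed no matter when the owner itself is dropped.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which destructors run, for demonstration only.
std::vector<std::string> destruction_log;

struct coarse_mesh {
  ~coarse_mesh () { destruction_log.push_back ("coarse_mesh"); }
};

// The forest only borrows the coarse mesh and never frees it.
struct forest {
  explicit forest (const coarse_mesh &cm) : cmesh (&cm) {}
  ~forest () { destruction_log.push_back ("forest"); }
  const coarse_mesh *cmesh;   // non-owning handle
};

// Single owner that would be exposed to the garbage-collected side.
struct mesh_bundle {
  mesh_bundle () : tree (base) {}
  mesh_bundle (const mesh_bundle &) = delete;   // a copy would dangle
  mesh_bundle &operator= (const mesh_bundle &) = delete;

  coarse_mesh base;   // declared first, destroyed last
  forest tree;        // declared last, destroyed first
};

// Usage: dropping the owner tears down both parts in the right order.
std::vector<std::string> demo_whole_part () {
  destruction_log.clear ();
  {
    mesh_bundle bundle;
    assert (bundle.tree.cmesh == &bundle.base);
  }
  return destruction_log;
}
```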

The point I'm trying to articulate here is that this issue should be considered in the broader discussion of t8code's migration towards more modern C++, because whether the migration is meant to be a small surface refactoring or will include semantic revisions and redesigns will greatly influence how to proceed here. For example, if the C API is to be supported in the long run, as a lower-level alternative to the main C++ API, this effectively restrains a lot what can be done to address this issue within t8code. I would expect that a pure C++ wrapper of t8code would be needed on the Trixi side to address the impedance mismatch in that case.

@cburstedde
Collaborator

cburstedde commented Nov 18, 2024 via email

@dutkalex
Contributor

dutkalex commented Nov 18, 2024

@cburstedde I would say that yes, since what you suggest effectively boils down - if I understand your comment correctly - to supporting two disjoint APIs:

  • the original C-style API where the user is responsible for calling the cleanup code
  • a higher-level API which automates the cleanup

If you can guarantee that the two are never mixed together, which I believe is achievable with a good design, then I don't see a reason why this approach wouldn't work. This strategy can effectively be described as having the wrapper be part of the t8code repo. In C++, building such an abstraction layer on top of the existing t8code API is not very difficult since (except for a few functions) it is quite straightforward to wrap the current API into a handful of RAII types. Indeed, since all the objects are already reference counted, enforcing correct deallocation order is really just a matter of making sure that each object holds a reference to its direct parent: elements keep their forest alive, forests keep their cmesh alive, and cmeshes keep the global module handle alive, basically. If you wish to keep a pure C codebase, however, I really don't know how feasible this is...
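A minimal sketch of this parent-reference chain, using `std::shared_ptr` as a stand-in for t8code's own reference counting. The types `module_handle`, `cmesh_handle`, and `forest_handle` are hypothetical and not actual t8code wrappers. Each child holds a reference to its direct parent, so even if the handles are dropped in the "wrong" order, the parents are only torn down after the last child is gone.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Records the order in which destructors run, for demonstration only.
std::vector<std::string> teardown_log;

struct module_handle {
  ~module_handle () { teardown_log.push_back ("module"); }
};

struct cmesh_handle {
  explicit cmesh_handle (std::shared_ptr<module_handle> m) : parent (std::move (m)) {}
  ~cmesh_handle () { teardown_log.push_back ("cmesh"); }
  std::shared_ptr<module_handle> parent;   // keeps the module alive
};

struct forest_handle {
  explicit forest_handle (std::shared_ptr<cmesh_handle> c) : parent (std::move (c)) {}
  ~forest_handle () { teardown_log.push_back ("forest"); }
  std::shared_ptr<cmesh_handle> parent;    // keeps the cmesh alive
};

// Usage: release the handles parent-first, as a garbage collector might.
std::vector<std::string> demo_parent_chain () {
  teardown_log.clear ();
  auto module = std::make_shared<module_handle> ();
  auto cmesh = std::make_shared<cmesh_handle> (module);
  auto forest = std::make_shared<forest_handle> (cmesh);

  module.reset ();
  cmesh.reset ();
  assert (teardown_log.empty ());   // the forest still keeps both alive

  forest.reset ();                  // last reference gone: chain unwinds
  return teardown_log;
}
```

The destructor body of each child runs before its parent reference is released, which is what yields the child-before-parent teardown order.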

4 participants