Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: implement wayland socket handover #2018

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Ferdi265
Copy link

@Ferdi265 Ferdi265 commented Jul 21, 2024

This PR implements the compositor side of the Wayland socket handover protocol as described in the KDE Wiki, but with passed via the environment variables WAYLAND_SOCKET_NAME and WAYLAND_SOCKET_FD rather than CLI options.

labwc now checks for the environment variables WAYLAND_SOCKET_NAME and WAYLAND_SOCKET_FD, which pass the name of the Wayland display socket and the Wayland display socket fd number respectively. If set, labwc will use these instead of finding a free Wayland display socket name and creating it. If either variable is unset or adding the FD fails, labwc falls back to the normal behaviour.

This allows for labwc to be used with a helper tool that automatically restarts the compositor on crashes, and allowing clients to reconnect after the compositor restart and survive the crash. The reconnecting clients look like new clients to labwc and perform all the same steps as a fresh Wayland client would. No additional handling is required on the side of labwc.

The helper tool finds and creates the Wayland socket and launches labwc with the new CLI options, restarting it when it exits with a non-zero exit code. Three such helper tools currently exist as far as I know:

This is part of a series of PRs. I've already added the same feature to Sway (unmerged) and Hyprland (merged) in order to get the crash recovery experience of Wayland into a more usable shape regardless of compositor.

This PR description was rewritten to better reflect the PR contents (click to see old PR description) This PR implements the compositor side of the Wayland socket handover protocol as described in the [KDE Wiki][KDE Wiki]. The CLI options are chosen so that they are compatible with Kwin (and Hyprland):

labwc now accepts the CLI options --socket NAME and --wayland-fd FD, which pass the name of the Wayland display socket and the Wayland display socket fd number respectively. If given, labwc will use these instead of finding a free Wayland display socket name and creating it.

This allows for labwc to be used with a helper tool that automatically restarts the compositor on crashes, and allowing clients to reconnect after the compositor restart and survive the crash. The reconnecting clients look like new clients to labwc and perform all the same steps as a fresh Wayland client would. No additional handling is required on the side of labwc.

The helper tool finds and creates the Wayland socket and launches labwc with the new CLI options, restarting it when it exits with a non-zero exit code. Two such helper tools currently exist as far as I know:

I wasn't quite sure where to add the options in the help output and man pages, so I added them at the end. I didn't want to do them alphabetically, since both options only make sense together, but would be split apart since they have different names. Another option would be to rename --socket to --wayland-socket in the help, but accept both --socket and --wayland-socket in practice.

This is part of a series of PRs. I've already added the same feature to Sway (unmerged) and Hyprland (merged) in order to get the crash recovery experience of Wayland into a more usable shape regardless of compositor.

@Consolatis
Copy link
Member

Consolatis commented Jul 21, 2024

Thanks for the PR.

I am not a big fan of those command line switches. Rather than enforcing all compositors to share some common naming for command line arguments, could we use some env vars which hold the fd number and socket name?

Edit:
That might also make it systemd socket activation compatible, although I didn't really look into how that works in detail.

@Consolatis Consolatis marked this pull request as draft July 21, 2024 20:22
@Ferdi265
Copy link
Author

Ferdi265 commented Jul 21, 2024

I am not a big fan of those command line switches. Rather than enforcing all compositors to share some common naming for command line arguments, could we use some env vars which hold the fd number and socket name?

Edit:
That might also make it systemd socket activation compatible, although I didn't really look into how that works in detail.

Thanks for the quick reply!

You're right, the command-line switches are a bit weird. There is precedent for passing FDs by command line switches (e.g. Xwayland), but a less ad-hoc mechanism would probably be better.

I like the env var idea. Do you have a suggestion for environment variable names? WAYLAND_DISPLAY_SOCKET_NAME and WAYLAND_DISPLAY_SOCKET_FD maybe?

I'm gonna look into this and see how systemd socket activation works and if that is possible to get going with Wayland.

@Ferdi265
Copy link
Author

For now I think I'll maybe add an experimental env var mechanism towl-restart that can be enabled by a CLI option (wl-restart --env) as an alternative to the cli args (--socket and --wayland-fd), but I'll see how this goes and what other compositors think about the mechanism.

@Ferdi265
Copy link
Author

I have read into systemd socket activation (docs here, C file, C header):

Systemd passes three env vars, LISTEN_PID (the pid that should receive the fds), LISTEN_FDS (the number of passed fds), and LISTEN_FDNAMES (colon-separated list of names). The file descriptors are always passed as fds 3 and following.

This looks like it might work for Wayland, except for the lock file that sits next to the socket, which I will have to look into or experiment with.

@Ferdi265
Copy link
Author

I have created a discussion issue for this in wl-restart

@Ferdi265 Ferdi265 mentioned this pull request Aug 4, 2024
@Ferdi265 Ferdi265 force-pushed the feature-socket-handover branch 2 times, most recently from 52046db to 296a149 Compare August 12, 2024 22:02
@Ferdi265
Copy link
Author

Ferdi265 commented Aug 12, 2024

I have replaced the implementation in this branch with an environment-based socket handover mechanism (implemented in wl-restart here). The details of that protocol are still not quite clear and I would love to hear feedback on the design of the mechanism or its implementation.

I have decided against the systemd socket activation mechanism as it does not seem to allow for automatically creating the lock file that is normally used with Wayland sockets, which would make starting a compositor under native systemd socket activation behave differently than starting it with wl-restart.

If this or a similar mechanism is acceptable to labwc I will try to bring forward this same mechanism to other compositors such as Sway, Hyprland, and others.

EDIT: I know the code style check failed and the documentation is incomplete, though a review on the general concept of the environment variable mechanism would still be much appreciated.

@Consolatis
Copy link
Member

Consolatis commented Aug 12, 2024

though a review on the general concept of the environment variable mechanism would still be much appreciated.

The approach looks fine to me in general.

There are some implementation specific issues like using fcntl() rather than set_cloexec() (from common/spawn.h, needs to be made public) strdup() isn't really necessary the mentioned codestyle nits. Feel free to ignore them all until a consensus is reached on the mechanism though.

@Ferdi265
Copy link
Author

though a review on the general concept of the environment variable mechanism would still be much appreciated.

The approach looks fine to me in general.

There are some implementation specific issues like using fcntl() rather than set_cloexec() (from common/spawn.h, needs to be made public) strdup() isn't really necessary the mentioned codestyle nits. Feel free to ignore them all until a consensus is reached on the mechanism though.

Thanks for the review. I will try to see what the other compositors think of the mechanism then and come back to clean this up once the overall direction is clear.

@heroin-moose
Copy link
Contributor

UNIX old fart here: perhaps it would be better to use the environment variables for this.

@Ferdi265
Copy link
Author

UNIX old fart here: perhaps it would be better to use the environment variables for this.

The new implementation of this uses environment variables. Details are here, though I want to port this to a few other compositors first and wait on their feedback to avoid having diverging implementations.

Copy link
Member

@ahesford ahesford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see nothing wrong with this in principle, although it is weird to have fatal conditions arise from unexpected conditions of environment variables. It would be better to warn if the expected conditions are not met, and then proceed to launch with the auto socket as if nothing had been defined at all.

I propose moving the environment handling out of main() and into src/server.c where it is actually used, allowing the interface of server_start() to remain unchanged. A static validator function that accepts a char** and int * can check the environment, emit warnings when both variables are not suitably defined (which, if you expand the XOR, can be more targeted and informative than the blanket "not both are correct" phrase used now), and set the passed pointers appropriately.

src/main.c Outdated
}

/* Fail if only one of wayland socket and fd are given */
if (!socket_name ^ (socket_fd == -1)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I positively abhor the abuse of bitwise operators to effect a logical Boolean operation. I'd rather see the XOR expanded using proper logical operators, but I'd even accept

(socket_name == NULL) != (socket_fd == -1)

because equality is a meaningful Boolean operation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this to handle both cases being missing separately in the latest commit.

@Ferdi265
Copy link
Author

Thanks for the review @ahesford, your comments absolutely make sense to me. I'll fix them as suggested.

@Ferdi265
Copy link
Author

Updated to address the review and rebased onto latest master.

This PR now only touches server.c and all code related to socket handover now happens in a static perform_socket_handover() function, including adding the FD to the Wayland display. This makes it trivial for server_start() to fall back to the auto socket in case handover fails for whatever reason.

@Ferdi265
Copy link
Author

Ferdi265 commented Sep 27, 2024

Updated to fix failing checks and updated PR description.

@Ferdi265 Ferdi265 marked this pull request as ready for review September 27, 2024 19:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants