Deadlock after SIGCHLD signal handling #3004
Comments
I've also noticed that I don't get the same error if I have only 1 worker or if I add a sleep statement in the |
Description
Environment
Python 3.9
Gunicorn 20.1.0
Current state
I have an application running with Gunicorn with 5 gthread workers.
I'm using a framework that in turn calls the Arbiter.run function. Before calling that function I start a separate thread which sleeps for a period of time and performs some logic before signalling the process (os.getpid()) with SIGHUP.
I implemented this in two different ways (both have the same issue, which I describe in a section below).
while True loop.
I'm also using the child_exit and worker_exit server hooks, each of which contains a log statement.
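The setup described above could look roughly like the following sketch. The child_exit and worker_exit names and their (server, worker) signatures are Gunicorn's documented server hooks; everything else is an assumption made for illustration: the helper name start_sighup_thread, its interval_seconds and stop_event parameters, and the log messages do not come from the original application.

```python
import os
import signal
import threading


def start_sighup_thread(interval_seconds, stop_event):
    """Start a daemon thread that periodically sends SIGHUP to this process.

    Hypothetical helper: the name, the parameters and the stop_event are
    assumptions, not taken from the application in this report.
    """
    def loop():
        while True:
            # Event.wait() returns True once stop_event is set, False on timeout.
            if stop_event.wait(interval_seconds):
                break
            # Application-specific logic would run here before signalling.
            os.kill(os.getpid(), signal.SIGHUP)

    thread = threading.Thread(target=loop, name="sighup-thread", daemon=True)
    thread.start()
    return thread


def child_exit(server, worker):
    # Server hook that Gunicorn runs in the master process after a worker exits.
    server.log.info("child_exit: worker pid=%s", worker.pid)


def worker_exit(server, worker):
    # Server hook that Gunicorn runs in the worker process when it exits.
    server.log.info("worker_exit: worker pid=%s", worker.pid)
```

The thread only makes sense once a SIGHUP handler is installed in the process, because the default action for SIGHUP terminates it.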
Behavior
We've had this issue in 2 different scenarios.
max-requests and max-requests-jitter configurations - After some successful autorestarts, at a random point in time, workers die after serving the max number of requests and no new workers are booted up.
What I've noticed is that there is some odd behavior in the reap_workers function whenever a SIGCHLD needs to be handled. In that function we loop over the dead child process ids, and later on cfg.child_exit is called, which means my log statement should be printed each time. What I find suspicious is that each time the reloads stop happening, the previous reload did not print my log statement in cfg.child_exit for every dead child process id (for some process ids that log is missing). However, I do see the log statement in cfg.worker_exit, which is called every time (for all process ids) without failure, which indicates that those processes did indeed terminate.
Steps to reproduce
Arbiter.run function.
You need to preload the app.
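For readers unfamiliar with the reaping step discussed under Behavior, here is a simplified illustration of the general pattern: a non-blocking os.waitpid loop that calls a child_exit-style callback for each dead child. It is a sketch, not Gunicorn's reap_workers code; the function name reap_children, the workers mapping and the on_child_exit callback are hypothetical.

```python
import os


def reap_children(workers, on_child_exit):
    """Reap every exited child process without blocking.

    `workers` maps pid -> worker object and `on_child_exit` stands in for a
    child_exit-style hook. Both names are hypothetical.
    """
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children left
        if pid == 0:
            break  # children exist, but none has exited yet
        worker = workers.pop(pid, None)
        if worker is None:
            continue  # not a tracked worker; keep reaping
        on_child_exit(worker)
        reaped.append(pid)
    return reaped
```

In this pattern every reaped pid that is still tracked produces exactly one callback, so a missing child_exit log for a terminated worker would mean either the pid was never reaped or the callback did not complete.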