Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clients in PgBouncer's waiting queue are not cancelled when ^C is sent via psql #1222

Open
mugunthan7285 opened this issue Dec 15, 2024 · 5 comments

Comments

@mugunthan7285
Copy link

mugunthan7285 commented Dec 15, 2024

When a client connection is in the waiting queue due to reasons such as default_pool_size limits, max_user_connections, or similar constraints, sending ^C (SIGINT) from psql does not cancel the queued connection. This behavior causes the client to remain stuck in the waiting queue until the server pool has resources to process the connection or it throws query_wait_timeout error.

Expected Behaviour
When ^C is sent in psql, the corresponding client connection should be immediately cancelled and removed from the waiting queue in PgBouncer.

Steps to Reproduce
Configure PgBouncer with a small default_pool_size or max_user_connections that can be easily exceeded.Open multiple psql connections to PgBouncer until some connections are placed in the waiting queue.While one of the connections is in the waiting queue, press ^C in the psql session.Observe that the connection is not removed from PgBouncer's waiting queue.

@AndrewJackson2020
Copy link
Contributor

I agree with this and have had similar issues in the past. I think you can terminate your shell process and that will disconnect the client from pgbouncer, even so ideally control-c should work.

That being said I'm not sure if it is possible to implement this in pgbouncer. There is a part of the code (sbuf_pause) that I believe causes pgbouncer to not listen to clients after they have been queued so control c would not reach pgbouncer until they have been queued. Hopefully someone more knowledgeable will tell me that I'm wrong about this.

@mugunthan7285
Copy link
Author

Hi @AndrewJackson2020 , Thanks for the response.

Based on my debugging, I observed that when PgBouncer is in a paused state or when clients are in the waiting queue, cancel requests still reach the accept_cancel_request method.

In this method, it checks whether the main_client (for which the cancel request is sent) is linked to a server. If the main_client is not linked, the cancel request client(req) is simply disconnected with the following comment and behavior:

/*
* The client is not linked to a server, which means that no query is
* running that can be cancelled. This likely means the query finished by
 * itself before the cancel request arived to pgbouncer.
 */
if (!main_client->link) {
	disconnect_client(req, false, "cancel request for idle client");
	return;
}

Cancel requests for waiting clients also fall into this category, as they are not linked to a server yet. It might be beneficial to handle these cases more explicitly. For instance, PgBouncer could receive the data, ignore the data, and reset the iobuf of main_client.

This could potentially improve the clarity of the behaviour and avoid confusion when cancel requests are issued for queued clients.

Looking forward to hearing your thoughts on this!

@AndrewJackson2020
Copy link
Contributor

Interesting. It definitely seems feasible then. I would definitely encourage you to submit a PR, I would be happy to test it as this issue has bothered me as well.

Keep in mind much of the logic in the function that you mentioned has to do with the scenario of multiple pgbouncers running behind a load balancer/so_reuseport and making sure the cancel request goes to the right bouncer (or something like that). Not sure if that will impact what you are trying to achieve.

@JelteF
Copy link
Member

JelteF commented Jan 13, 2025

Yeah, this seems like a sensible feature to have. Some thoughts on what flow would probably be needed to make this work:

  1. Do the currently existing checks related to peering (i.e. if we don't know the key, forward it to the correct PgBouncer peer)
  2. If the client is linked, do what we do now
  3. If the client is not linked, check if we are trying to link it:
    a. If we are not trying to link it, ignore the cancel request (it probably arrived after the previous query completed).
    b. If we are trying to link it, continue to 5
  4. Send an ErrorResponse
  5. Consume messages from the client socket until we reach a Sync/PqMsg_Query/PqMsg_FunctionCall
  6. Send a ReadyForQuery message

So basically to support this correctly we need to do the same thing that Postgres would do when receiving and handling a cancel, but without actually having to send anything to Postgres. Relevant postgres protocol docs: https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-CANCELING-REQUESTS

@mugunthan7285
Copy link
Author

I’ll take a closer look at the workflow suggested by @JelteF and work on a PR for this. Thanks @AndrewJackson2020 and @JelteF , for sharing your insights!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants