-
Notifications
You must be signed in to change notification settings - Fork 464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add target_session_attrs support for PG 14 servers #1179
base: master
Are you sure you want to change the base?
Add target_session_attrs support for PG 14 servers #1179
Conversation
@@ -88,7 88,7 @@ static void init_var_lookup_from_config(const char *cf_track_extra_parameters, i | |||
|
|||
void init_var_lookup(const char *cf_track_extra_parameters) | |||
{ | |||
const char *names[] = { "DateStyle", "client_encoding", "TimeZone", "standard_conforming_strings", "application_name", NULL }; | |||
const char *names[] = { "DateStyle", "client_encoding", "TimeZone", "standard_conforming_strings", "application_name", "in_hot_standby", "default_transaction_read_only", NULL }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Brian and I were discussing this offline and wondered if we need to make 2 distinct lists. One for values that are readwrite and should be set on the connection startup, and another for readonly values that we want to be able to read from the connection but not attempt to set. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. I think it would probably make sense. This is also very related to #1205, because that also needs to track these read-only settings to be able to send the new versions to the client if they change on the server side.
Regarding #1205, currently that PR is sending all the ParameterStatus messages again to every client if a new server reported a different one. I think a more correct, aka more in line with postgres behaviour, would be to track these read-only fields per server, and if a client is linked to a server where the values were different than for the previous link, then we'd send ParameterStatus messages for just those ones. We'd have to be a bit careful with the perf implications of that though, because it adds overhead on every connection.
And I think default_transaction_read_only
is probably a bit special too in this regard, because it can actually be set by the client. So tracking it and then setting it according to the last client setting could actually make sense. But I'm curious how this interacts with servers configuring this setting and then changing it. If we'd set it based on the client its last value, then only new clients would get the new value. Which probably is fine, because the same would actually apply to normal connection to postgres.
@emelsimsek might have some ideas around this too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking about this more and reading the docs again I'm not sure that it should be a design goal here to enforce whether or not a parameter tracked by the varcache is read-only or read-write. If we set the varcache only after getting the pertinent ParameterStatus
message from the server we can be certain that it is the valid value for the back-end connection.
From the docs (emphasis mine):
ParameterStatus messages will be generated whenever the active value changes for any of the parameters the backend believes the frontend should know about. Most commonly this occurs in response to a SET SQL command executed by the frontend, and this case is effectively synchronous — but it is also possible for parameter status changes to occur because the administrator changed a configuration file and then sent the SIGHUP signal to the server. Also, if a SET command is rolled back, an appropriate ParameterStatus message will be generated to report the current effective value. 1
I think that it's none of our business if the client issues a SET
command that will fail and we can continue to transparently return whatever error the server gave the client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docs go on to talk about the hard-wired set of ParameterStatus
messages and warn:
Note that server_version, server_encoding and integer_datetimes are pseudo-parameters that cannot change after startup. This set might change in the future, or even become configurable.
None of those parameters are at issue here but the general vibe I get is that we should let the client deal with which parameters it can or cannot set. So, IMHO, let's leave it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that, then, separates this concern from #1205 though I think that propagating ParameterStatus
out to all connected clients may be a good change. I am not 100% sure, though, because if an administrator issues a reload
would it not already inform all connected clients?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strike everything I said above about read-only parameters and target_session_attrs 🙂
The feature is broken with the default load_balance_hosts=round-robin and the default target_session_attrs=any (if you have multiple hosts in the hostlist) because of the PR's varcache usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this does need to be addressed. I have a failing test that I will add to the PR shortly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test demonstrates the issue at hand. I need to think more about this but wanted to share something concrete:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most straightforward way to not fail to connect to read-onlys in round-robin is to simply not attempt to SET
in_hot_standby
. 1dcdb42
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a pass at making the concept of read-only vars more real. Please have a look @JelteF and @ya-mouse.
a2255a8
to
4dfc789
Compare
Needs a rebase now that #736 is merged |
816bed8
to
8f743bf
Compare
This feature would be great. Question though: does this PR provide the client with the ability to supply a target_session_attr to pgbouncer and get a matching connection back? Based on skimming the code I believe the answer is no but want to be sure. |
Hey @AndrewJackson2020 that's a very interesting question. The answer is no. In this feature, the pooler is the client in the sense of the emulation of this libpq feature. At a high level it is meant to allow the pool to be defined in a way that allows the pgbouncer operator to specify the properties that a back-end session must have to be acceptable to the pool. Just as clients that are directly connecting to PostgreSQL servers can use this setting in libpq in conjunction with multiple host-names to select the first acceptable host the pooler can do that job instead with this feature. I use this further in conjunction with Hope that helps! |
It does. Looking further into libpq frontend interfaces I don't think my idea above is even possible as target session attribute is something that is handled on the client side. I don't think the target_session_attribute value is even passed from the client to the server. |
Correct. It's a client option that affects the behavior of libpq but is not part of the session or passed to the server. |
To be able to use target_session_attribute at the client I think you'd need this PR: #1205 |
7961e96
to
bc8ef26
Compare
I see that fixes from @greg-rychlewski reveal a minor test setup issue for me on the Windows build that the previous |
bc8ef26
to
0ba7d29
Compare
@@ -88,7 88,7 @@ static void init_var_lookup_from_config(const char *cf_track_extra_parameters, i | |||
|
|||
void init_var_lookup(const char *cf_track_extra_parameters) | |||
{ | |||
const char *names[] = { "DateStyle", "client_encoding", "TimeZone", "standard_conforming_strings", "application_name", NULL }; | |||
const char *names[] = { "DateStyle", "client_encoding", "TimeZone", "standard_conforming_strings", "application_name", "in_hot_standby", "default_transaction_read_only", NULL }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. I think it would probably make sense. This is also very related to #1205, because that also needs to track these read-only settings to be able to send the new versions to the client if they change on the server side.
Regarding #1205, currently that PR is sending all the ParameterStatus messages again to every client if a new server reported a different one. I think a more correct, aka more in line with postgres behaviour, would be to track these read-only fields per server, and if a client is linked to a server where the values were different than for the previous link, then we'd send ParameterStatus messages for just those ones. We'd have to be a bit careful with the perf implications of that though, because it adds overhead on every connection.
And I think default_transaction_read_only
is probably a bit special too in this regard, because it can actually be set by the client. So tracking it and then setting it according to the last client setting could actually make sense. But I'm curious how this interacts with servers configuring this setting and then changing it. If we'd set it based on the client its last value, then only new clients would get the new value. Which probably is fine, because the same would actually apply to normal connection to postgres.
@emelsimsek might have some ideas around this too.
5b7a91e
to
3160de9
Compare
8ed894d
to
77f6e73
Compare
This is a very unsatisfying fix. I found that when the IPV6 localhost postgres instance needs its config changed in certain specs to bind to ::1 that does not appear to work at all when -h is passed in the command-line. This sort of makes sense. Maybe just on "restarts" we should not specify host but this works for now and at least _appears_ intentional to me. Open to tweaking this however, though I will admit that longer-term using the IP address to distinguish "this instance" from "that" won't work at all in an IPv6 world so perhaps everything should bind to 127.0.0.1 on differing ports instead of this?
While it would be nice to provide a solid example here in the test.ini this causes valgrind spec failures in a few unrelated places as they need something closer to the default 60s client_login_timeout (such as scram auth tests).
…hat is in transaction_read_only=on
…localhost range in pg_hba.conf
d181df6
to
04a053d
Compare
Adding this member to the struct may aid in correctly handling read-only variables in pgbouncer#1205. I don't love that we need to be sure to initialize this member everywhere a var is initialized. May need to make this function public to be useful to pgbouncer#1205. I don't consider `default_transaction_read_only` to be read-only because clients CAN set it but it can also be set by the server and so maybe that's a bit special. I think users of this feature would benefit from pgbouncer#1205 updating vars as the server connection vars change.
Thanks to the portlock the Postgres writer sockets are unique in `/tmp/.s.PGSQL.nnnn`, this "namespaces" the replica sockets to `/tmp/replica/.s.PGSQL.nnnn`
This is @dpirotte's work to add support for target_session_attrs, allowing pgbouncer to only connect to a database's host if it is in a given state, such as primary or standby.
The implementation relies on specific params from the server that are only passed starting with PG 14, and therefore it only supports PG 14 or later. On older versions, target_session_attrs has no effect.
This is complementary to the proposed load_balance_hosts and #736 (comment) in change in #736
Apologies for the PR noise. This has been a challenging rebase.