Page MenuHomePhabricator

Allow integrating requestctl rules into haproxy
Closed, ResolvedPublic

Description

We want to be able to inject rules from requestctl at the tls termination layer. These rules would be both throttling rules and bandwidth limiting ones, at the very least. This means we might need to extend the action schema, or create alternative action schemas.

In order to do this, we need requestctl to have a haproxy translation layer, which in theory should be easy to do given requestctl is modular. However, the capabilities we have in haproxy aren't the ones we have in varnish, and thus it might be useful to introduce the concept of capabilities of the translation layers, or alternatively to create separate action schemas for each software we want to support - in fact haproxy is just the first of the list.

Enumerating things we'll need to do to complete this work:

  • Identify where we want to inject rules from requestctl in haproxy
  • Modify requestctl so it can support output for haproxy
  • Do the plumbing to inject data via confd/etcd in haproxy, including syntax checks

Event Timeline

Vgutierrez subscribed.

please note that we won't have visibility via X-Requestctl of traffic impacted by these rules in the TLS termination layer till T351117 is completed

Joe triaged this task as High priority.Jul 11 2024, 9:08 AM

Changing the priority to high as we had yet other instances where being able to limit bandwidth on uploads would've helped us.

Haproxy has a logic that's very different from varnish, but it should be possible to translate most of our current patterns or ipblocks to something haproxy can read.

Specifically, the translation will need to have an acl definition, and then an action on the http request. So say we have the following rule in a requestctl action:

pattern@ua/googlebot AND (ipblock@cloud/aws OR ipblock@cloud/google)

this will roughly translate to

acl  pattern_ua_googlebot  req.hdr(user-agent) path_reg Googlebot.*
acl ipblock_cloud_aws src ...
acl ipblock_cloud_google

http-request silent-drop if pattern_ua_googlebot && ( ipblock_cloud_aws ||  ipblock_cloud_google)

So we would have to do things a bit differently than we've done it for varnish - ideally we will have the translator layer collect all ipblocks and acls used and add those to the final output once we're done evaluating all rules. This has some nuance to it - how will we be able to separate the haproxy config snippet by site?

There is also the factor that actions on haproxy are different than on varnish, so we might decide to add a new type of action subclass that allows that kind of actions (silent drop, bandwith limiting, etc).

There is also the argument that in reality it's just an extension of potential actions to take and we should just modify the schema of the Action objects.

As @Fabfur made me notice, conditions can also be expressed inline:

http-request silent-drop if (req.hdr(user-agent) path_reg Googlebot.* ) && ...

and ipblocks will be transformed to haproxy acls like we do in varnish, something like

acl ipblock_cloud_google src -f /ipblocks/cloud_google

(which has the disadvantage we'll need to manually define the confd files for every ipblock we want to use, which isn't ideal, but we might be able to work around it).

and ipblocks will be transformed to haproxy acls like we do in varnish, something like

acl ipblock_cloud_google src -f /ipblocks/cloud_google

(which has the disadvantage we'll need to manually define the confd files for every ipblock we want to use, which isn't ideal, but we might be able to work around it).

We might want to use the "map format" supported by haproxy (basically key/value pairs), with the CIDR as the key and the cloud provider name ('google' etc) as the value.

The map will be large, but that should be OK, and this prevents needing to regularly define new confd templates.

https://docs.haproxy.org/2.8/configuration.html#7.3.1-map
https://www.haproxy.com/blog/introduction-to-haproxy-maps

There's an interesting problem to manage with haproxy, which is making me think we should support a much simplified syntax.

Let's say the user wants to disable all traffic that:

  • Comes from the cloud nightmarehosts
  • Has either user agent starting with python-requests or not contianing Wikidata

The obvious way to express this in requestctl's expression language would be something like:

ipblock@cloud/nightmarehosts AND (pattern@ua/requests OR NOT pattern@ua/wikidata)

Sadly, there seem to be no way to support precedence between logical operators in haproxy, and per the documentation, AND always takes precedence.

So the obvious way of defining this, which would be something like:

http-request deny if { src -f /ipblock/cloud/nightmarehosts } { { hdr_reg(User-Agent) -i "^python-requests" } || ! { hdr_reg(User-Agent) -i "Wikidata" } }

doesn't work:

[ALERT] (1210) : config : parsing [haproxy.cfg:38] : error detected while parsing an 'http-request deny' condition : missing fetch method in ACL expression '{'.

because, contrary to what the documentation suggests, it's not possible to aggregate logical expressions in conditions.

The best way to do this I found is the pretty horrible hack below:

http-request set-var(txn.condition) bool(1) if { hdr_reg(User-Agent) -i "^python-requests" } ||  ! {  hdr_reg(User-Agent) -i "Wikidata" }
http-request deny if  { src -f /etc/haproxy/ipblock/abuse/unicorn } { var(txn.condition) -m bool  }

which works. In simpler cases where we don't have a negation in the OR condition, we can just use an ACL to express the OR.
So take the requestctl expression:
ipblock@cloud/nightmarehosts AND (pattern@ua/requests OR pattern@ua/wikidata)

this can be expressed both as

acl rule_n hdr_reg(User-Agent) -i "^python-requests"
acl rule_n  hdr_reg(User-Agent) -i "Wikidata"
http-request deny if  { src -f /etc/haproxy/ipblock/abuse/unicorn } rule_n

and as

http-request set-var(txn.condition) bool(1) if { hdr_reg(User-Agent) -i "^python-requests" } ||   {  hdr_reg(User-Agent) -i "Wikidata" }
http-request deny if  { src -f /etc/haproxy/ipblock/abuse/unicorn } { var(txn.condition) -m bool  }

which is easier to translate to generally once we've moved in this direction, but is less "haproxy-like".

We will also need to look ahead while parsing the expression, meaning we can no longer just get away with translating verbs in expressions in the DSL language. The haproxy translation layer promises to be exponentially more complex than the other ones we've managed up to now, unless we decide to make it extremely simplified - only support AND and OR conditions without parentheses, for instance.

because, contrary to what the documentation suggests, it's not possible to aggregate logical expressions in conditions.

The best way to do this I found is the pretty horrible hack below:

http-request set-var(txn.condition) bool(1) if { hdr_reg(User-Agent) -i "^python-requests" } ||  ! {  hdr_reg(User-Agent) -i "Wikidata" }
http-request deny if  { src -f /etc/haproxy/ipblock/abuse/unicorn } { var(txn.condition) -m bool  }

Yeah, unfortunately this is the case. You can combine ACLs in interesting ways in action expressions, but not in ACL expressions themselves.

BTW, you're probably better off using req. as the prefix instead of txn, it will live for only the scope of processing the incoming request, instead of the entire request/response pair for the transaction.

As @Fabfur points out, in haproxy 3.0 (but not haproxy 2.8.x) we have the option of evaluating many ACLs together with negation, as part of a fetching samples.

https://docs.haproxy.org/3.0/configuration.html#7-acl

There are caveats however:

acl([!]<name>[,...]) : boolean
Returns true if the evaluation of all the named ACL(s) is true, otherwise
returns false. Up to 12 ACLs may be provided, each delimited by comma. Each
named ACL may be prefixed with a "!" to invert the result. If any evaluation
produces an error then the sample also returns an error.
Note that HAProxy does not perform any validation checks on the referenced
ACLs, such as whether an ACL which uses a http request sample is used in
response context. This behavior may be changed in the future.

Joe changed the task status from Open to In Progress.Jul 22 2024, 1:29 PM
Joe claimed this task.

As @Fabfur points out, in haproxy 3.0 (but not haproxy 2.8.x) we have the option of evaluating many ACLs together with negation, as part of a fetching samples.

In the end, the whole process wasn't too bad. Right now haproxy configuration is built as follows:

  • Collect all enabled haproxy actions
  • Extract all patterns and IPblocks declared there, and turn them into ACLs. The name of the ACL is simply the pattern or ipblock with slashes turned into underscores.
  • For patterns, there's the additional complication that one pattern can correspond to multiple ACLs in haproxy. In that case, we expand one pattern into multiple rules like acl myrule_1 ... acl myrule_2. In order to remember later that we've expanded the ACL, we register all ACLs in a registry.
  • Once all these ACLs are declared, convert the names in the action's expression to the haproxy form, including transforming multi-acl patterns ( myrule -> ( myrule_1 AND myrule_2 AND... ))
  • We convert AND, OR and NOT into &, | and ~ respectively so that we can have sympy parse the expression like a boolean expression.
  • We transform the logical expression to its disjunctive normal form (DNF) so for instance A & (B | (C & ~D)) becomes (A & B) | (A & C & ~D) which is the only format in which you can express conditions in haproxy
  • We finally transform the expression to the haproxy syntax

While I understand it's quite a bit of work, it's generally not terrible as I expected.

Mentioned in SAL (#wikimedia-operations) [2024-08-06T09:41:27Z] <joe> upgrading conftool to 3.2.1 everywhere T369606