Umbrella task to track the development of haproxykafka, a tool meant to replace Benthos (currently only in the testing phase in ulsfo) for parsing HAProxy logs received over a socket and sending them to a specific Kafka topic
Description
Details
Status | Subtype | Assigned | Task
---|---|---|---
In Progress | | Fabfur | T351117 Move analytics log from Varnish to HAProxy
Open | | Fabfur | T370668 New software: haproxykafka
Resolved | | Fabfur | T372338 haproxykafka: feature: Add prometheus metrics
Declined | | Fabfur | T372341 haproxykafka: feature: Read from TCP UDS socket
Resolved | | Fabfur | T372339 haproxykafka: feature: add ability to add/remove/modify fields
Resolved | | Fabfur | T372342 haproxykafka: feature: Configuration file
Open | | Fabfur | T372344 haproxykafka: feature: Ability to print structured messages to stdout
Open | | Fabfur | T374128 haproxykafka features
Open | | Fabfur | T374473 Prepare puppet configuration to send haproxy logs to haproxykafka socket
Open | | Fabfur | T374696 Enable prometheus metrics scraping for haproxykafka
Event Timeline
I might be out of my league here, but have y'all considered the HAProxy Stream Processing Offload Engine (SPOE)?
It looks like a built-in way to hook into HAProxy requests and do whatever you need. This might be more efficient than listening on a UDP socket for syslog-formatted messages. There appear to be SPOE agent libraries in C. Maybe you could even reuse some varnishkafka code to bridge between the SPOE protocol and librdkafka?
https://github.com/negasus/haproxy-spoe-go could be handy if we go down that road and we don't want to get dirty writing C code :)
Hi Andrew, this is definitely worth considering, but I would need serious help with C.
ATM we decided to go down the road of rewriting atskafka, keeping its simplicity, and adapting it to HAProxy with a structured log format (RFC5424).
In the meantime I'll look into SPOE... thanks!
I would need serious help with C.
Ya, me too! Perhaps the SPOE Go lib @Vgutierrez mentioned might be easier?
ATM we decided to go down the road of rewriting atskafka, keeping its simplicity, and adapting it to HAProxy with a structured log format (RFC5424).
If it works, then it works! I remember that when we hired Magnus Edenhill to write varnishkafka, he did a lot of very specific performance- and memory-focused tuning to make sure stuff would work well. (And, um, I wonder if we could contract him again? :D Probably not; he's more famous now.)
In either case, we'll be using librdkafka underneath. Perhaps RFC5424 format vs SPOP won't really make much of a performance difference.
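For anyone following along who hasn't looked at the RFC5424 framing discussed above, here is a minimal Python sketch of splitting one such syslog line into its header fields, structured data, and message body. This is only an illustration and not haproxykafka's actual code: the function and class names, the sample JSON payload, and its `http_status` field are all hypothetical, and it assumes HAProxy writes exactly one message per datagram.

```python
import re
from dataclasses import dataclass

# RFC 5424 header: <PRI>VERSION SP TIMESTAMP SP HOSTNAME SP APP-NAME SP PROCID SP MSGID SP
_HEADER = re.compile(
    r"<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app_name>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) "
)


@dataclass
class SyslogMessage:
    facility: int
    severity: int
    timestamp: str
    hostname: str
    app_name: str
    procid: str
    msgid: str
    structured_data: str
    msg: str


def _split_structured_data(rest):
    """Split 'STRUCTURED-DATA [SP MSG]' into (structured_data, msg)."""
    if rest == "-" or rest.startswith("- "):
        return "-", rest[2:]
    if not rest.startswith("["):
        raise ValueError("malformed STRUCTURED-DATA")
    i = 0
    while i < len(rest) and rest[i] == "[":
        i += 1
        while i < len(rest) and rest[i] != "]":
            # A backslash escapes the next character inside a PARAM-VALUE.
            i += 2 if rest[i] == "\\" else 1
        i += 1  # step over the closing ']'
    return rest[:i], rest[i + 1:]


def parse_rfc5424(line):
    """Parse a single RFC 5424 line (hypothetical helper, not project code)."""
    m = _HEADER.match(line)
    if m is None:
        raise ValueError("not an RFC 5424 message")
    pri = int(m["pri"])
    sd, msg = _split_structured_data(line[m.end():])
    return SyslogMessage(
        facility=pri >> 3,  # PRI = facility * 8 + severity
        severity=pri & 7,
        timestamp=m["timestamp"],
        hostname=m["hostname"],
        app_name=m["app_name"],
        procid=m["procid"],
        msgid=m["msgid"],
        structured_data=sd,
        msg=msg,
    )
```

The `msg` part is where a JSON-formatted HAProxy log line would end up, ready to be handed to a Kafka producer. An SPOE-based approach would skip this text parsing entirely, which is the trade-off being discussed.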
Change #1072158 had a related patch set uploaded (by Fabfur; author: Fabfur):
[operations/puppet@production] Fixed the haproxykafka uds path to reflect test configuration
Change #1072158 merged by Fabfur:
[operations/puppet@production] cache:haproxy: fixed the haproxykafka uds path
Change #1072172 had a related patch set uploaded (by Fabfur; author: Fabfur):
[operations/puppet@production] hiera: disabling haproxy logging to socket (haproxykafka)
Change #1072172 merged by Fabfur:
[operations/puppet@production] hiera: disabling haproxy logging to socket (haproxykafka)
Mentioned in SAL (#wikimedia-operations) [2024-09-11T14:14:25Z] <fabfur> reverted 1072172 and repooling cp4037 (T370668)
Change #1072484 had a related patch set uploaded (by Fabfur; author: Fabfur):
[operations/puppet@production] hiera: continue haproxykafka tests on cp4037
Change #1072484 merged by Fabfur:
[operations/puppet@production] hiera: continue haproxykafka tests on cp4037
Change #1072577 had a related patch set uploaded (by Fabfur; author: Fabfur):
[operations/puppet@production] cache:haproxy: hardcode $schema field
Change #1072577 merged by Fabfur:
[operations/puppet@production] cache:haproxy: hardcode $schema field
Change #1074414 had a related patch set uploaded (by Fabfur; author: Fabfur):
[operations/puppet@production] Renamed log field for pipeline migration (haproxykafka)