
Provide a way to get sampled POST body logs
Open, In Progress, Medium, Public

Description

We have extensive metrics and logging for request URLs and headers, but substantially less for the bodies, which is particularly important for POST requests. POST payloads for API requests are available (sampled and redacted) in api.log, but we're comparatively blind for POSTs to /w/index.php -- they all look the same regardless of action and other parameters.

In certain incident situations, for example DoS attacks where the attack traffic consists of non-API POST requests, a sampled and redacted log of POST data in Logstash would make troubleshooting much easier.

  • /w/index.php
  • rest.php
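For illustration, "sampled and redacted" could look roughly like the sketch below. It is written in Python only to keep it short and is not the MediaWiki implementation; the sample rate, the redaction rule, and the function names are all assumptions made for this example.

```python
import random

# Assumed values for this sketch; neither comes from MediaWiki configuration.
SAMPLE_RATE = 0.01                       # log roughly 1 in 100 POST requests
SENSITIVE_MARKERS = ("password", "token")  # redact any key containing these


def redact(params):
    """Return a copy of params with sensitive-looking values replaced."""
    redacted = {}
    for key, value in params.items():
        if any(marker in key.lower() for marker in SENSITIVE_MARKERS):
            redacted[key] = "[redacted]"
        else:
            redacted[key] = value
    return redacted


def maybe_log_post(path, params, rng=random.random):
    """Return a log record for a sampled request, or None if not sampled.

    rng is injectable so the sampling decision can be made deterministic.
    """
    if rng() >= SAMPLE_RATE:
        return None
    return {"path": path, "params": redact(params)}
```

With a record like this, POSTs to /w/index.php would become distinguishable by action and other parameters, while credentials and tokens stay out of the logs.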

Event Timeline

I think it would be good to have that for rest.php as well.

Atieno changed the task status from Open to In Progress. May 23 2024, 3:50 PM

@BCornwall We can see some logs here: https://logstash.wikimedia.org/app/dashboards#/view/59147710-1f9e-11ec-85b7-9d1831ce7631?_g=h@bb5fdf8&_a=h@aca69c3
I have not yet been able to confirm that the API calls logged there include the POST body. Is there somewhere else we can check for logs that contain the POST body?

If/when these go to logstash, they should likely go in http.request.body.content as a string:
https://www.elastic.co/guide/en/ecs/current/ecs-http.html#field-http-request-body-content
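As a rough sketch of what such an event might look like, the Python below puts already-redacted parameters into http.request.body.content as a single URL-encoded string. The function name and the choice of URL-encoding are assumptions for this example; ECS only asks that the field hold the body content as a string.

```python
import json
from urllib.parse import urlencode


def to_ecs_event(method, path, redacted_params):
    """Build an ECS-style dict with the request body stored as one string.

    redacted_params is assumed to have been redacted by the caller.
    """
    body = urlencode(redacted_params)  # serialize the parameters to a string
    return {
        "http": {"request": {"method": method, "body": {"content": body}}},
        "url": {"path": path},
    }


event = to_ecs_event("POST", "/w/index.php", {"action": "submit"})
print(json.dumps(event))
```

Keeping the body as one string avoids creating a new indexed field for every distinct POST parameter name.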

We do have ActionAPI POST params in the mediawiki.api-request stream, which is ingested into the Data Lake (Hive) in the event.mediawiki_api_request table.

Here's the code producing mediawiki.api-request events.

It'd be nice to get other kinds of API requests in the Data Lake too.
T291645 (Produce ECS formatted logstash logs to Event Platform, allowing them to be queried in the WMF Data Lake with SQL) could help.