Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gunicorn ignores the Content-Length header when the connection is half-closed. #3234

Open
kenballus opened this issue Jul 4, 2024 · 8 comments

Comments

@kenballus
Copy link
Contributor

Description of the bug

From RFC 9112:

If a valid Content-Length header field is present without Transfer-Encoding, its decimal value defines the expected message body length in octets. If the sender closes the connection or the recipient times out before the indicated number of octets are received, the recipient MUST consider the message to be incomplete and close the connection.

Gunicorn does not enforce this rule. When it receives a request, and the sender half-closes the connection, Gunicorn responds regardless of whether the request's body has been fully received.

To reproduce

  1. Start a Gunicorn-based HTTP server that echos the message body. (e.g., something like this)
  2. Send it a request with an incomplete message body, followed by half-closing the socket, and observe that it still responds:
printf 'GET / HTTP/1.1\r\nHost: a\r\nContent-Length: 10\r\n\r\nA' | nc localhost 80
HTTP/1.1 200 OK
Server: gunicorn
Date: Thu, 04 Jul 2024 16:21:12 GMT
Connection: keep-alive
Content-type: application/json
Content-Length: 133

{"headers":[["SE9TVA==","YQ=="],["Q09OVEVOVF9MRU5HVEg=","MTA="]],"body":"QQ==","version":"SFRUUC8xLjE=","uri":"Lw==","method":"R0VU"}
  1. Decode the response to see that it accepted the incomplete message body:
printf '{"headers":[["SE9TVA==","YQ=="],["Q09OVEVOVF9MRU5HVEg=","MTA="]],"body":"QQ==","version":"SFRUUC8xLjE=","uri":"Lw==","method":"R0VU"}' \
    | jq '.["body"]' \
    | xargs echo \
    | base64 -d \
    | xxd
00000000: 41                                       A

Gunicorn version: 22.0.0
Python version: 3.11.9

@benoitc
Copy link
Owner

benoitc commented Jul 23, 2024

Body is streamed to the application. There is no reason to pass even incomplete header to the appplication. Otherwise that would force the gateway to buffer the whole body which is properly inneficient. The application beg-hind must take care of an incomplete body as a result. What do I miss there?

@kenballus
Copy link
Contributor Author

kenballus commented Jul 23, 2024

Body is streamed to the application. There is no reason to pass even incomplete header to the appplication.

I don't know what you're trying to say here.

Otherwise that would force the gateway to buffer the whole body which is properly inneficient.

Obviously, the gateway shouldn't buffer the whole request. No one is suggesting that.

The application beg-hind must take care of an incomplete body as a result.

This just doesn't make sense. The application behind must not respond to the request until it is complete. If the connection is half-closed before the request is complete, then the application should close the connection. What it should not do is pretend that the message is complete.

If you're curious, this is what other HTTP servers do:

Close the connection without responding:

  • AIOHTTP
  • H2O
  • HAProxy
  • Hyper
  • Libevent
  • Libsoup
  • Lighttpd
  • Mongoose
  • Netty
  • Nginx
  • LiteSpeed
  • Passenger
  • puma
  • Tornado
  • Twisted
  • OpenWrt uhttpd
  • Unicorn
  • Uvicorn
  • Waitress

Close the connection and respond 400:

  • Apache httpd
  • FastHTTP
  • Go net/http
  • Hypercorn
  • Jetty
  • Node.js
  • ServiceTalk
  • Tomcat
  • WEBrick

@pajod
Copy link
Contributor

pajod commented Jul 25, 2024

One existing test case even has what should be treated as truncated:

Content-Length: 17914\r\n

@benoitc
Copy link
Owner

benoitc commented Jul 26, 2024

Contrary to some other wsgi implementations, the body is fully streamed to the application when it comes. That's per design and reduce the memory footprint. I would expect that the application validate the payload it received via the gateway before handling any action. This includes to check if its partial or not.
Gunicorn has no control in the application once a chunk is given to it. This has never been à issue. Sometimes it is also useful to even handle partial content. Also when the Content length doesn't correspond gunicorn must close the connection.

However can you clarify what you mean by half closed? I don't see how a content can be returned if the socket is closed.

@kenballus
Copy link
Contributor Author

gunicorn will certainly not buffer the body for the application until it's fully read. The body is fully streamed to the application when it comes. That's per design and reduce the memory footprint. I would expect that the application validate the payload it received via the gateway before handling any action. This has never been à issue.

No one is asking for Gunicorn to buffer the body for the application. This is entirely unrelated to what I'm talking about. As usual, it feels like we're talking past each other :)

Also when the Content length doesn't correspond gunicorn must close the connection.

When the Content-Length is too large for the received data, Gunicorn should treat the request as invalid, respond 400, and close the connection. It doesn't currently do this.

However can you clarify what you mean by half closed? I don't see how a content can be returned if the socket is closed.

A TCP connection is half-closed when one side has sent a FIN and the other side hasn't. The relevant section of the TCP RFC.

@benoitc
Copy link
Owner

benoitc commented Jul 26, 2024

When the Content-Length is too large for the received data, Gunicorn should treat the request as invalid, respond 400, and close the connection. It doesn't currently do

what do you mean by "the Content-Length is too large for the received data" ?

I am expectinv that the connection is closed by Gunicorn if it continue to receive some data passed the content length. If it's not the case this is a bug indeed.

Is this what you mean?

@kenballus
Copy link
Contributor Author

I am expectinv that the connection is closed by Gunicorn if it continue to receive some data passed the content length. If it's not the case this is a bug indeed.

I mean the opposite of this. Consider the example request from the first message. The Content-Length value is 10, but the message body is only 1 byte long. This should elicit a 400 response, but it doesn't.

@benoitc
Copy link
Owner

benoitc commented Aug 6, 2024

I don't see how this could work . At that time we already passed some information to the application in the start_response handler 'gunicorn doesn't buffer). The application need to be aware we are closing this. Which is not planned with current expectation. Instead I would expect the application is taking care about what it read, not the gateway witch is only passing the data through to the application.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants