Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider following HTTP/2 semantics for env. #1878

Closed
ioquatix opened this issue Apr 26, 2022 · 7 comments
Closed

Consider following HTTP/2 semantics for env. #1878

ioquatix opened this issue Apr 26, 2022 · 7 comments

Comments

@ioquatix
Copy link
Member

ioquatix commented Apr 26, 2022

As discussed here: #1841 there might be some advantages to following a more modern standard for env keys/structure. I'm personally in favour of the header format outlined in RFC7540 and specifically section 8.1.

As outlined, a env that:

  • Uses raw HTTP headers as keys.
  • Uses pseudo HTTP headers for internal metadata (e.g. :rack.input, :rack.errors, :rack.hijack etc).

An example:

{
  ':rack.input' => ...,
  ':method' => 'GET',
  ':scheme' => 'http',
  ':authority' => 'example.com',
  ':path' => '/foo/bar',
  'accept' => '*/*'
}

Just for reference, the Rack env currently follows the model outlined by https://www.rfc-editor.org/rfc/rfc3875.html (4.1.18 specifically deals with headers).

Performance differences between any decent structure of env will mostly be negligible in real world applications. We don't want to introduce any regressions here.

The main disadvantage of any top level change to env as discussed is compatibility and bifurcating the spec.

I think if we were to implement Rack today, it's likely we'd adopt such a model. Such an approach has advantages over the current CGI style env including:

  • Consistent format for both request and response headers. This personally appeals to me a lot, simply because I think it reduces the chance of error and cognitive burden when dealing with requests and responses.
  • Easy to proxy requests - the env contains exactly the required headers.
  • Avoids lossy transform of header keys - right now we potentially have issues with incoming headers either being dropped or mapped to the same key.
  • Less snowflake/bespoke representation of HTTP semantics - while env does follow the CGI RFC, it's over 18 years old and definitely not considered to be a model for new implementations - can we modernize our approach?

There is a auxiliary issue which I've noticed, which is that Rack::Request get_header, set_header and so on are all semantically broken - they don't use env['HTTP_' key'] but rather env[key] which seems completely wrong and unexpected. This is poorly understood as shown by this great example: https://github.com/rack/rack/blame/ed308ff555ae0aeff465044be92bc6bcbc3ddaaa/lib/rack/request.rb#L120-L137. This is in contrast to the same interface on Rack::Response which adds to the confusion. The proposed change would potentially fix the issue because env would contain headers directly.

There are several potential implementation options:

  1. Complete break between, say, Rack 3 and Rack 4. Compatibility is hard problem but in theory is manageable.
  2. Compatibility Middleware / Feature Flags. Rack::Builder#options would give servers and applications enough flexibility to communicate what approach is required. The design trade off is that middleware would need to deal with different shapes of env but this would also be the most compatible. In theory, Rack::Request should deal with the internal shape of env, practically speaking right now it's not (see Rack::Request#get_header and related issues).
@ioquatix
Copy link
Member Author

@tenderlove
Copy link
Member

It makes sense to me and I think we should do it. Adopting h2 headers seems like the best path forward.

I personally like option 2. I like the idea of having the request object deal with the different env shapes for the user, but one thing that bothers me about that solution (and I believe is a problem with Rack in general) is that every middleware is then required to "cast" the env hash to a request object, then "cast" it back to a hash when it passes it on to the next middleware.

It would be really great if we could just pass a request object all the way down the stack, then each middleware doesn't need to cast. If we're going to dream big, maybe we should make the request object quack enough like a hash that it works for legacy middleware, but newer middleware can take advantage of the request object itself. I will open an issue for this with more detailed thoughts. But yes, I think this ticket is a good idea.

@jeremyevans
Copy link
Contributor

I'm against this and don't think the minor benefits are worth the backwards compatibility cost. Completely breaking backwards compatibility at this stage in rack's lifecycle makes is a very bad trade off, IMO. That goes for both h2 style env as well as replacing env with a request object.

If I had to choose between the two, I think it makes more sense to replace env with a request object as opposed to h2 style env, as they have the same backwards compatibility cost, but the benefits of a request object are larger. So if the eventual goal is a request object, then we should skip h2 style env and jump straight to that.

@ioquatix
Copy link
Member Author

ioquatix commented Apr 27, 2022

I've made a PR to try and address the first part of this, Rack::Request#*_headers are an interface for dealing with actual HTTP header names: #1880

On detailed review of the Rails code base, I noted some frustration here: https://github.com/rails/rails/blob/f95c0b7e96eb36bc3efc0c5beffbb9e84ea664e4/actionpack/lib/action_dispatch/http/headers.rb#L5-L23.

As discussed in the previous issue, #1841 (comment) and #1841 (comment) both reflected on this pain point and expressed a desire to have access to the underlying headers cc @ianks @nateberkopec

@jeremyevans is concerned about compatibility but I think we need to establish that their is a fair middle ground for evolution of Rack given that the contributors in these discussions practically speaking probably represent over 90% of all current real-world usage of Rack. Not all forward progress is going to be backward compatible and there is a desire for forward progress. In these cases, the best we can hope for, IMHO, is feature flags, middleware wrappers, etc.

I'm not personally interested in rolling this change into Rack 3.0 - I think later in the 3.x life or maybe even 4.x makes sense. That being said, I'd like to get a consensus on what this looks like going forward.

@ioquatix
Copy link
Member Author

Unfortunately the current design leads people into code like this:

    # Returns the authorization header regardless of whether it was specified directly or through one of the
    # proxy alternatives.
    def authorization
      get_header("HTTP_AUTHORIZATION")   ||
      get_header("X-HTTP_AUTHORIZATION") ||
      get_header("X_HTTP_AUTHORIZATION") ||
      get_header("REDIRECT_X_HTTP_AUTHORIZATION")
    end

@jeremyevans
Copy link
Contributor

I'm still strongly against this proposal due to the backwards compatibility issues.

@ioquatix
Copy link
Member Author

Okay, for now, let's close it and focus our energy on #1879

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants