Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add mechanism to route based on the escaped path #184

Merged
merged 3 commits into from
Aug 24, 2016

Conversation

kushmansingh
Copy link
Contributor

No description provided.

@kushmansingh
Copy link
Contributor Author

The new test case fails on < go1.5. Please advise on how to skip this case for those versions.

@kisielk
Copy link
Contributor

kisielk commented Aug 16, 2016

I think you need to have a separate getPath for versions < go1.5

@elithrar
Copy link
Contributor

Correct. You need to put it in its own file with a build tag so that it
only builds on go 1.5.

sometimes it can make sense to copy std lib code in to backport the
behavior, with the risk that you don't benefit from upstream fixes. This
can be OK for small functions/logic.
On Tue, Aug 16, 2016 at 11:42 AM Kamil Kisiel [email protected]
wrote:

I think you need to have a separate getPath for versions < go1.5


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#184 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AABIcJBLG3CuswYDFqF2FHZ6-Io5VukXks5qggRwgaJpZM4Jlh1O
.

@kushmansingh
Copy link
Contributor Author

kushmansingh commented Aug 17, 2016

I actually found the way recommended to do it in < 1.5 as mentioned here: https://golang.org/pkg/net/url/#URL

I'm essentially doing what the new URL.EscapedPath() function does.

Going to now see if I can replace the regex matching with simple string slicing.

@kisielk
Copy link
Contributor

kisielk commented Aug 17, 2016

LGTM. @elithrar ?

// as detailed here as detailed in https://golang.org/pkg/net/url/#URL
// for < 1.5 server side workaround
// http://localhost/path/here?v=1 -> /path/here
re := regexp.MustCompile(req.URL.Scheme `://` req.URL.Host)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could a panic be triggered here? punycode and RFC compliant hosts should not (I have not tested exhaustively...), but worried about adding MustCompile to the request critical path.

@elithrar
Copy link
Contributor

One reasonably key comment, and the follow-up question: could this be done in middleware prior to mux handling the request?

@elithrar elithrar self-assigned this Aug 17, 2016
@kushmansingh
Copy link
Contributor Author

I realized the regex was unnecessary and would probably have large performance impacts, the parsing is cleaner and tighter now with just some simple indexing. As for the middleware, are you suggesting we forcibly assign req.URL.Path the escaped path we extracted? I think that would be unwise as it would break what is expected in req.URL.Path as specified by the stdlib.

@elithrar
Copy link
Contributor

Running some benchmarks on this (just for documentation) and running some further testing. I was checked out into this branch and it broke an API proxy I had running for a project, so want to replicate and ensure we are testing more thoroughly.

("blocked")

// http://localhost/path/here?v=1 -> /path/here
iStart := len(req.URL.Scheme `://` req.URL.Host)
path := req.RequestURI[iStart:]
if i := strings.LastIndex(path, "?"); i > -1 {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This handles URLs in the form scheme://host/path?query and scheme://host/path?query#fragment, but not scheme://host/path#fragment.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have the raw query string and fragment can we just get the length of those and strip that off via a slice as well?

@kushmansingh
Copy link
Contributor Author

@elithrar Can you provide a test case/situation that's causing the breakage?

@elithrar
Copy link
Contributor

@kushmansingh Will try to get something to you tonight - it's easy to replicate but want to dig into why first.

@elithrar
Copy link
Contributor

elithrar commented Aug 23, 2016

OK - with the tip of this branch installed: run https://gist.github.com/elithrar/51399e558e517bade6c29fed7ee72dbd (a minimal version of my API proxy with all of the CORS/RW wrapper pulled out for simplicity).

mux is panicking when parsing the URL. The master branch does not suffer this problem.

2016/08/23 05:43:52 listening on host 127.0.0.1 - port 8000
2016/08/23 05:43:54 method="GET" req="/"
2016/08/23 05:43:54 http: panic serving 127.0.0.1:55787: runtime error: slice bounds out of range

Stack trace:

github.com/gorilla/mux.getPath(0xc4200d21e0, 0xc42003fbf8, 0x854ed)
        /Users/matt/.go/src/github.com/gorilla/mux/mux.go:372  0x18f
github.com/gorilla/mux.(*Router).ServeHTTP(0xc420012780, 0x3b0e80, 0xc4200f2000, 0xc4200d21e0)
        /Users/matt/.go/src/github.com/gorilla/mux/mux.go:80  0x21a
main.testlog.func1(0x3b0e80, 0xc4200f2000, 0xc4200d21e0)

github.com/gorilla/mux.getPath(0xc420114000, 0xc42003fbf8, 0x854ed)
        /Users/matt/.go/src/github.com/gorilla/mux/mux.go:372  0x18f
github.com/gorilla/mux.(*Router).ServeHTTP(0xc420012780, 0x3b0e80, 0xc4200f20d0, 0xc420114000)
        /Users/matt/.go/src/github.com/gorilla/mux/mux.go:80  0x21a

mux.go:372 (mux.go:80 just calls getPath)

path := req.RequestURI[iStart:]

@elithrar
Copy link
Contributor

@kushmansingh see above (just in case GitHub chooses not to send you the notification).

@kushmansingh
Copy link
Contributor Author

@elithrar Added a test case to replicate the condition, everything should be good now hopefully 😄

@elithrar
Copy link
Contributor

Thanks. LGTM to me and the updated branch no longer panics.

  • @kisielk - if you agree, merge at will!

@kisielk
Copy link
Contributor

kisielk commented Aug 24, 2016

I'm in Africa for the next 3 weeks so won't be reviewing code. If it looks
good, just go ahead.
On Tue, Aug 23, 2016 at 22:30 Matt Silverlock [email protected]
wrote:

Assigned #184 #184 to @kisielk
https://github.com/kisielk.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#184 (comment), or mute the
thread
https://github.com/notifications/unsubscribe-auth/AADk-lBig2wGntLOmNIwMq4Kmdu_jkU-ks5qi0pUgaJpZM4Jlh1O
.

@elithrar elithrar merged commit 674ef1c into gorilla:master Aug 24, 2016
@dmitshur
Copy link
Contributor

dmitshur commented Aug 28, 2016

Some of my code stopped working after this change. I came to this PR to understand the change better and to figure out if the problem was in my code, or in this change.

In general, I don't mind breaking changes in APIs, as long as they are done intentionally and lead to an improvement, so it's not a problem. But I wish this PR was documented better - it was hard to figure out why it was done and what it attempted to accomplish.

After looking around, I think I've figured out what happened, so I'll offer some suggestions for improvement (for next time) and explain for posterity in case others run into the same issue as I did:

  • The title is "Add mechanism to route based on the escaped path".
    • That's somewhat misleading. It doesn't add a new optional mechanism, it changes the actual routing to be based on the escaped path.
  • The PR description is "No description provided."

The reason my code stopped working (all routes stopped matching) is because I was modifying the req.URL.Path (stripping off a prefix) in my handler before passing it off to a gorilla/mux router (I'm still not quite sure if that's a valid use of this library or I shouldn't really be doing that). However, after this change, it's not enough to only modify req.URL.Path, one must also modify req.RequestURI (or simply clear it so that it's not used by getPath in this PR).

I'll note that I'm thankful to the developers/maintainers of this library for doing a great job overall, and to the PR creator for working on resolving an issue. This comment is simply to provide what I hope is helpful additional information and suggestions for improvement for next time. I'm not trying to be negative. Thanks!

dmitshur added a commit to shurcooL/home that referenced this pull request Aug 28, 2016
This is needed after change in gorilla/mux#184
started using req.RequestURI rather than req.URL.Path to determine path
for route matching.

Clearing req.RequestURI is a simple fix that works, but maybe I'll want
a more sophisticated fix later that modifies req.RequestURI to match
req.URI.Path modification. Or look into doing routing at URL path-level
rather than HTTP request-level.
@elithrar
Copy link
Contributor

I appreciate the feedback @shurcooL - this definitely was not intended as a breaking change, but (as you detailed) ended up being one due to oversight.

As an aside: mux has a lot of users, and an API that we are trying to keep useful-yet-flexible. The Issues and PR tabs are littered with rejections of features, but perhaps not as many rejections as I'd like (given 20/20 hindsight!). That's not unique for an OSS project though!

@neelance
Copy link
Contributor

As @shurcooL already correctly described, this change assumes that req.RequestURI is always in sync with req.URL.Path. This assumption does not even hold in Go's standard library, for example https://golang.org/pkg/net/http/#StripPrefix modifies req.URL.Path, but not req.RequestURI. Also, modifying the req.RequestURI field does not really fit to its documentation: "RequestURI is the unmodified Request-URI of the Request-Line (RFC 2616, Section 5.1) as sent by the client to a server."

This change probably breaks a lot of exiting uses of gorilla/mux and imho reverting this change should be considered until there is a better solution.

@elithrar
Copy link
Contributor

elithrar commented Aug 29, 2016

@neelance Would you be able to submit a PR that reverts this? (it's not a one-click revert). I won't have time until tonight or tomorrow (at the earliest).

(or, at the least, open an issue with your comment above)

@kushmansingh
Copy link
Contributor Author

Reversion is probably the fastest solution if this is breaking for a lot of users. Here are some options for how to get this behavior functioning:

  1. The easiest; drop support for anything below 1.5. This is not immediately viable but I think there is merit in encouraging users to upgrade since the standard library has changed a significant amount in net/http
  2. Provide an optional middleware which must be used as first in the chain. This middleware will parse RequestURI and place the escaped path into req.URL.Path. This breaks the assumption provided by the standard lib about req.URL.Path being unescaped already but would allow other middleware down the line to still function and route.
  3. Provide a flag on the basis of which either the parsed RequestURI will be used or req.URL.Path. If this flag was on, then other middleware would break the way that @shurcooL and @neelance are currently experiencing but it would be a very explicit "TURN THIS ON AND STUFF MIGHT BREAK" flag.

I'm open to any other solutions and comments and I apologize for the inconvenience caused by this breakage.

@calebcase
Copy link

It appears the URL.Path is broken for any use as a routing path. Even in newer version of Go it contains an 'unescaped' path, but they don't take any precautions to only unescape valid characters in a segment. The result is a path that can contain '/' where it should NOT exist.

https://play.golang.org/p/j41wEQeMM-

package main

import (
    "fmt"
    "net/url"
)

func main() {
    raw := "/foo/bar/baz"
    u, _ := url.Parse(raw)

    // Default behaviours:
    fmt.Printf("[default] Path: %s\nRawPath: %s\n\n", u.Path, u.RawPath)
    fmt.Printf("[default] EscapedPath: %s\n\n", u.EscapedPath())

    // What happens if Path is manually overwritten:    
    u.Path = "/fun/times/baz"
    fmt.Printf("[path   ] EscapedPath: %s\n\n", u.EscapedPath())

    // Must overwrite both to get the right behavior (including creating Path with the invalid unencoded value):
    u.Path = "/fun/times/baz"
    u.RawPath = "/fun/times/baz"
    fmt.Printf("[both   ] EscapedPath: %s\n", u.EscapedPath())
}

Result:

[default] Path: /foo/bar/baz
RawPath: /foo/bar/baz

[default] EscapedPath: /foo/bar/baz

[path   ] EscapedPath: /fun%2Ftimes/baz

[both   ] EscapedPath: /fun/times/baz

The documentation examples for ServeMux imply that request.URL.Path is a public interface suitable for overriding (e.g. StripPrefix ). It probably shouldn't since this won't result in proper behavior.

Gorilla Mux should be clear in the docs about what value is used to route on so users can use that as part of the public API. The tricky bit here is that Path and RawPath must both be in sync or EscapedPath() will not work as expected (see above example). It should also be made clear if the paths given to routers need to be URL encoded themselves.

@neelance
Copy link
Contributor

@calebcase The documentation of URL.Path recognizes this fact (https://golang.org/pkg/net/url/#URL):

Note that the Path field is stored in decoded form: /Go/ becomes /Go/. A consequence is that it is impossible to tell which slashes in the Path were slashes in the raw URL and which were /. This distinction is rarely important, but when it is, code must not use Path directly.

So I wouldn't say that "URL.Path is broken for any use as a routing path". They chose to ignore this edge case for simplicity. If your application depends on this edge case, then unfortunately you have to roll your own implementation based on URL.RawPath or Request.RequestURI instead of relying on the standard library's tools. I think the same applies to gorilla/mux.

@elithrar
Copy link
Contributor

The easiest; drop support for anything below 1.5. This is not immediately viable but I think there is merit in encouraging users to upgrade since the standard library has changed a significant amount in net/http

Possible, but not now. gorilla/mux has been around for a long while and Ubuntu/Debian, et. al, still package < Go 1.5 (horrible, but reality).

Provide an optional middleware which must be used as first in the chain. This middleware will parse RequestURI and place the escaped path into req.URL.Path. This breaks the assumption provided by the standard lib about req.URL.Path being unescaped already but would allow other middleware down the line to still function and route.

This would likely live outside of mux, since mux doesn't have an API for middleware. Alternatively, it would work like StrictPath as a method on *Router. This is probably my preferred approach: opt-in, with docs around what this might break. It does add to the API however.

Provide a flag on the basis of which either the parsed RequestURI will be used or req.URL.Path. If this flag was on, then other middleware would break the way that @shurcooL and @neelance are currently experiencing but it would be a very explicit "TURN THIS ON AND STUFF MIGHT BREAK" flag.

As above (so yes, I agree).

@dmitshur
Copy link
Contributor

dmitshur commented Aug 29, 2016

Here's an idea, but I'm unsure if it's viable or not.

If you look at how https://godoc.org/net/url#URL.EscapedPath works:

// EscapedPath returns the escaped form of u.Path.
// In general there are multiple possible escaped forms of any path.
// EscapedPath returns u.RawPath when it is a valid escaping of u.Path.
// Otherwise EscapedPath ignores u.RawPath and computes an escaped
// form on its own.

Can we do something similar and detect if req.URL.Path is a valid/possible encoding of req.RequestURI, and only use req.RequestURI in that case? Otherwise defer to req.URL.Path?

@elithrar
Copy link
Contributor

My preference is to be explicit rather than implicit ('clever').

@kushmansingh Are you able to submit a PR that (my preference) adds a UseEscapedPath method on *Router?

@kushmansingh
Copy link
Contributor Author

@elithrar I'll be working on it and should have one up in the next day or two. I'll post updates if anything changes.

@elithrar
Copy link
Contributor

See #190 - feedback welcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants