Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

404 for new Github repos with main branch instead of master if only repo URL is provided #962

Open
willirath opened this issue Oct 23, 2020 · 8 comments

Comments

@willirath
Copy link

Putting, e.g., https://github.com/willirath/example_repo_with_main_branch into https://nbviewer.jupyter.org landing page leads to https://nbviewer.jupyter.org/github/willirath/example_repo_with_main_branch/tree/master/ which reports 404 as the correct URL would be https://nbviewer.jupyter.org/github/willirath/example_repo_with_main_branch/tree/main/.

@willirath willirath changed the title 404 for new Github repos with main branch instead of master only the repo URL is provided 404 for new Github repos with main branch instead of master if only repo URL is provided Oct 23, 2020
@shreddd
Copy link
Contributor

shreddd commented Oct 28, 2020

Seeing this issue as well

Looks like the github api supports a way to query the "default_branch"
See: https://docs.github.com/en/free-pro-team@latest/rest/reference/repos#get-a-repository

this shows up in a couple of different places in the nbviewer code so it might be worth discussing what the right approach is. cc: @krinsman

@krinsman
Copy link
Collaborator

To clarify, is main the only branch of the repo? And the repos have no master branch? @shreddd @willirath

Are there standard naming conventions for principal branches of repositories which don't have master branches?

I'm sure it would be possible to put in some re-direct logic manually:

https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/github/handlers.py#L144

https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/github/handlers.py#L521

(GitHub enterprise) https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/github/handlers.py#L542

i.e. some try/except clause redirecting to main instead of master if master doesn't exist.

My concern is that this would complicate the code and/or make it difficult to maintain if there are no standardized conventions for the name of the principal branch of repository which does not have a master branch.

E.g. Is this for Mercurial repositories using hg-git or something like that? https://github.com/stephenmcd/hg-github

@shreddd
Copy link
Contributor

shreddd commented Oct 30, 2020

So I don't think you want to assume the name at all. Ideally it should look up the "default_branch" field via the github API and use that to generate the redirect.

eg.

curl  -H "Accept: application/vnd.github.v3 json"  https://api.github.com/repos/octocat/hello-world

{
  "id": 1296269,
  "node_id": "MDEwOlJlcG9zaXRvcnkxMjk2MjY5",
  "name": "Hello-World",
  "full_name": "octocat/Hello-World",
  ...
   "default_branch": "master",
  ...
}

This way you don't have to make assumption about the default branch name. In other words, before each of these calls, just check the API for the right default_branch name and plug that in.

@krinsman
Copy link
Collaborator

krinsman commented Nov 2, 2020

Oh OK yeah that makes sense, I didn't know that was a feature/possibility. I agree with you, that definitely sounds like the way to do it.

This might use up some additional GitHub API requests (of the 60 per hour available to those using NBViewer without GitHub authentication set up), but I would guess that's probably acceptable for anyone who wants to view a repo like this.

https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/github/client.py#L106

^ Using the github_api_request method of the AsyncGitHubClient class, and the API call you gave above @shreddd , I imagine it shouldn't be too difficult (admittedly famous last words) to create a new method for the AsyncGitHubClient class (get_default_branch or something like that). E.g. see any of the get_foo methods of that class for a template/example

https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/github/client.py#L120

And then making a PR which uses this new AsyncGitHubClient method, and a try/except clause or something similar for handling a 404 error,
e.g. maybe except web.HTTPError(404):, compare https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/github/handlers.py#L409

and then modify the GitHub provider to use this new method of the GitHub client to get the default branch when necessary? or maybe even to always get the default branch (even though it's usually master), to have a guarantee of always avoiding this kind of 404 error? (e.g. maybe modify GitHubRepoHandler to always call get_default_branch and then define the redirect URL using that? https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/github/handlers.py#L144)

e.g. compare these lines of code
https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/github/handlers.py#L122
https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/github/handlers.py#L204
https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/github/handlers.py#L344

I probably don't have time to implement such a PR myself, especially since I forgot how to set up a testing setup for modified versions of the NBViewer source code. Although maybe it wouldn't be too hard for me to try to follow the directions I wrote for myself here? https://github.com/krinsman/examples/tree/master/docker_examples/jupyterLabhubImages/nbviewer I'm not sure.

In any case I hope the above information is enough to make writing a PR for someone else to be much easier/quicker? (Compared to if they had to find the most likely relevant sections of code themselves?)

If you have any questions let me know, and I can review any submitted PR and maybe also merge it if we can verify it doesn't break anything (which it probably wouldn't).

@shreddd
Copy link
Contributor

shreddd commented Nov 5, 2020

OK - see PR #968 for an initial fix. This is simpler than what you suggested but it seems to work for me.

I also couldn't figure out whether uri_rewrites ever gets called for the user/repo case - I don't think it is, in which case you might drop that rule.

@shreddd
Copy link
Contributor

shreddd commented Nov 16, 2020

@krinsman - let me know if there are updates here

@erwanp
Copy link

erwanp commented Jan 2, 2021

I seem to be having the same problem on my repo where main is the main branch. Was there any progress ?

@thomas-bc
Copy link

Seems this was fixed 2993597
And https://nbviewer.org is redirecting correctly with the OP's repo. This issue can probably be closed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants