Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg_util leaves broken packages on uninstall #314

Open
jaraco opened this issue May 20, 2017 · 15 comments
Open

pkg_util leaves broken packages on uninstall #314

jaraco opened this issue May 20, 2017 · 15 comments

Comments

@jaraco
Copy link
Member

jaraco commented May 20, 2017

I've just encountered another issue related to pkg_util-based namespace packages on pre-PEP 420 packages. Uninstallation of one package in a namespace breaks the package for all other in that namespace. Consider:

$ virtualenv --python python2.7 env
Running virtualenv with interpreter /usr/local/bin/python2.7
New python executable in /Users/jaraco/env/bin/python2.7
Also creating executable in /Users/jaraco/env/bin/python
Installing setuptools, pip, wheel...done.
$ env/bin/pip install -q backports.unittest_mock backports.datetime_timestamp
$ env/bin/pip uninstall -y backports.datetime_timestamp                  
Uninstalling backports.datetime-timestamp-1.1:
  Successfully uninstalled backports.datetime-timestamp-1.1
$ env/bin/python -c "import backports.unittest_mock"                      
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named backports.unittest_mock

The issue occurs because the uninstallation step causes the removal of the now necessary __init__.py module for the backports package. The issue doesn't occur on late releases of Python 3.3 because the package gets implicitly converted to a PEP 420 namespace package (during the uninstallation step), which in some ways is worse, because the backports package will vacillate between PEP-420 and pkg_util-managed depending on whether the last operation was an install or uninstall.

In some sense, this issue is an issue with pip and the way it installs packages all into the same location, rather than keeping distribution packages somehow separated the way eggs sought to do, but it's also a symptom of using only pkg_util for namespace package management, an issue that was addressed by setuptools' installation of the -nspkg.pth files.

@theacodes
Copy link
Member

That is unfortunate behavior. What should we do here to resolve this?

@jaraco
Copy link
Member Author

jaraco commented May 20, 2017

I think the solution here is pretty simple:

  1. Continue to recommend pkg_util as the Python 2 compatible mechanism for namespace package modules.
  2. Remove the setuptools requirement that requires declared namespaces to use the pkg_resources technique.
  3. Recommend that packages once again declare their namespace packages in setup.py such that setuptools will omit the __init__.py and install a -nspkg.pth file, which aligns the installed packages more closely with the PEP 420 design, but also provides independence of the technique when pip installed.

@jaraco
Copy link
Member Author

jaraco commented May 20, 2017

The downside to the above approach is that we continue to be entrenched in the issues surrounding package discovery (open issue with find_packages plus redundancy between defining packages and namespace_packages).

@jaraco
Copy link
Member Author

jaraco commented May 20, 2017

Another possibility is that we could work on enhancing pip to avoid removing init.py files containing pkg_util declarations unless that's the only file remaining.

@jaraco
Copy link
Member Author

jaraco commented May 20, 2017

Now that I think about it a little bit more, this first suggestion is probably subject to all of the same issues that dissuaded us from using the PKG_resources technique.

@theacodes
Copy link
Member

theacodes commented May 20, 2017 via email

@dstufft
Copy link
Member

dstufft commented May 20, 2017 via email

@theacodes
Copy link
Member

See pypa/pip#4504

@ncoghlan
Copy link
Member

ncoghlan commented May 21, 2017

The distro level solution to this is to recommend that namespaces always have a clear owner (synthesizing one at the distro level if it doesn't already exist upstream), and then the contributors to the namespace all depend on the owning package.

That way contributing packages can be freely installed and uninstalled, while if you uninstall the owning package, that will show up as a missing dependency for all of the contributing packages.

Fixing that advice in the guide would be a matter of changing the following paragraph to recommend the "owning package"/"contributing package that depends on the owning package" split:

Every distribution that uses the namespace package must include an identical init.py. If any distribution does not, it will cause the namespace logic to fail and the other sub-packages will not be importable. Any additional code in init.py will be inaccessible.

And this one to mention that instead of generating an __init__.py file, setuptools will generate <name>-nspkg.pth entries that generate appropriate sys.modules entries at Python startup:

Every distribution that uses the namespace package must include an identical init.py. If any distribution does not, it will cause the namespace logic to fail and the other sub-packages will not be importable. Any additional code in init.py will be inaccessible.

We'd then add a section at the end about "Namespaces without an owning package" that says:

  • regardless of the technical aspects, it's helpful for public namespace packages to have a clear owning package on PyPI that defines and documents the policies around appropriate use of that namespace
  • it's technically fine for native namespace packages to skip having an owning package (since there's no __init__.py file to fight over)
  • it's technically fine for pkg_resources-style namespace packages to skip having an owning package (since they rely on the non-conflicting <name>-nspkg.pth files, rather than __init__.py), but this freedom comes at the cost of eagerly initialising the namespaces at Python startup, rather than on-demand when attempting to import something from the namespace (not a problem for application specific virtual environments, potentially problematic for shared multi-application environments like those in a Linux distribution)
  • pkgutil style namespace packages effectively require having an owning package, as otherwise the __init__.py file will be removed the first time any contributing package is uninstalled

@theacodes
Copy link
Member

theacodes commented May 22, 2017

@ncoghlan thanks for that - I definitely agree it's a good idea to recommend an "owning" package for each namespace that documents which method to use and the policies around it. I'll add that to my work items.

However, I think pkgutil style namespace packages effectively require having an owning package, as otherwise the __init__.py file will be removed the first time any contributing package is uninstalled isn't quite right. The test I wrote for pip over at pypa/pip#4504 seems to indicate that even with an owning package, uninstalling any contributing package will break it.

@theacodes theacodes self-assigned this May 22, 2017
@ncoghlan
Copy link
Member

@jonparrott When there's an owning package that all the contributing packages depend on, then the contributing packages should not need to include the __init__.py file for the namespace. If they do, then you get the undesirable "first uninstall breaks the namespace" behaviour.

That's how the distro level approach works - the owning package is the only one that includes the __init__.py file, and the package build process omits it from all of the other contributing packages.

@dstufft
Copy link
Member

dstufft commented May 23, 2017

Due to hysterical raisins, I think pip uninstalling namespace packages has always been somewhat painful. If you install a project with setuptools (WITHOUT pip), the only record of what files have been installed is top_level.txt, which list the top level packages, so IIRC we just blow away the entire top level if it's listed in top_level.txt.

We probably need to refactor that, and possibly either deprecate/remove uninstalling setuptools/easy_install installed packages OR get it so setuptools drops a record file by default.

@theacodes
Copy link
Member

@ncoghlan ah, I see now. Hmmm. I think this might make things complicated for pip install -e as the user would need the owning package installed for that.

After reading @dstufft's reply, I'm unsure what to do here. I don't know if the guide should change it's recommendations at this point for namespaces, considering "fixing" the "bug" in pip should make recommending owning packages irrelevant. However, suggesting owning packages might be useful in some cases but difficult in others.

A compromise could be to add a new section about owning packages at the end of the guide, so that we can document the benefits and drawbacks and allow users to decide. This sort of thing is totally acceptable for this kind of doc.

@ncoghlan
Copy link
Member

I'm fine with starting with an appended section, while leaving the main recommendations alone for now.

I'm not sure about the pip -e concern, though - editable installs always need their dependencies installed, and in this case, the owning package becomes just another dependency. The one case I'm aware of that does become more complicated is when the contributing package otherwise has no dependencies.

@theacodes
Copy link
Member

I ran into some trouble trying to get owning packages to work for everything. Installing a pkg_resources owning package broke my Python 3 environment in a spectacular fashion:

Failed to import the site module
Traceback (most recent call last):
  File "/Users/jonwayne/workspace/sample-namespace-packages/env/lib/python3.6/site.py", line 703, in <module>
    main()
  File "/Users/jonwayne/workspace/sample-namespace-packages/env/lib/python3.6/site.py", line 683, in main
    paths_in_sys = addsitepackages(paths_in_sys)
  File "/Users/jonwayne/workspace/sample-namespace-packages/env/lib/python3.6/site.py", line 282, in addsitepackages
    addsitedir(sitedir, known_paths)
  File "/Users/jonwayne/workspace/sample-namespace-packages/env/lib/python3.6/site.py", line 204, in addsitedir
    addpackage(sitedir, name, known_paths)
  File "/Users/jonwayne/workspace/sample-namespace-packages/env/lib/python3.6/site.py", line 173, in addpackage
    exec(line)
  File "<string>", line 1, in <module>
  File "<frozen importlib._bootstrap>", line 557, in module_from_spec
AttributeError: 'NoneType' object has no attribute 'loader'

I'm going to put this lower on my priority for now. It seems to me that the issue with the current recommendation (uninstalling) is less severe than the potential drawbacks of owning packages (broken environment). Happy to keep this open until either pip is fixed or something else moves here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants
@dstufft @theacodes @jaraco @ncoghlan and others