Pip vs Package Manager for handling Python Packages
The biggest disadvantage I see with using pip to install Python modules on your system, either as system modules or as user modules, is that your distribution’s package management system won’t know about them. This means that they won’t be used for any other package which needs them, and which you may want to install in the future (or which might start using one of those modules following an upgrade); you’ll then end up with both pip- and distribution-managed versions of the modules, which can cause issues (I ran into yet another instance of this recently). So your question ends up being an all-or-nothing proposition: if you only use pip for Python modules, you can no longer use your distribution’s package manager for anything which wants to use a Python module...
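To get a feel for how mixed an existing setup already is, something like the following can help. It is only a rough sketch: it relies on the optional INSTALLER file in each package's metadata directory, which pip fills in with "pip", while distribution packages may omit it or write their own value. The function name `group_by_installer` and the "unknown" label are made up for this example.

```python
from collections import defaultdict
from importlib import metadata


def group_by_installer():
    """Map installer name -> sorted list of distribution names.

    Packages without an INSTALLER file are grouped under "unknown"
    (an arbitrary label chosen for this sketch).
    """
    groups = defaultdict(set)
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
        except Exception:
            name = None  # broken or incomplete metadata directory
        if not name:
            continue
        installer = (dist.read_text("INSTALLER") or "").strip() or "unknown"
        groups[installer].add(name)
    return {installer: sorted(names) for installer, names in groups.items()}


if __name__ == "__main__":
    for installer, names in sorted(group_by_installer().items()):
        print(f"{installer}: {len(names)} package(s)")
        for name in names:
            print(f"  {name}")
```

Anything listed under "pip" outside a virtual environment is a module your package manager does not know about.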
The general advice given in the page you linked to is very good: try to use your distribution’s packages as far as possible, only use pip for modules which aren’t packaged, and when you do, do so in your user setup and not system-wide. Use virtual environments as far as possible, in particular for module development. Especially on Arch, you shouldn’t run into issues caused by older modules; even on distributions where that can be a problem, virtual environments deal with it quite readily.
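For what it's worth, a virtual environment can be created from Python itself with the standard-library venv module. The sketch below assumes a reasonably recent Python 3; the helper names `create_env` and `in_virtualenv` are just for illustration, and the demo writes into a temporary directory so it leaves nothing behind.

```python
import sys
import tempfile
import venv
from pathlib import Path


def in_virtualenv():
    """True when the running interpreter belongs to a virtual environment.

    Holds for environments made by venv and by recent virtualenv releases.
    """
    return sys.prefix != sys.base_prefix


def create_env(path, with_pip=True):
    """Create a virtual environment at `path` and return its directory."""
    env_dir = Path(path)
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # with_pip=False keeps the demo fast and avoids needing ensurepip.
        env = create_env(Path(tmp) / "demo-env", with_pip=False)
        print("created:", env)
        print("has pyvenv.cfg:", (env / "pyvenv.cfg").is_file())
    print("currently inside a virtual environment:", in_virtualenv())
```

Modules installed with the environment's own pip stay inside that directory, so they cannot clash with the distribution-managed ones.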
It’s always worth considering that a distribution’s library and module packages are packaged primarily for the use of other packages in the distribution; having them around is a nice side-effect for development using those libraries and modules, but that’s not the primary use-case.
TL;DR
- use pip (+ virtualenv) for stuff (libs, frameworks, maybe dev tools) your projects (that you develop) use
- use the package manager for applications you use (as an end-user)
Development dependencies
If you're developing software in Python, you'll want to use pip for all of the project's dependencies, be they runtime dependencies, build-time dependencies or stuff needed for automated testing and automated compliance checks (linter, style checker, static type checker ...).
There are several reasons for this:
- This allows you to use virtualenv (either directly or through virtualenvwrapper or pipenv or other means) to separate dependencies of different projects from each other and to isolate the Python applications you use "in production" (as a user) from any exotic shenanigans (or even just incompatibilities) that may go on in development.
- This allows you to track all of a project's dependencies in a requirements.txt (if your project is an application) or setup.py (if your project is a library or framework) file. This can be checked into revision control (e.g. Git) together with the source code, so that you always know which version of your code relied on what versions of your dependencies.
- The above enables other developers to collaborate on your project even if they don't use the same Linux distribution or not even the same operating system (if the used dependencies are also available on Mac and Windows or whatever they happen to use, that is).
- You don't want automatic updates of your operating system's package manager to break your code. You should update your dependencies, but you should do so consciously and at times you choose, so that you can be ready to fix your code or roll back the update. (Which is easy if you track the complete dependency declaration in your revision control system, together with your code.)
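As an illustration of the "update consciously" point, here is a small sketch that flags requirements.txt entries without an exact `==` pin. It is deliberately simplistic: the function name, the regular expression and the example file contents are all made up for this answer, and it ignores URLs, environment markers and wildcard versions. Tools like pip-tools do this properly.

```python
import re

# Matches "name==version", optionally with extras, e.g. "requests[socks]==2.31.0".
_PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text):
    """Return the requirement lines in `text` that lack an exact '==' pin."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):
            continue  # skip blanks and option lines such as "-r other.txt"
        if not _PINNED.match(line):
            loose.append(line)
    return loose


# Hypothetical requirements.txt contents, for illustration only.
EXAMPLE = """\
requests==2.31.0
flask>=2.0
pytest
attrs == 23.1.0  # spaces around the operator are tolerated here
"""

if __name__ == "__main__":
    for line in unpinned_requirements(EXAMPLE):
        print(f"not pinned: {line}")
```

With the example contents above, the two entries reported are `flask>=2.0` and `pytest`; those are the ones whose installed version can change underneath you between two checkouts of the same commit.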
If you feel you need to separate direct and indirect dependencies (or distinguish between acceptable version range for a dependency and actual version used, cf. "version pinning") look into pip-tools and/or pipenv. This will also allow you to distinguish between build and test dependencies. (The distinction between build and runtime dependencies can probably be encoded in setup.py.)
Applications you use
For stuff that you use as a normal application and that just happens to be written in Python, prefer your operating system's package manager. It'll make sure it stays reasonably up-to-date and compatible with other stuff installed by the package manager. Most Linux distributions will also assert that they don't distribute any malware.
If something you need isn't available in your distribution's default package repo, you can check out additional package repos (e.g. Launchpad for deb-based distros) or use pip anyway. If the latter, use --user to install into your user's home instead of system-wide, so that you're less likely to break your Python installation. (For stuff you only need temporarily or seldom, you may even use a virtualenv.)
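If you script such installs, a sketch along these lines builds the pip command and only adds --user outside a virtual environment (pip refuses --user inside one, because the user site-packages directory is not visible there). The helper names and the package name `some-package` are placeholders, and by default nothing is actually installed.

```python
import site
import subprocess
import sys


def pip_install_command(package):
    """Build a `python -m pip install` command, adding --user outside a venv."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if sys.prefix == sys.base_prefix:
        # Not in a virtual environment: keep the install out of system paths.
        cmd.append("--user")
    cmd.append(package)
    return cmd


def install(package, dry_run=True):
    """Print the command; run it only when dry_run is False."""
    cmd = pip_install_command(package)
    print("command:", " ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    print("user site-packages:", site.getusersitepackages())
    install("some-package")  # placeholder name; dry run only
```

Calling pip as `python -m pip` makes sure the module lands in the site-packages of the interpreter you are actually running.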
Another reason to go with the package manager is that updates are applied automatically, which is critical for security. Consider that if the beans package Equifax used had been automatically updated via yum-cron-security, the hack might not have happened.
On my personal dev box I use pip; in prod I use packages.