python paths and import order
The other answers describe the order in which the interpreter scans sys.path
correctly. Even so, giving user file paths precedence over packages deployed in
site-packages can fail if the full user path is not available in the PYTHONPATH
variable.
For example, imagine you have the following structure of namespace packages:
/opt/repo_root
- project # this is the base package that brings structure to the namespace hierarchy
- my_pkg
- my_pkg-core
- my_pkg-gui
- my_pkg-helpers
- my_pkg-helpers-time_sync
All of the above packages have the internal structure and metadata needed to be deployable by conda, and all of them are installed. Therefore, I can open a python shell and type:
>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)
/python/interpreter/path/lib/python3.6/site_packages/project/my_pkg/helpers/time_sync/__init__.py
This returns a path in the python interpreter's site-packages subfolder. If I manually add the package to be imported to PYTHONPATH, or even to sys.path, nothing changes.
>>> import os
>>> # os.pathsep is ":" on Unix and ";" on NT; str.join() takes a single list
>>> os.environ['PYTHONPATH'] = os.pathsep.join([os.environ['PYTHONPATH'], "/opt/repo_root/my_pkg-helpers-time_sync"])
>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)
/python/interpreter/path/lib/python3.6/site_packages/project/my_pkg/helpers/time_sync/__init__.py
This still reports that the package has been imported from site-packages. You need to include the whole hierarchy of paths in PYTHONPATH, as if it were a traditional python package, and then it will work as you expect:
>>> import os
>>> # os.pathsep is ":" on Unix and ";" on NT; str.join() takes a single list
>>> os.environ['PYTHONPATH'] = os.pathsep.join([
...     os.environ['PYTHONPATH'],
...     "/opt/repo_root",
...     "/opt/repo_root/project",
...     "/opt/repo_root/project/my_pkg",
...     "/opt/repo_root/project/my_pkg-helpers",
...     "/opt/repo_root/project/my_pkg-helpers-time_sync",
... ])
>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)
/opt/project/my_pkg/helpers/time_sync/__init__.py
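A minimal sketch of the mechanism involved, assuming throwaway folders in a temp directory (root_a, root_b and demo_ns are hypothetical names, not the layout above). It shows that a namespace package is assembled from every sys.path entry that contains a matching folder, and that the order of those entries decides which copy of a submodule wins. sys.path is edited directly here because PYTHONPATH is only read when the interpreter starts.

```python
import importlib
import os
import sys
import tempfile


def make_module(root, rel_path, source):
    """Create root/rel_path with the given source, creating parent folders."""
    full = os.path.join(root, *rel_path.split("/"))
    os.makedirs(os.path.dirname(full), exist_ok=True)
    with open(full, "w") as fh:
        fh.write(source)


base = os.path.realpath(tempfile.mkdtemp())
root_a = os.path.join(base, "root_a")  # stands in for site-packages (assumption)
root_b = os.path.join(base, "root_b")  # stands in for a source checkout (assumption)

# demo_ns/ has no __init__.py in either root, so it is a namespace package
make_module(root_a, "demo_ns/time_sync.py", "WHERE = 'root_a'\n")
make_module(root_a, "demo_ns/core.py", "WHERE = 'root_a'\n")
make_module(root_b, "demo_ns/time_sync.py", "WHERE = 'root_b'\n")

# Put root_b ahead of root_a on the search path
sys.path[:0] = [root_b, root_a]
importlib.invalidate_caches()

import demo_ns
from demo_ns import core, time_sync

portions = list(demo_ns.__path__)  # one folder per contributing sys.path entry
print(portions)
print(time_sync.WHERE)  # the copy from the earlier sys.path entry
print(core.WHERE)       # only exists in root_a, so it is found there
```

Swapping the two roots in the sys.path assignment should flip which time_sync is picked, while core keeps coming from root_a.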
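When importing a module, python first looks in sys.modules, a dictionary that caches the modules that have already been loaded.
If it is not found there, it searches the sys.path list of directories in order. Built-in and frozen modules are handled by their own finders (listed in sys.meta_path) before the directories in sys.path are scanned.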
import sys
import time

print(sys.modules)
print(sys.path)
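The output is a dictionary of loaded modules followed by a list of directories:
{... , ... , .....}
['C:\\Users\\****', 'C:\\****', ....']
The time module is resolved in that order: the sys.modules cache is checked first, and the import system searches further only when the module is not cached yet.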
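As a small sketch of the caching step described above (colorsys is picked only because it is a tiny standard-library module; any module would do), the following shows that a repeated import is answered from sys.modules and that removing the cache entry forces a fresh search:

```python
import importlib
import sys

import colorsys  # first import: located by the import system, then cached

# sys.modules maps module names to module objects
same_object = sys.modules["colorsys"] is colorsys

# A second import statement is answered from the cache
import colorsys as colorsys_again
served_from_cache = colorsys_again is colorsys

# Dropping the cache entry makes the next import search again and build a new object
del sys.modules["colorsys"]
fresh = importlib.import_module("colorsys")
is_new_object = fresh is not colorsys

print(same_object, served_from_cache, is_new_object)
```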
This page is a high Google result for "Python import order", so here's a hopefully clearer explanation:
- https://docs.python.org/library/sys.html#sys.path
- https://docs.python.org/tutorial/modules.html#the-module-search-path
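As both of those pages explain, the import order is:
- Built-in python modules. You can see their names in the variable sys.builtin_module_names (sys.modules, by contrast, is the cache of modules that are already loaded).
- The sys.path entries, in order. The installation-dependent default locations come last in that list.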
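The order above can be inspected from inside the interpreter. This is only a sketch: the finders listed in sys.meta_path vary with the Python version and with whatever tools have installed their own import hooks, and colorsys is just an arbitrary example of a module that lives in a file.

```python
import importlib.util
import sys

# Names compiled into the interpreter itself
builtin_names = sys.builtin_module_names
print("sys" in builtin_names)

# Finders are consulted in this order for a module that is not in sys.modules yet
for finder in sys.meta_path:
    print(getattr(finder, "__name__", type(finder).__name__))

# The spec records where a module comes from
sys_spec = importlib.util.find_spec("sys")
colorsys_spec = importlib.util.find_spec("colorsys")
print(sys_spec.origin)       # a marker string rather than a file for built-ins
print(colorsys_spec.origin)  # a file found through a sys.path entry
```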
And as the sys.path doc page explains, it is populated as follows:
- The first entry is the FULL PATH TO THE DIRECTORY of the file which python was started with. So /someplace/on/disk/> $ python /path/to/the/run.py means the first path is /path/to/the/, and the path would be the same if you're in /path/to/> $ python the/run.py (it is still ALWAYS going to be set to the FULL PATH to the directory, no matter if you gave python a relative or absolute file). It will be an empty string if python was started without a file, aka interactive mode (an empty string means "current working directory for the python process"). In other words, Python assumes that the file you started wants to be able to import package/ folders and blah.py modules that exist within the same location as the file you started python with.
- The next entries are populated from the PYTHONPATH environment variable, if it is set.
- The remaining entries are the installation-dependent defaults. These include the site-packages folders where your third-party python packages are installed by pip (things like requests and numpy and tensorflow).
So, basically: Yes, you can trust that Python will find your local package-folders and module files first, before any globally installed pip stuff.
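A rough way to check this ordering yourself, assuming two throwaway folders (app and extra are hypothetical names): start a child interpreter on a script with PYTHONPATH set, and look at where each entry lands in the child's sys.path.

```python
import json
import os
import subprocess
import sys
import tempfile

base = os.path.realpath(tempfile.mkdtemp())
script_dir = os.path.join(base, "app")   # hypothetical project folder
extra_dir = os.path.join(base, "extra")  # hypothetical PYTHONPATH entry
os.makedirs(script_dir)
os.makedirs(extra_dir)

script = os.path.join(script_dir, "run.py")
with open(script, "w") as fh:
    fh.write("import json, sys\nprint(json.dumps(sys.path))\n")

env = dict(os.environ, PYTHONPATH=extra_dir)
env.pop("PYTHONSAFEPATH", None)  # that setting would suppress the script-directory entry

completed = subprocess.run(
    [sys.executable, script],
    env=env, check=True, capture_output=True, text=True,
)
child_path = json.loads(completed.stdout)

# The script's folder comes first, the PYTHONPATH entry after it
print(child_path.index(script_dir), child_path.index(extra_dir))
print(child_path)
```

Any site-packages folders should show up further down the printed list, after the PYTHONPATH entry.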
Here's an example to explain further:
myproject/            # <-- This is not a package (no __init__.py file).
    modules/          # <-- This is a package (has an __init__.py file).
        __init__.py
        foo.py
    run.py
    second.py

Executed with: python /path/to/the/myproject/run.py
This will cause sys.path[0] to be "/path/to/the/myproject/".

run.py contents:
    import modules.foo as foo  # will import "/path/to/the/myproject/" + "modules/foo.py"
    import second  # will import "/path/to/the/myproject/" + "second.py"

second.py contents:
    import modules.foo as foo  # will import "/path/to/the/myproject/" + "modules/foo.py"
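The layout above can be reproduced in a temp folder to confirm the behaviour. This is a sketch: the file contents are placeholders made up for the test, and the script is deliberately started from a different working directory to show that only the location of run.py matters.

```python
import os
import subprocess
import sys
import tempfile

# Placeholder contents for the files in the example layout
files = {
    "modules/__init__.py": "",
    "modules/foo.py": "NAME = 'foo'\n",
    "second.py": "import modules.foo as foo\n",
    "run.py": (
        "import sys\n"
        "import modules.foo as foo\n"
        "import second\n"
        "print(sys.path[0])\n"
        "print(foo.__file__)\n"
        "print(second.foo is foo)\n"
    ),
}

project = os.path.join(os.path.realpath(tempfile.mkdtemp()), "myproject")
for rel_path, source in files.items():
    full = os.path.join(project, *rel_path.split("/"))
    os.makedirs(os.path.dirname(full), exist_ok=True)
    with open(full, "w") as fh:
        fh.write(source)

env = dict(os.environ)
env.pop("PYTHONSAFEPATH", None)  # that setting would suppress the script-directory entry

result = subprocess.run(
    [sys.executable, os.path.join(project, "run.py")],
    cwd=tempfile.gettempdir(), env=env,
    check=True, capture_output=True, text=True,
)
path0, foo_file, shared = result.stdout.splitlines()
print(path0)     # the myproject folder
print(foo_file)  # modules/foo.py inside that folder
print(shared)    # whether run.py and second.py got the same module object
```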
EDIT:
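You can run the following command to print a sorted list of the modules that are already loaded when the interpreter starts (the keys of sys.modules; the true built-in modules are listed separately in sys.builtin_module_names). Because these are already cached, an import of any of these names is answered before ANY custom files/module folders in your projects are looked at. Basically these are names you must avoid in your own custom files: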
python -c "import sys, json; print(json.dumps(sorted(list(sys.modules.keys())), indent=4))"
List as of Python 3.9.0:
"__main__",
"_abc",
"_bootlocale",
"_codecs",
"_collections",
"_collections_abc",
"_frozen_importlib",
"_frozen_importlib_external",
"_functools",
"_heapq",
"_imp",
"_io",
"_json",
"_locale",
"_operator",
"_signal",
"_sitebuiltins",
"_sre",
"_stat",
"_thread",
"_warnings",
"_weakref",
"abc",
"builtins",
"codecs",
"collections",
"copyreg",
"encodings",
"encodings.aliases",
"encodings.cp1252",
"encodings.latin_1",
"encodings.utf_8",
"enum",
"functools",
"genericpath",
"heapq",
"io",
"itertools",
"json",
"json.decoder",
"json.encoder",
"json.scanner",
"keyword",
"marshal",
"nt",
"ntpath",
"operator",
"os",
"os.path",
"pywin32_bootstrap",
"re",
"reprlib",
"site",
"sre_compile",
"sre_constants",
"sre_parse",
"stat",
"sys",
"time",
"types",
"winreg",
"zipimport"
So NEVER use any of those names for your .py files or your project module subfolders.
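To illustrate why such names are a problem, here is a hedged sketch with two hypothetical local files that reuse standard-library names. A file named sys.py is never picked up, because sys is already loaded, whereas a file named after a standard-library module that is not loaded at startup (colorsys is assumed to be one) silently replaces the real module:

```python
import os
import subprocess
import sys
import tempfile

project = os.path.realpath(tempfile.mkdtemp())

# Two local files that reuse standard-library names (contents are made up)
local_files = {
    "sys.py": "SHADOW = True\n",
    "colorsys.py": "SHADOW = True\n",
    "run.py": (
        "import sys, colorsys\n"
        "print(hasattr(sys, 'SHADOW'))\n"
        "print(hasattr(colorsys, 'SHADOW'))\n"
    ),
}
for name, source in local_files.items():
    with open(os.path.join(project, name), "w") as fh:
        fh.write(source)

env = dict(os.environ)
env.pop("PYTHONSAFEPATH", None)  # that setting would suppress the script-directory entry

result = subprocess.run(
    [sys.executable, os.path.join(project, "run.py")],
    env=env, check=True, capture_output=True, text=True,
)
sys_shadowed, colorsys_shadowed = result.stdout.split()
print("local sys.py used:", sys_shadowed)
print("local colorsys.py used:", colorsys_shadowed)
```

So the safest rule is broader than the list above: avoid the name of any standard-library module, loaded at startup or not.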
Python searches the paths in sys.path in order (see http://docs.python.org/tutorial/modules.html#the-module-search-path). easy_install changes this list directly (see the last line in your easy-install.pth file):
import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)
This basically takes whatever directories are added and inserts them at the beginning of the list.
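The one-liner is easier to follow when unrolled. The sketch below restates it on an ordinary list instead of sys.path; move_new_entries is a hypothetical helper name, and it assumes that sys.__plen holds the length of sys.path before the .pth file appended its directories, which is what the name and its use in the one-liner suggest.

```python
def move_new_entries(path, plen, egginsert=0):
    """Return (new_path, new_egginsert) after relocating entries appended past plen.

    path      -- a list standing in for sys.path
    plen      -- length of the list before the .pth file added its directories
    egginsert -- index where earlier .pth files stopped inserting (0 the first time)
    """
    new = path[plen:]                  # directories the .pth file just appended
    result = path[:plen]               # the list as it was before
    result[egginsert:egginsert] = new  # splice them in at the insertion point
    return result, egginsert + len(new)


before = ["/project", "/usr/lib/python3", "/site-packages"]
after_pth = before + ["/site-packages/a.egg", "/site-packages/b.egg"]

reordered, next_insert = move_new_entries(after_pth, plen=len(before))
print(reordered)    # the two .egg entries now come first
print(next_insert)  # where a later .pth file would insert its entries
```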
Also see Eggs in path before PYTHONPATH environment variable.