python paths and import order

The above answers about the order in which the interpreter scans sys.path are correct. Even so, giving user file paths precedence over packages deployed in site-packages can fail if the full user path is not available in the PYTHONPATH variable.

For example, imagine you have the following structure of namespace packages:

/opt/repo_root
  - project  # this is the base package that brings structure to the namespace hierarchy
  - my_pkg
  - my_pkg-core
  - my_pkg-gui
  - my_pkg-helpers
  - my_pkg-helpers-time_sync

All of the above packages have the internal structure and metadata needed to be deployable with conda, and all of them are installed. I can therefore open a Python shell and type:

>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)

/python/interpreter/path/lib/python3.6/site-packages/project/my_pkg/helpers/time_sync/__init__.py

This prints a path inside the interpreter's site-packages subfolder. If I manually add the package to be imported to PYTHONPATH, or even to sys.path, nothing changes.

>>> import os

>>> # os.pathsep is ":" on Unix and ";" on Windows
>>> os.environ['PYTHONPATH'] = os.pathsep.join([os.environ['PYTHONPATH'], "/opt/repo_root/my_pkg-helpers-time_sync"])

>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)

/python/interpreter/path/lib/python3.6/site-packages/project/my_pkg/helpers/time_sync/__init__.py

This still reports that the package was imported from site-packages. You need to include the whole hierarchy of paths in PYTHONPATH, as if it were a traditional Python package, and then it works as you expect. Note that PYTHONPATH is read when the interpreter starts, so to reproduce this, set it before launching Python and run the import in a fresh session:

>>> import os

>>> # os.pathsep is ":" on Unix and ";" on Windows
>>> os.environ['PYTHONPATH'] = os.pathsep.join([
... os.environ['PYTHONPATH'],
... "/opt/repo_root",
... "/opt/repo_root/project",
... "/opt/repo_root/project/my_pkg",
... "/opt/repo_root/project/my_pkg-helpers",
... "/opt/repo_root/project/my_pkg-helpers-time_sync",
... ])

>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)

/opt/project/my_pkg/helpers/time_sync/__init__.py
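To check which copy of a module would win before importing anything, one option is to ask the import machinery directly. The following is a minimal sketch under stated assumptions: the module name demo_shadow_mod, the helper origin_of and the two temporary directories are all hypothetical, and it uses a plain module rather than a namespace package, so it only illustrates how sys.path order decides between two candidates.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def origin_of(module_name):
    """Return the file a module would be loaded from, without importing it."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


with tempfile.TemporaryDirectory() as root:
    # Two hypothetical directories that both provide "demo_shadow_mod".
    first = os.path.join(root, "first")
    second = os.path.join(root, "second")
    for directory in (first, second):
        os.makedirs(directory)
        with open(os.path.join(directory, "demo_shadow_mod.py"), "w") as fh:
            fh.write("WHERE = %r\n" % directory)

    sys.path.insert(0, second)
    sys.path.insert(0, first)  # inserted last, so it is searched first
    importlib.invalidate_caches()
    try:
        found = origin_of("demo_shadow_mod")
    finally:
        # Leave sys.path as it was before the experiment.
        sys.path.remove(first)
        sys.path.remove(second)

    expected = os.path.join(first, "demo_shadow_mod.py")
    is_first = os.path.samefile(found, expected)
    print(is_first)  # True: the earlier sys.path entry wins
```

Calling origin_of("project.my_pkg.helpers.time_sync") in your own session would show in the same way whether the site-packages copy or the repository copy is picked up, although resolving a dotted name imports its parent packages as a side effect.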

When importing a module, Python first checks sys.modules, which is a dictionary of modules that have already been loaded, not a list of directories. If the module is not found there, Python searches the directories listed in sys.path. Built-in modules compiled into the interpreter are also found without consulting sys.path.

import time, sys
print(sys.modules)
print(sys.path)

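The output is a dictionary of loaded modules followed by a list of directories:

{... , ... , .....}
['C:\\Users\\****', 'C:\\****', ...]

An import such as import time is resolved in that order: sys.modules is checked first, and sys.path is searched only if the module is not already cached.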
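The caching behaviour can be observed directly. This is a small sketch that uses the standard json module purely as an example; any importable module would do.

```python
import sys

import json  # first import: the module is located, loaded and cached

cached = sys.modules["json"]  # the cache entry created by the import above
import json as json_again  # second import: served from the cache

same_object = cached is json_again
print(same_object)                  # True
print(type(sys.modules).__name__)   # dict
print(type(sys.path).__name__)      # list
```

Because the second import is answered from sys.modules, changes to sys.path made after a module has been imported do not affect which file that module came from.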


This page is a high Google result for "Python import order", so here's a hopefully clearer explanation:

  • https://docs.python.org/library/sys.html#sys.path
  • https://docs.python.org/tutorial/modules.html#the-module-search-path

As both of those pages explain, the import order is:

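  1. Built-in Python modules. Their names are listed in sys.builtin_module_names (sys.modules, by contrast, is the cache of modules that have already been imported).
  2. The sys.path entries, searched in order.
  3. The installation-dependent default locations, which sit at the end of sys.path.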
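As a rough illustration of that order, the sketch below reports which stage would satisfy an import. The helper lookup_stage and the module name hypothetical_module_xyz are made up for this example, and the logic is deliberately simplified: it ignores frozen modules and custom finders, and it does not check whether the sys.path search would actually succeed.

```python
import sys


def lookup_stage(name):
    """Hypothetical helper: name the stage that would answer `import name`."""
    if name in sys.modules:
        return "sys.modules cache"      # already imported, no search needed
    if name in sys.builtin_module_names:
        return "built-in module"        # compiled into the interpreter
    return "sys.path search"            # looked up directory by directory


print(lookup_stage("sys"))                     # sys.modules cache
print(lookup_stage("hypothetical_module_xyz")) # sys.path search
```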

And as the sys.path doc page explains, it is populated as follows:

  1. The first entry is the FULL PATH TO THE DIRECTORY of the file Python was started with. Running python /path/to/the/run.py from /someplace/on/disk/ sets it to /path/to/the/, and so does running python the/run.py from /path/to/. It is ALWAYS the full path to the directory, whether you gave Python a relative or an absolute file name. If Python was started without a file (interactive mode), the entry is an empty string, which means "the current working directory of the Python process". In other words, Python assumes that the file you started wants to import the package folders and blah.py modules that live in the same location as that file.
  2. The next entries come from the PYTHONPATH environment variable, if it is set.
  3. The remaining entries are the installation-dependent defaults. These include the site-packages folders where pip installs your third-party packages (things like requests and numpy and tensorflow).

So, basically: Yes, you can trust that Python will find your local package-folders and module files first, before any globally installed pip stuff.
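One way to see this for yourself is to start a child interpreter and print its sys.path. The sketch below is only illustrative: both directories are temporary stand-ins created for the example, and PYTHONSAFEPATH is removed from the child's environment on the assumption that the default behaviour (script directory first) is what you want to observe.

```python
import json
import os
import subprocess
import sys
import tempfile

# Stand-ins: one directory holds the script, the other plays a PYTHONPATH entry.
script_dir = os.path.realpath(tempfile.mkdtemp())
extra_dir = os.path.realpath(tempfile.mkdtemp())

script = os.path.join(script_dir, "run.py")
with open(script, "w") as fh:
    fh.write("import json, sys\nprint(json.dumps(sys.path))\n")

# PYTHONPATH must be in the environment before the child interpreter starts.
env = dict(os.environ, PYTHONPATH=extra_dir)
env.pop("PYTHONSAFEPATH", None)

result = subprocess.run(
    [sys.executable, script],
    env=env, capture_output=True, text=True, check=True,
)
child_path = [os.path.realpath(p) for p in json.loads(result.stdout)]

print(child_path[0] == script_dir)                 # script directory comes first
print(child_path.index(extra_dir) > 0)             # PYTHONPATH entry comes later
```

The site-packages directories normally appear after both of these entries, which is why local files are found before globally installed packages.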

Here's an example to explain further:

myproject/ # <-- This is not a package (no __init__.py file).
  modules/ # <-- This is a package (has an __init__.py file).
    __init__.py
    foo.py
  run.py
  second.py

executed with: python /path/to/the/myproject/run.py
will cause sys.path[0] to be "/path/to/the/myproject/"

run.py contents:
import modules.foo as foo # will import "/path/to/the/myproject/" + "modules/foo.py"
import second # will import "/path/to/the/myproject/" + "second.py"

second.py contents:
import modules.foo as foo # will import "/path/to/the/myproject/" + "modules/foo.py"

EDIT:

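You can run the following command to print a sorted list of the modules that are already loaded when the interpreter starts. Because they are already cached in sys.modules, importing one of these names returns the cached module and never reaches the custom files or module folders in your projects. Basically these are names you must avoid in your own custom files: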

python -c "import sys, json; print(json.dumps(sorted(list(sys.modules.keys())), indent=4))"

List as of Python 3.9.0 on Windows (the exact set varies with the platform and the installed packages):

"__main__",
"_abc",
"_bootlocale",
"_codecs",
"_collections",
"_collections_abc",
"_frozen_importlib",
"_frozen_importlib_external",
"_functools",
"_heapq",
"_imp",
"_io",
"_json",
"_locale",
"_operator",
"_signal",
"_sitebuiltins",
"_sre",
"_stat",
"_thread",
"_warnings",
"_weakref",
"abc",
"builtins",
"codecs",
"collections",
"copyreg",
"encodings",
"encodings.aliases",
"encodings.cp1252",
"encodings.latin_1",
"encodings.utf_8",
"enum",
"functools",
"genericpath",
"heapq",
"io",
"itertools",
"json",
"json.decoder",
"json.encoder",
"json.scanner",
"keyword",
"marshal",
"nt",
"ntpath",
"operator",
"os",
"os.path",
"pywin32_bootstrap",
"re",
"reprlib",
"site",
"sre_compile",
"sre_constants",
"sre_parse",
"stat",
"sys",
"time",
"types",
"winreg",
"zipimport"

So NEVER use any of those names for your .py files or your project's module subfolders.
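A quick way to test a candidate file name is sketched below. The helper clashes_with_standard_module and the name my_project_utils are hypothetical. The check relies on sys.builtin_module_names and, where available, sys.stdlib_module_names (Python 3.10 and later), so on older interpreters it only catches built-in modules.

```python
import sys


def clashes_with_standard_module(name):
    """Hypothetical helper: True if `name` is a built-in or stdlib module name."""
    builtin = set(sys.builtin_module_names)
    # sys.stdlib_module_names only exists on Python 3.10+; fall back to empty.
    stdlib = set(getattr(sys, "stdlib_module_names", ()))
    return name in builtin or name in stdlib


for candidate in ("sys", "my_project_utils"):
    print(candidate, clashes_with_standard_module(candidate))
# sys True
# my_project_utils False
```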


Python searches the paths in sys.path in order (see http://docs.python.org/tutorial/modules.html#the-module-search-path). easy_install changes this list directly (see the last line in your easy-install.pth file):

import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)

This takes whatever directories were appended to sys.path after position sys.__plen and moves them to position sys.__egginsert, which defaults to 0, so they end up at the beginning of the list.
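The effect is easier to follow on a plain list. The sketch below mimics that one-liner without touching the real sys.path. The function apply_egg_insert and the example paths are made up for illustration, with plen and egginsert standing in for sys.__plen and sys.__egginsert.

```python
def apply_egg_insert(path, plen, egginsert=0):
    """Mimic the last line of easy-install.pth on a copy of `path`.

    path      -- stand-in for sys.path after the .pth entries were appended
    plen      -- stand-in for sys.__plen, the length before they were added
    egginsert -- stand-in for sys.__egginsert (0 when it has not been set)
    """
    path = list(path)
    new = path[plen:]                  # the entries that were just appended
    del path[plen:]                    # remove them from the end
    path[egginsert:egginsert] = new    # re-insert them at the egg position
    return path, egginsert + len(new)


before = ["/from/PYTHONPATH", "/usr/lib/python3", "/eggs/a.egg", "/eggs/b.egg"]
after, next_insert = apply_egg_insert(before, plen=2)
print(after)        # ['/eggs/a.egg', '/eggs/b.egg', '/from/PYTHONPATH', '/usr/lib/python3']
print(next_insert)  # 2
```

The returned insert position shows why eggs from a second .pth file land after the first batch but still ahead of the PYTHONPATH entries.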

Also see Eggs in path before PYTHONPATH environment variable.
