Find a file in python
In Python 3.4 or newer you can use pathlib to do recursive globbing:
>>> import pathlib
>>> sorted(pathlib.Path('.').glob('**/*.py'))
[PosixPath('build/lib/pathlib.py'),
PosixPath('docs/conf.py'),
PosixPath('pathlib.py'),
PosixPath('setup.py'),
PosixPath('test_pathlib.py')]
Reference: https://docs.python.org/3/library/pathlib.html#pathlib.Path.glob
In Python 3.5 or newer you can also do recursive globbing like this:
>>> import glob
>>> glob.glob('**/*.txt', recursive=True)
['2.txt', 'sub/3.txt']
Reference: https://docs.python.org/3/library/glob.html#glob.glob
os.walk is the answer, this will find the first match:
import os
def find(name, path):
for root, dirs, files in os.walk(path):
if name in files:
return os.path.join(root, name)
And this will find all matches:
def find_all(name, path):
result = []
for root, dirs, files in os.walk(path):
if name in files:
result.append(os.path.join(root, name))
return result
And this will match a pattern:
import os, fnmatch
def find(pattern, path):
result = []
for root, dirs, files in os.walk(path):
for name in files:
if fnmatch.fnmatch(name, pattern):
result.append(os.path.join(root, name))
return result
find('*.txt', '/path/to/dir')
I used a version of os.walk
and on a larger directory got times around 3.5 sec. I tried two random solutions with no great improvement, then just did:
paths = [line[2:] for line in subprocess.check_output("find . -iname '*.txt'", shell=True).splitlines()]
While it's POSIX-only, I got 0.25 sec.
From this, I believe it's entirely possible to optimise whole searching a lot in a platform-independent way, but this is where I stopped the research.