os.walk without hidden folders
I realize it wasn't asked in the question, but I had a similar problem: I wanted to exclude both hidden directories and directories beginning with __, specifically __pycache__ directories. I landed on this question because I was trying to figure out why my list comprehension was not doing what I expected. It turned out I was not modifying the list in place with dirnames[:].
I created a list of prefixes I wanted to exclude and modified the dirnames in place like so:
import os

exclude_prefixes = ('__', '.')  # exclusion prefixes
for dirpath, dirnames, filenames in os.walk(node):
    # exclude all dirs starting with exclude_prefixes
    dirnames[:] = [dirname
                   for dirname in dirnames
                   if not dirname.startswith(exclude_prefixes)]
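For anyone hitting the same confusion, here is a minimal sketch of the difference between rebinding dirnames and assigning to dirnames[:]. The helper name visited_dirs and the temporary directory layout are made up for illustration; only os.walk, os.makedirs and tempfile are real library calls.

```python
import os
import tempfile

def visited_dirs(top, in_place):
    """Return the sorted relative paths that os.walk visits under top."""
    exclude_prefixes = ('__', '.')
    visited = []
    for dirpath, dirnames, filenames in os.walk(top):
        visited.append(os.path.relpath(dirpath, top))
        kept = [d for d in dirnames if not d.startswith(exclude_prefixes)]
        if in_place:
            dirnames[:] = kept  # mutates the very list os.walk will recurse into
        else:
            dirnames = kept     # only rebinds the local name; os.walk never sees it
    return sorted(visited)

# Hypothetical layout: src/__pycache__ and .git/objects should be skipped.
with tempfile.TemporaryDirectory() as top:
    for sub in (os.path.join('src', '__pycache__'),
                os.path.join('.git', 'objects')):
        os.makedirs(os.path.join(top, sub))
    rebinding_result = visited_dirs(top, in_place=False)
    in_place_result = visited_dirs(top, in_place=True)

print(rebinding_result)  # still includes .git, .git/objects and src/__pycache__
print(in_place_result)   # only '.' and 'src'
```

With plain assignment the walk still descends into every directory, because os.walk keeps using its own list object; the slice assignment changes that object's contents.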
My use case was similar to the OP's, except I wanted to return a count of the total number of sub-directories inside a certain folder. In my case I wanted to omit any sub-directories named .git (as well as any folders nested inside these .git folders).
In Python 3.6.7, I found that the accepted answer's approach didn't work: it counted all .git folders and their sub-folders. Here's what did work for me:
import os

num_local_subdir = 0
for root, dirs, files in os.walk(local_folder_path):
    if '.git' in dirs:
        dirs.remove('.git')  # prune in place so os.walk never enters .git
    num_local_subdir += len(dirs)
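If you need this in more than one place, the same loop could be wrapped in a small function. This is only a sketch: count_subdirs and its skip_names parameter are names I made up, and the temporary directory tree exists just to have something to count.

```python
import os
import tempfile

def count_subdirs(top, skip_names=('.git',)):
    """Count directories below top, ignoring skip_names and everything inside them."""
    total = 0
    for root, dirs, files in os.walk(top):
        # Slice assignment prunes the walk and keeps pruned names out of the count.
        dirs[:] = [d for d in dirs if d not in skip_names]
        total += len(dirs)
    return total

# Hypothetical layout: a/b is counted; .git, .git/objects and a/.git are not.
with tempfile.TemporaryDirectory() as top:
    for sub in (os.path.join('a', 'b'),
                os.path.join('.git', 'objects'),
                os.path.join('a', '.git')):
        os.makedirs(os.path.join(top, sub))
    count = count_subdirs(top)

print(count)  # 2 (a and a/b)
```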
No, there is no option to os.walk() that'll skip those. You'll need to do so yourself (which is easy enough):
import os

for root, dirs, files in os.walk(path):
    files = [f for f in files if not f.startswith('.')]
    dirs[:] = [d for d in dirs if not d.startswith('.')]
    # use files and dirs
Note the dirs[:] = slice assignment; os.walk recursively traverses the subdirectories listed in dirs. By replacing the elements of dirs with only those that satisfy a criterion (e.g., directories whose names don't begin with .), you ensure that os.walk() will not visit directories that fail to meet the criterion.
This only works if you keep the topdown keyword argument set to True (the default). From the documentation of os.walk():
When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again.
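To see why topdown matters, here is a rough sketch that applies the same pruning in both modes. The helper name walk_pruned and the directory names are invented for the demo. With topdown=False, the subdirectories have already been generated by the time their parent is yielded, so editing dirs comes too late.

```python
import os
import tempfile

def walk_pruned(top, topdown):
    """Walk top while trying to prune hidden dirs; return the paths actually visited."""
    visited = []
    for root, dirs, files in os.walk(top, topdown=topdown):
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        visited.append(os.path.relpath(root, top))
    return sorted(visited)

# Hypothetical layout: one hidden directory with a child, one visible directory.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, '.hidden', 'inner'))
    os.makedirs(os.path.join(top, 'visible'))
    top_down = walk_pruned(top, topdown=True)
    bottom_up = walk_pruned(top, topdown=False)

print(top_down)   # '.hidden' and its child are skipped
print(bottom_up)  # pruning had no effect; everything was visited
```

If you need bottom-up order and pruning together, one option is to filter on the yielded path yourself instead of relying on dirs.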