Regular expression usage in glob.glob for Python
The easiest way is to filter the glob results yourself. Here is how to do it with a simple list comprehension:
import glob
res = [f for f in glob.glob("*.txt") if "abc" in f or "123" in f or "a1b" in f]
for f in res:
    print(f)
You could also use a regexp and no glob:
import os
import re
path = "."  # directory to search (assumed here; substitute your own)
res = [f for f in os.listdir(path) if re.search(r'(abc|123|a1b).*\.txt$', f)]
for f in res:
    print(f)
(By the way, naming a variable list is a bad idea since list is a Python type...)
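The two approaches above can also be combined: glob narrows the candidates by extension, and a regular expression is then applied to each base name. The following is a minimal sketch; `glob_regex` is a hypothetical helper name chosen for illustration, not a standard library function.

```python
import glob
import os
import re


def glob_regex(glob_pattern, regex):
    """Return paths matched by glob_pattern whose base name also matches regex.

    glob_regex is a hypothetical helper for illustration, not a stdlib function.
    """
    compiled = re.compile(regex)
    # glob may return paths with directory parts, so test only the base name
    return sorted(
        p for p in glob.glob(glob_pattern)
        if compiled.search(os.path.basename(p))
    )
```

For the file names discussed here, a call such as `glob_regex("*.txt", r"abc|123|a1b")` should behave like the `in` checks in the list comprehension.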
Here is a ready-to-use way of doing this, based on the other answers. It is not optimized for performance, but it works as described:
import os
import re

def reglob(path, exp, invert=False):
    """glob.glob()-style search that filters filenames with a regex.

    :param path: Directory to search
    :param exp: Regex expression for filename
    :param invert: Invert match to non-matching files
    """
    m = re.compile(exp)
    if invert is False:
        res = [f for f in os.listdir(path) if m.search(f)]
    else:
        res = [f for f in os.listdir(path) if not m.search(f)]
    # Prefix each name with the directory; returns a list on Python 2 and 3
    return [os.path.join(path, f) for f in res]
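If building intermediate lists is a concern, a lazy variant is possible. The sketch below assumes Python 3.6 or later (for `os.scandir` as a context manager); `iter_reglob` is a hypothetical name used only for this example, and whether it is noticeably faster will depend on the directory size.

```python
import os
import re


def iter_reglob(path, exp, invert=False):
    """Lazily yield paths in `path` whose names match exp (or do not, if invert).

    iter_reglob is a hypothetical name used only for this sketch.
    """
    pattern = re.compile(exp)
    # os.scandir yields directory entries one at a time instead of building a list
    with os.scandir(path) as entries:
        for entry in entries:
            matched = pattern.search(entry.name) is not None
            if matched != bool(invert):
                # entry.path is the directory joined with the entry name
                yield entry.path
```

Wrap the call in `list(...)` when a list is needed, for example `list(iter_reglob(".", r"\.txt$"))`.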
I'm surprised that no answers here used filter.
import os
import re

def glob_re(pattern, strings):
    return filter(re.compile(pattern).match, strings)

filenames = glob_re(r'.*(abc|123|a1b).*\.txt', os.listdir())
This accepts any iterable of strings, including lists, tuples, and dicts (if all keys are strings). Note that .match anchors the pattern only at the start of each string; if you want to match anywhere in the string, change .match to .search. Also note that in Python 3, filter returns a lazy iterator, so if you need the results as a list (for example, to index them or take their length), convert the result yourself or wrap the return statement with list(...).
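To make the difference between the matching methods concrete, here is a small sketch on an in-memory list of names. The sample names are made up for illustration, and the pattern deliberately omits the leading `.*` used above so that the anchoring behavior is visible.

```python
import re

# Hypothetical sample names, chosen to show the anchoring differences
names = ["abc.txt", "xabc.txt", "abc.txt.bak", "notes.txt"]
pattern = re.compile(r"(abc|123|a1b).*\.txt")

# match: anchored at the start only, so a trailing ".bak" still passes
by_match = list(filter(pattern.match, names))
# search: may match anywhere in the string
by_search = list(filter(pattern.search, names))
# fullmatch: the whole string must match the pattern
by_fullmatch = list(filter(pattern.fullmatch, names))

print(by_match)      # ['abc.txt', 'abc.txt.bak']
print(by_search)     # ['abc.txt', 'xabc.txt', 'abc.txt.bak']
print(by_fullmatch)  # ['abc.txt']
```

If names such as `abc.txt.bak` should be excluded, `.fullmatch` (or a trailing `$` in the pattern) is the safer choice.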