regular expression usage in glob.glob for python

The easiest way would be to filter the glob results yourself. Here is how to do it using a simple loop comprehension:

import glob
res = [f for f in glob.glob("*.txt") if "abc" in f or "123" in f or "a1b" in f]
for f in res:
    print f

You could also use a regexp and no glob:

import os
import re
res = [f for f in os.listdir(path) if re.search(r'(abc|123|a1b).*\.txt$', f)]
for f in res:
    print f

(By the way, naming a variable list is a bad idea since list is a Python type...)


Here is a ready to use way of doing this, based on the other answers. It's not the most performance critical, but it works as described;

def reglob(path, exp, invert=False):
    """glob.glob() style searching which uses regex

    :param exp: Regex expression for filename
    :param invert: Invert match to non matching files
    """

    m = re.compile(exp)

    if invert is False:
        res = [f for f in os.listdir(path) if m.search(f)]
    else:
        res = [f for f in os.listdir(path) if not m.search(f)]

    res = map(lambda x: "%s/%s" % ( path, x, ), res)
    return res

I'm surprised that no answers here used filter.

import os
import re

def glob_re(pattern, strings):
    return filter(re.compile(pattern).match, strings)

filenames = glob_re(r'.*(abc|123|a1b).*\.txt', os.listdir())

This accepts any iterator that returns strings, including lists, tuples, dicts(if all keys are strings), etc. If you want to support partial matches, you could change .match to .search. Please note that this obviously returns a generator, so if you want to use the results without iterating over them, you could convert the result to a list yourself, or wrap the return statement with list(...).

Tags:

Python

Glob