How to circumvent the fallacy of Python's os.path.commonprefix?

This has been addressed in recent versions of Python. New in version 3.5 is the function os.path.commonpath(), which returns the common path instead of the common string prefix. os.path.commonprefix() itself is unchanged and still compares character by character.
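A minimal illustration of the difference, assuming two made-up example paths. It uses posixpath (the POSIX flavor of os.path) so the results do not depend on the platform it is run on:

```python
import posixpath  # what os.path is on POSIX systems; gives the same result everywhere

paths = ['/usr/lib/python3', '/usr/local/bin']  # hypothetical example paths

# commonprefix compares character by character, so it can stop mid-component.
string_prefix = posixpath.commonprefix(paths)

# commonpath (Python 3.5+) compares whole path components.
path_prefix = posixpath.commonpath(paths)

print(string_prefix)  # /usr/l
print(path_prefix)    # /usr
```

Note that commonpath() raises ValueError for an empty sequence or when absolute and relative paths are mixed, whereas commonprefix() accepts both.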


A while ago I ran into this: os.path.commonprefix computes a string prefix, not a path prefix as one would expect. So I wrote the following:

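def commonprefix(paths):
    # Unlike os.path.commonprefix, this always returns a path prefix,
    # because it compares the paths component by component.
    if not paths:
        # min() below would raise on an empty list; mirror os.path.commonprefix
        return ''

    cp = []
    ls = [p.split('/') for p in paths]
    ml = min(len(p) for p in ls)

    for i in range(ml):
        # the i-th component of every path; a common prefix needs exactly one value
        s = set(p[i] for p in ls)
        if len(s) != 1:
            break

        cp.append(s.pop())

    return '/'.join(cp)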

It could be made more portable by replacing '/' with os.path.sep.
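A sketch of that portable variant, under the assumption that the separator is passed in as a parameter (the name commonprefix_portable and the sep argument are made up for illustration):

```python
import os

def commonprefix_portable(paths, sep=os.path.sep):
    """Component-wise common prefix; sep defaults to the platform separator."""
    if not paths:
        return ''
    split_paths = [p.split(sep) for p in paths]
    common = []
    # zip stops at the shortest path; each tuple holds the i-th component of every path.
    for components in zip(*split_paths):
        if len(set(components)) != 1:
            break
        common.append(components[0])
    return sep.join(common)
```

One caveat that this shares with the version above: absolute paths that have only the root in common, such as '/a' and '/b' with sep='/', yield an empty string rather than '/'.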


Assuming you want the common directory path, one way is to:

  1. Use only directory paths as input. If your input value is a file name, call os.path.dirname(filename) to get its directory path.
  2. "Normalize" all the paths so that they are relative to the same thing and don't include double separators. The easiest way to do this is by calling os.path.abspath() to get the path relative to the root. (You might also want to use os.path.realpath() to remove symbolic links.)
  3. Add a final separator (found portably with os.path.sep or os.sep) to the end of all the normalized directory paths.
  4. Call os.path.dirname() on the result of os.path.commonprefix().

In code (without removing symbolic links):

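import os

def common_path(directories):
    # With a trailing separator on every path, dirname() trims only a
    # partial last component, never a complete directory name.
    norm_paths = [os.path.abspath(p) + os.path.sep for p in directories]
    return os.path.dirname(os.path.commonprefix(norm_paths))

def common_path_of_filenames(filenames):
    return common_path([os.path.dirname(f) for f in filenames])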
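If symbolic links should be removed as well, as step 2 suggests, a variant could swap in os.path.realpath(). This is only a sketch; the name common_path_real is made up for illustration:

```python
import os

def common_path_real(directories):
    # realpath() resolves symbolic links and returns an absolute, normalized path,
    # so it takes the place of abspath() here.
    norm_paths = [os.path.realpath(p) + os.path.sep for p in directories]
    return os.path.dirname(os.path.commonprefix(norm_paths))
```

Because the links are resolved first, the result refers to the real location on disk, which may differ from the paths that were passed in.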