How to circumvent the fallacy of Python's os.path.commonprefix?
It seems that this issue has been corrected in recent versions of Python. New in version 3.5 is the function os.path.commonpath(), which returns the common path instead of the common string prefix.
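A minimal sketch of the difference between the two functions. It uses posixpath (the POSIX flavour of os.path) so that the result does not depend on the platform it runs on; the sample paths are made up for illustration.

```python
import posixpath

# Hypothetical sample paths, chosen so the two functions disagree.
paths = ["/usr/lib", "/usr/local"]

# commonprefix compares character by character, so it can stop mid-component.
string_prefix = posixpath.commonprefix(paths)

# commonpath (Python 3.5+) compares whole path components.
path_prefix = posixpath.commonpath(paths)

print(string_prefix)  # /usr/l
print(path_prefix)    # /usr
```

Note that commonpath() raises ValueError for an empty sequence or a mix of absolute and relative paths, so callers may need to handle that case.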
A while ago I ran into this: os.path.commonprefix returns a string prefix, not a path prefix as one would expect. So I wrote the following:
def commonprefix(l):
    # unlike os.path.commonprefix, this version
    # always returns path prefixes, because it compares
    # the paths component by component
    cp = []
    ls = [p.split('/') for p in l]
    ml = min(len(p) for p in ls)
    for i in range(ml):
        s = set(p[i] for p in ls)
        if len(s) != 1:
            break
        cp.append(s.pop())
    return '/'.join(cp)
It could be made more portable by replacing '/' with os.path.sep.
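One way the portable variant might look, as a sketch rather than a definitive version. The function name common_prefix_by_component and its sep parameter are hypothetical, introduced here only for illustration; sep defaults to os.sep so the separator follows the current platform.

```python
import os

def common_prefix_by_component(paths, sep=os.sep):
    """Return the longest common prefix of paths, compared component-wise.

    Hypothetical helper: `sep` defaults to the platform separator but can
    be overridden (e.g. '/' for URL-like paths).
    """
    split_paths = [p.split(sep) for p in paths]
    common = []
    # zip stops at the shortest path, so no explicit minimum length is needed.
    for components in zip(*split_paths):
        if len(set(components)) != 1:
            break
        common.append(components[0])
    return sep.join(common)

print(common_prefix_by_component(["/usr/lib", "/usr/local"], sep="/"))  # /usr
```

Paths that mix separators (for example forward slashes on Windows) would not split as intended; normalizing them first, for instance with os.path.normpath, is one option.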
Assuming you want the common directory path, one way is to:
- Use only directory paths as input. If your input value is a file name, call os.path.dirname(filename) to get its directory path.
- "Normalize" all the paths so that they are relative to the same thing and don't include double separators. The easiest way to do this is by calling os.path.abspath() to get the path relative to the root. (You might also want to use os.path.realpath() to remove symbolic links.)
- Add a final separator (found portably with os.path.sep or os.sep) to the end of all the normalized directory paths.
- Call os.path.dirname() on the result of os.path.commonprefix().
In code (without removing symbolic links):
import os

def common_path(directories):
    # the trailing separator keeps commonprefix from stopping mid-component
    norm_paths = [os.path.abspath(p) + os.path.sep for p in directories]
    return os.path.dirname(os.path.commonprefix(norm_paths))

def common_path_of_filenames(filenames):
    return common_path([os.path.dirname(f) for f in filenames])
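To illustrate why the trailing separator matters, here is a small sketch comparing the dirname(commonprefix(...)) combination with and without it. It uses posixpath so the behaviour is the same on any platform, and the directory names are made-up examples.

```python
import posixpath

# Hypothetical input: one directory is the parent of the other.
dirs = ["/usr/lib", "/usr/lib/python"]

# Without a trailing separator, the prefix is "/usr/lib" and dirname
# strips the last component, which loses the real common directory.
without_sep = posixpath.dirname(posixpath.commonprefix(dirs))

# With a trailing separator, the prefix is "/usr/lib/" and dirname
# only removes the trailing slash.
with_sep = posixpath.dirname(posixpath.commonprefix([d + "/" for d in dirs]))

print(without_sep)  # /usr
print(with_sep)     # /usr/lib
```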