Clean way to get the "true" stem of a Path object?
Here's another possible solution to the given problem:
from pathlib import Path

if __name__ == '__main__':
    # (input path, expected "true" stem) pairs
    dataset = [
        ('a', 'a'),
        ('a.txt', 'a'),
        ('archive.tar.gz', 'archive'),
        ('directory/file', 'file'),
        ('d.x.y.z/f.a.b.c', 'f'),
        ('logs/date.log.txt', 'date'),
    ]
    for path, stem in dataset:
        path = Path(path)
        # Remove the concatenated suffixes from the final path component
        assert path.name.replace("".join(path.suffixes), "") == stem
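One caveat with str.replace: it removes every occurrence of the joined suffixes, not only the trailing one, so a dotfile-style name such as '.bak.bak' would collapse to an empty string. Here is a minimal sketch that slices the suffixes off the end instead. The helper name strip_suffixes is my own and is not part of pathlib.

```python
from pathlib import Path


def strip_suffixes(path):
    """Return the final path component with all trailing suffixes sliced off."""
    p = Path(path)
    joined = "".join(p.suffixes)
    # Slice from the end so an identical substring earlier in the name is untouched.
    return p.name[: len(p.name) - len(joined)]
```

For the dataset above this should behave the same as the replace version; it only differs when the suffix text also appears earlier in the name.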
Why not go recursively?
from pathlib import Path

def true_stem(path):
    stem = Path(path).stem
    # Stop once taking the stem no longer changes anything
    return stem if stem == path else true_stem(stem)

assert true_stem('d.x.y.z/f.a.b.c') == 'f'
How about a while loop, where you keep taking .stem
until the path has no suffixes remaining? Example:
from pathlib import Path

example_path = Path("August 08 2015, 01'37'30.log.txt")
example_path_stem = example_path.stem
while example_path.suffixes:
    example_path_stem = example_path.stem
    example_path = Path(example_path_stem)
Note that the while loop exits when example_path.suffixes
returns an empty list (empty lists are falsy in a boolean context).
Demo:
>>> from pathlib import Path
>>> example_path = Path("August 08 2015, 01'37'30.log.txt")
>>> example_path_stem = example_path.stem
>>> while example_path.suffixes:
...     example_path_stem = example_path.stem
...     example_path = Path(example_path_stem)
...
>>> example_path_stem
"August 08 2015, 01'37'30"
For your second input, no_suffix:
>>> example_path = Path("no_suffix")
>>> example_path_stem = example_path.stem
>>> while example_path.suffixes:
...     example_path_stem = example_path.stem
...     example_path = Path(example_path_stem)
...
>>> example_path_stem
'no_suffix'
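If you need this in more than one place, the same loop can be wrapped in a small function. This is only a sketch of the approach above; the name strip_all_suffixes is made up for illustration.

```python
from pathlib import Path


def strip_all_suffixes(path):
    """Keep taking .stem until no suffixes remain, then return the result."""
    current = Path(path)
    stem = current.stem
    # An empty suffixes list is falsy, which ends the loop.
    while current.suffixes:
        stem = current.stem
        current = Path(stem)
    return stem
```

Calling it on the two inputs from the demo should give the same strings the REPL sessions show.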
You could just .split it:
>>> from pathlib import Path
>>> Path('logs/date.log.txt').stem.split('.')[0]
'date'
os.path works just as well:
>>> import os
>>> os.path.basename('logs/date.log.txt').split('.')[0]
'date'
It passes all of the tests:
In [11]: all(Path(k).stem.split('.')[0] == v for k, v in {
   ....:     'a': 'a',
   ....:     'a.txt': 'a',
   ....:     'archive.tar.gz': 'archive',
   ....:     'directory/file': 'file',
   ....:     'd.x.y.z/f.a.b.c': 'f',
   ....:     'logs/date.log.txt': 'date'
   ....: }.items())
Out[11]: True
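One thing the tests above don't cover: for a dotfile such as '.bashrc', split('.')[0] returns an empty string, because the name starts with a dot. If that matters for your inputs, here is a rough sketch that keeps the leading dots and splits only the rest. The function name first_component is hypothetical.

```python
from pathlib import Path


def first_component(path):
    """Return the file name up to the first dot, preserving any leading dots."""
    name = Path(path).name
    stripped = name.lstrip('.')
    # Whatever lstrip removed is the run of leading dots (empty for normal names).
    leading = name[: len(name) - len(stripped)]
    return leading + stripped.split('.')[0]
```

For names without a leading dot this should match the plain .split version.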