Group files in some folders
The Python script below does the job. Hidden files are stored separately in a folder of their own, as are files without an extension.
Since it might be used for a wider range of purposes, I added a few options:
- You can set extensions you'd like to exclude from the "reorganization". If you simply want to move all files, set exclude = ()
- You can choose what to do with empty folders (remove_emptyfolders = True or False)
- In case you would like to copy the files instead of moving them, replace the line:
shutil.move(subject, new_dir+"/"+name)
by:
shutil.copy(subject, new_dir+"/"+name)
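One detail worth checking when you set exclude: in Python, parentheses alone do not make a tuple, so a single extension needs a trailing comma. Without it, the value is a plain string and the `in` test becomes a substring test. A minimal sketch of the difference (the helper name is_excluded is made up for this illustration and is not part of the script):

```python
# Parentheses alone do not create a tuple; the trailing comma does.
as_string = (".jpg")   # identical to the string ".jpg"
as_tuple = (".jpg",)   # a tuple with one element


def is_excluded(extension, exclude):
    """Hypothetical helper: the same `in` test the script applies to `exclude`."""
    return extension in exclude


if __name__ == "__main__":
    # Against a string, ".jp" is found as a substring of ".jpg".
    print(is_excluded(".jp", as_string))
    # Against a tuple, only whole elements match.
    print(is_excluded(".jp", as_tuple))
```

With several extensions, for example (".jpg", ".png"), the commas between the items already make it a tuple.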
The script:
#!/usr/bin/env python3
import os
import shutil
# --------------------------------------------------------
reorg_dir = "/path/to/directory_to_reorganize"
exclude = (".jpg",)  # for example; the trailing comma makes this a tuple
remove_emptyfolders = True
# ---------------------------------------------------------

for root, dirs, files in os.walk(reorg_dir):
    for name in files:
        subject = root+"/"+name
        if name.startswith("."):
            extension = ".hidden_files"
        elif "." not in name:
            extension = ".without_extension"
        else:
            extension = name[name.rfind("."):]
        if extension not in exclude:
            # the target folder is named after the extension, without the dot
            new_dir = reorg_dir+"/"+extension[1:]
            if not os.path.exists(new_dir):
                os.mkdir(new_dir)
            shutil.move(subject, new_dir+"/"+name)

def cleanup():
    # collect all remaining files, then remove every folder that contains none
    filelist = []
    for root, dirs, files in os.walk(reorg_dir):
        for name in files:
            filelist.append(root+"/"+name)
    directories = [item[0] for item in os.walk(reorg_dir)]
    for dr in directories:
        matches = [item for item in filelist if item.startswith(dr+"/")]
        if len(matches) == 0:
            try:
                shutil.rmtree(dr)
            except FileNotFoundError:
                # already removed together with its parent folder
                pass

if remove_emptyfolders:
    cleanup()
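To see which folder a given file name ends up in, the naming rule from the script can be pulled out into a small function and tried on a few names. This is only a sketch: the function name target_folder is an assumption for illustration, the logic is copied from the script above.

```python
def target_folder(name):
    """Return the folder name the script would choose for a file called `name`.

    Hypothetical helper; it mirrors the if/elif/else in the script.
    """
    if name.startswith("."):
        extension = ".hidden_files"
    elif "." not in name:
        extension = ".without_extension"
    else:
        # everything from the last dot onwards, e.g. ".gz" for "archive.tar.gz"
        extension = name[name.rfind("."):]
    # drop the leading dot to get the folder name
    return extension[1:]


if __name__ == "__main__":
    for example in (".bashrc", "README", "photo.JPG", "archive.tar.gz"):
        print(example, "->", target_folder(example))
```

Note that the comparison is case-sensitive, so photo.JPG and photo.jpg go to different folders, and only the part after the last dot counts.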
If there is a risk of unintentionally overwriting duplicate files
At the expense of a few extra lines, we can prevent overwriting possible duplicates. With the code below, duplicates will be renamed as:
duplicate_1_filename, duplicate_2_filename, etc.
The script:
#!/usr/bin/env python3
import os
import shutil
# --------------------------------------------------------
reorg_dir = "/path/to/directory_to_reorganize"
exclude = (".jpg",)  # for example; the trailing comma makes this a tuple
remove_emptyfolders = True
# ---------------------------------------------------------

for root, dirs, files in os.walk(reorg_dir):
    for name in files:
        subject = root+"/"+name
        if name.startswith("."):
            extension = ".hidden_files"
        elif "." not in name:
            extension = ".without_extension"
        else:
            extension = name[name.rfind("."):]
        if extension not in exclude:
            new_dir = reorg_dir+"/"+extension[1:]
            if not os.path.exists(new_dir):
                os.mkdir(new_dir)
            n = 1
            name_orig = name
            # prefix the name until it no longer clashes with an existing file
            while os.path.exists(new_dir+"/"+name):
                name = "duplicate_"+str(n)+"_"+name_orig
                n = n+1
            newfile = new_dir+"/"+name
            shutil.move(subject, newfile)

def cleanup():
    # collect all remaining files, then remove every folder that contains none
    filelist = []
    for root, dirs, files in os.walk(reorg_dir):
        for name in files:
            filelist.append(root+"/"+name)
    directories = [item[0] for item in os.walk(reorg_dir)]
    for dr in directories:
        matches = [item for item in filelist if item.startswith(dr+"/")]
        if len(matches) == 0:
            try:
                shutil.rmtree(dr)
            except FileNotFoundError:
                # already removed together with its parent folder
                pass

if remove_emptyfolders:
    cleanup()
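The renaming scheme can also be tried in isolation, without touching real files. The sketch below assumes two made-up names, unique_name and demo; it repeats the while loop from the script and runs it against a temporary directory that is deleted afterwards.

```python
import os
import tempfile


def unique_name(new_dir, name):
    """Hypothetical helper: return `name`, or a duplicate_N_ variant not yet used in `new_dir`."""
    n = 1
    candidate = name
    while os.path.exists(os.path.join(new_dir, candidate)):
        candidate = "duplicate_" + str(n) + "_" + name
        n += 1
    return candidate


def demo():
    """Create the 'same' file three times in a throwaway folder and collect the names chosen."""
    chosen_names = []
    with tempfile.TemporaryDirectory() as tmp:
        for _ in range(3):
            chosen = unique_name(tmp, "report.txt")
            # create an empty file under the chosen name so the next round clashes
            open(os.path.join(tmp, chosen), "w").close()
            chosen_names.append(chosen)
    return chosen_names


if __name__ == "__main__":
    print(demo())
```

The counter always restarts at 1, so the loop simply walks past duplicate numbers that are already taken until it finds a free one.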
EDIT
With the OP in mind, we all forgot to add instructions on how to use the script. Since duplicate questions might (and do) appear, they might be useful nevertheless.
How to use
- Copy either one of the scripts into an empty file, save it as reorganize.py
- In the head section of the script, set the targeted directory (with the files to reorganize):
reorg_dir = "/path/to/directory_to_reorganize"
(keep the quotes: the path is a Python string, so spaces in it need no escaping)
the extensions you'd like to exclude, if any (probably none, like below):
exclude = ()
and whether you'd like to remove empty folders afterwards:
remove_emptyfolders = True
- Run the script with the command:
python3 /path/to/reorganize.py
NB: if you'd like to copy the files instead of moving them, replace:
shutil.move(subject, new_dir+"/"+name)
by:
shutil.copy(subject, new_dir+"/"+name)
(in the second script, the corresponding line is shutil.move(subject, newfile))
Please try first on a small sample.
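If you'd rather preview the result before moving anything, a dry run is one option. The sketch below is not part of the scripts above: plan_moves is a made-up name, and it only lists which file would go where, using the same folder rule, without creating folders or moving files.

```python
import os


def plan_moves(reorg_dir, exclude=()):
    """Hypothetical dry run: return (source, destination) pairs without touching any file."""
    planned = []
    for root, dirs, files in os.walk(reorg_dir):
        for name in files:
            if name.startswith("."):
                extension = ".hidden_files"
            elif "." not in name:
                extension = ".without_extension"
            else:
                extension = name[name.rfind("."):]
            if extension not in exclude:
                source = os.path.join(root, name)
                destination = os.path.join(reorg_dir, extension[1:], name)
                planned.append((source, destination))
    return planned


if __name__ == "__main__":
    # placeholder path, as in the scripts; replace it with a real directory
    for source, destination in plan_moves("/path/to/directory_to_reorganize"):
        print(source, "->", destination)
```

Two sources that map to the same destination in the printed list are exactly the cases where the first script would overwrite a file.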
You can use find with a somewhat complex exec command:
find . -iname '*?.?*' -type f -exec bash -c 'EXT="${0##*.}"; mkdir -p "$PWD/${EXT}_dir"; cp --target-directory="$PWD/${EXT}_dir" "$0"' {} \;
# '*?.?*' requires at least one character before and after the '.',
# so that files like .bashrc and blah. are avoided.
# EXT="${0##*.}" - get the extension
# mkdir -p $PWD/${EXT}_dir - make the folder, ignore if it exists
Replace cp with echo for a dry run.
More efficient and tidier would be to save the bash command in a script (say, at /path/to/the/script.sh):
#! /bin/bash
for i
do
    EXT="${i##*.}"
    mkdir -p "$PWD/${EXT}_dir"
    mv --target-directory="$PWD/${EXT}_dir" "$i"
done
And then run find:
find . -iname '*?.?*' -type f -exec /path/to/the/script.sh {} +
This approach is pretty flexible. For example, to use the filename instead of the extension (filename.ext), we'd use this for EXT:
NAME="${i##*/}"
EXT="${NAME%.*}"
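If the parameter expansions look cryptic, it can help to see what they compute. The Python sketch below mimics ${i##*.}, ${i##*/} and ${NAME%.*} for a literal separator only; the function names are made up for this illustration and it is not a general reimplementation of shell patterns.

```python
def strip_longest_prefix(value, sep):
    """Mimic ${value##*sep}: drop everything up to and including the last `sep`."""
    # rpartition returns ("", "", value) when `sep` is absent, so value is kept as-is
    return value.rpartition(sep)[2]


def strip_shortest_suffix(value, sep):
    """Mimic ${value%sep*}: drop the last `sep` and everything after it."""
    return value.rpartition(sep)[0] if sep in value else value


if __name__ == "__main__":
    i = "./docs/archive.tar.gz"
    print(strip_longest_prefix(i, "."))       # like EXT="${i##*.}"
    name = strip_longest_prefix(i, "/")       # like NAME="${i##*/}"
    print(name)
    print(strip_shortest_suffix(name, "."))   # like EXT="${NAME%.*}"
```

For ./docs/archive.tar.gz the three values are gz, archive.tar.gz and archive.tar, so with the filename variant the target folder would be archive.tar_dir.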
ls | gawk -F. 'NF>1 {f = $NF "-DIR"; system("mkdir -p \"" f "\"; mv -- \"" $0 "\" \"" f "\"")}'
Calculating the list of extensions (after moving):
ls -d *-DIR
Calculating the list of extensions (before moving):
ls -X | grep -Po '(?<=\.)(\w+)$'| uniq -c | sort -n
(in this last example, we count the number of files for each extension and sort the result by that count)
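For comparison, roughly the same count can be done in Python. This is a sketch under assumptions: count_extensions is a made-up name, and the regular expression is the one from the grep command, which Python's re module also accepts.

```python
import re
from collections import Counter

# same pattern as in the grep command: word characters after the last dot
EXTENSION = re.compile(r"(?<=\.)\w+$")


def count_extensions(names):
    """Hypothetical helper: return (extension, count) pairs, smallest count first."""
    counts = Counter()
    for name in names:
        match = EXTENSION.search(name)
        if match:
            counts[match.group()] += 1
    return sorted(counts.items(), key=lambda item: item[1])


if __name__ == "__main__":
    import os
    for extension, count in count_extensions(os.listdir(".")):
        print(count, extension)
```

Unlike uniq -c, which only merges adjacent lines and therefore relies on ls -X sorting by extension, Counter does not need its input sorted.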