How to compare two directories and delete duplicate files
Using fdupes:
fdupes --delete dir1 dir2
fdupes does not compare filenames or file types; it compares file size and contents (which implicitly covers file type).
Example:
$ mkdir dir1 dir2
$ touch dir{1,2}/{a,b,c}
$ tree
.
|-- dir1
|   |-- a
|   |-- b
|   `-- c
`-- dir2
    |-- a
    |-- b
    `-- c

2 directories, 6 files
$ fdupes --delete dir1 dir2
[1] dir1/a
[2] dir1/b
[3] dir1/c
[4] dir2/a
[5] dir2/b
[6] dir2/c
Set 1 of 1, preserve files [1 - 6, all]: 1
[+] dir1/a
[-] dir1/b
[-] dir1/c
[-] dir2/a
[-] dir2/b
[-] dir2/c
$ tree
.
|-- dir1
|   `-- a
`-- dir2

2 directories, 1 file
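If fdupes is not installed, the content comparison can be approximated in plain shell. This is only a rough sketch, not how fdupes is implemented: the function name delete_content_dupes and the demo paths are made up for illustration, it only looks at the top level of each directory, and it only removes files from the first directory that match something in the second (fdupes also finds duplicates inside a single directory).

```shell
# Hypothetical helper: delete files in "$1" whose contents match any file in "$2".
# Only the top level of each directory is examined (no recursion).
delete_content_dupes() {
    remove_from=$1
    keep_in=$2
    for candidate in "$remove_from"/*; do
        [ -f "$candidate" ] || continue
        for reference in "$keep_in"/*; do
            [ -f "$reference" ] || continue
            # cmp -s prints nothing and exits 0 only when both files are byte-identical
            if cmp -s "$candidate" "$reference"; then
                rm -- "$candidate"
                break
            fi
        done
    done
}

# Demo in a throwaway directory (names are illustrative only)
demo=$(mktemp -d)
mkdir "$demo/dir1" "$demo/dir2"
printf 'same\n' > "$demo/dir1/a"
printf 'same\n' > "$demo/dir2/x"        # same contents, different name
printf 'only here\n' > "$demo/dir1/b"
delete_content_dupes "$demo/dir1" "$demo/dir2"
ls "$demo/dir1"
```

Because the match is on contents, dir1/a is removed even though its twin in dir2 has a different name, while dir1/b is left alone.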
As an example, take two directories, p1 and p2.
First, I save the filenames from the p1 and p2 directories to two output files:
find /root/p1 -type f |awk -F "/" '{print $NF}' > /var/tmp/P1_file.txt
find /root/p2 -type f |awk -F "/" '{print $NF}' > /var/tmp/P2_file.txt
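Now I find the filenames common to both directories and delete them from one of the directories. Here the duplicates in /root/p1 are deleted and the files in /root/p2 are kept. Note that this matches on filename only, not on contents, and it assumes flat directories and filenames without whitespace: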
awk 'NR==FNR {a[$1];next}($1 in a) {print $1}' /var/tmp/P1_file.txt /var/tmp/P2_file.txt |awk '{print "rm -rvf" " " "/root/p1/"$1}' | sh
Tested, and it worked fine.
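The same name-based idea can also be written as a loop, which avoids the temporary files and copes with spaces in filenames. This is a sketch under assumptions: the function name delete_same_named and the demo directories are hypothetical, only the top level of each directory is checked, and, as above, contents are never compared.

```shell
# Hypothetical helper: delete from "$1" every regular file whose name also exists in "$2".
# Compares names only, top level only; file contents are never checked.
delete_same_named() {
    remove_from=$1
    keep_in=$2
    for entry in "$remove_from"/*; do
        [ -f "$entry" ] || continue
        name=${entry##*/}                 # strip the directory part
        if [ -f "$keep_in/$name" ]; then
            rm -- "$entry" && printf 'removed %s\n' "$entry"
        fi
    done
}

# Demo in a throwaway directory standing in for /root/p1 and /root/p2
demo=$(mktemp -d)
mkdir "$demo/p1" "$demo/p2"
touch "$demo/p1/report.txt" "$demo/p1/my notes.txt" "$demo/p1/unique.txt"
touch "$demo/p2/report.txt" "$demo/p2/my notes.txt"
delete_same_named "$demo/p1" "$demo/p2"
```

Quoting every expansion is what keeps "my notes.txt" from being split into two arguments, which is where the generated rm commands piped to sh would go wrong.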
I suggest using dircmp, which exists on many Unix systems.
See:
man dircmp
The -d option seems to be the most appropriate one here:
dircmp -d dir1 dir2
This compares the contents of dir1 and dir2 and displays diff-like output.
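On systems without dircmp, a similar report can be had from diff itself. This is a substitute, not dircmp: it assumes a diff that supports -r, -q and -s (GNU and BSD diff do), and the demo directory is made up for illustration. It only reports; nothing is deleted.

```shell
# Demo in a throwaway directory (names are illustrative only)
demo=$(mktemp -d)
mkdir "$demo/dir1" "$demo/dir2"
printf 'same\n' > "$demo/dir1/a"; printf 'same\n' > "$demo/dir2/a"
printf 'one\n'  > "$demo/dir1/b"; printf 'two\n'  > "$demo/dir2/b"

# -r recurses, -q only says whether files differ, -s also reports identical files.
# diff exits non-zero when anything differs, so the status is deliberately ignored.
# LC_ALL=C keeps the messages in English.
report=$(cd "$demo" && LC_ALL=C diff -rqs dir1 dir2 || true)
printf '%s\n' "$report"
```

Files reported as identical are the candidates for deletion; files that share a name but differ are listed separately, so they can be reviewed by hand.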