How to find duplicate filenames (recursively) in a given directory? BASH
Here is another solution (based on the suggestion by @jim-mcnamara) without awk:
Solution 1
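#!/bin/sh
dirname=/path/to/directory
# List basenames that occur more than once, then print every path containing each one
find "$dirname" -type f | sed 's_.*/__' | sort | uniq -d |
while IFS= read -r fileName
do
    # -F matches the name literally instead of as a regular expression
    find "$dirname" -type f | grep -F -- "/$fileName"
done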
However, this repeats the same find once for every duplicate name. That can become very slow if you have to search a lot of data. Saving the find results in a temporary file might give better performance.
Solution 2 (with temporary file)
#!/bin/sh
dirname=/path/to/directory
tempfile=myTempfileName
# Run find once and reuse the saved list for both passes
find "$dirname" -type f > "$tempfile"
sed 's_.*/__' "$tempfile" | sort | uniq -d |
while IFS= read -r fileName
do
    grep -F -- "/$fileName" "$tempfile"
done
#rm -f "$tempfile"
Since you might not want to write a temp file to the hard drive in some cases, you can choose the method that fits your needs. Both examples print the full path of each file.
Bonus question here: Is it possible to save the whole output of the find command as a list to a variable?
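On the bonus question: a plain command substitution can hold the whole find output as one newline-separated string. The following is only a sketch under a few assumptions: find_dupes is a hypothetical helper name, no file name contains a newline, and the listing is small enough to keep in memory. It compares the basename exactly with a case pattern, so x.txt does not also match x.txt.bak.

```sh
#!/bin/sh
# Hypothetical helper: print every path under $1 whose file name occurs more than once.
find_dupes() {
    # Run find once and keep the newline-separated result in a variable.
    files=$(find "$1" -type f)
    printf '%s\n' "$files" | sed 's_.*/__' | sort | uniq -d |
    while IFS= read -r fileName
    do
        # Re-read the saved list and keep paths whose last component is exactly $fileName.
        printf '%s\n' "$files" |
        while IFS= read -r path
        do
            case $path in
                */"$fileName") printf '%s\n' "$path" ;;
            esac
        done
    done
}
```

It would be called as find_dupes /path/to/directory. The nested loop is slow for very large trees, so the temporary-file variant may still be the better fit there.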
Yes, this is a really old question, but all those loops and temporary files seem a bit cumbersome.
Here's my 1-line answer:
find /PATH/TO/FILES -type f -printf '%p/ %f\n' | sort -k2 | uniq -f1 --all-repeated=separate
It has its limitations due to uniq and sort:
- no whitespace (space, tab) in filename (will be interpreted as new field by uniq and sort)
- needs file name printed as last field delimited by space (uniq doesn't support comparing only 1 field and is inflexible with field delimiters)
But it is quite flexible regarding its output thanks to find -printf and works well for me. Also seems to be what @yak tried to achieve originally.
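If spaces in file names are a concern, one possible workaround is to leave uniq out and group by a tab-delimited field in awk instead. This is a sketch, not part of the one-liner above: dupes_by_name is a hypothetical helper name, it assumes GNU find (for -printf), and it still breaks on names containing a tab or newline.

```sh
#!/bin/sh
# Hypothetical helper: print "name<TAB>path" for every file name that occurs more than once.
dupes_by_name() {
    # Emit the file name first, then the full path, separated by a tab.
    find "$1" -type f -printf '%f\t%p\n' | sort |
    awk -F '\t' '
        # Count each name and remember every line in input order.
        { count[$1]++; line[NR] = $0; name[NR] = $1 }
        # Afterwards, print only the lines whose name was seen more than once.
        END { for (i = 1; i <= NR; i++) if (count[name[i]] > 1) print line[i] }'
}
```

Called as dupes_by_name /PATH/TO/FILES, it lists duplicates grouped by name because the input is sorted on the name first.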
Demonstrating some of the options you have with this:
find /PATH/TO/FILES -type f -printf 'size: %s bytes, modified at: %t, path: %h/, file name: %f\n' | sort -k15 | uniq -f14 --all-repeated=prepend
Also there are options in sort and uniq to ignore case (as the topic opener intended to achieve by piping through tr). Look them up using man uniq or man sort.
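As a rough illustration of the case-insensitive variant, the sketch below adds -f to sort and -i to uniq. It assumes the GNU versions of find, sort and uniq, inherits the no-whitespace limitation described above, and wraps the pipeline in a hypothetical function named dupes_ignore_case.

```sh
#!/bin/sh
# Hypothetical helper: like the one-liner above, but Readme.txt and README.TXT count as duplicates.
dupes_ignore_case() {
    # sort -f folds case when ordering on the name field; uniq -i ignores case when comparing.
    find "$1" -type f -printf '%p/ %f\n' | sort -f -k2 | uniq -i -f1 --all-repeated=separate
}
```

Called as dupes_ignore_case /PATH/TO/FILES, it prints each group of matching names separated by a blank line.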