Find directories with lots of files in them
Check /lost+found in case there was a disk problem and a lot of junk ended up being detected as separate files, possibly wrongly.
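If you want a rough idea of how much has piled up there, a minimal sketch (assuming /lost+found exists on the filesystem you are inspecting):
# List everything under /lost+found on this filesystem and count it
find /lost+found -xdev | wc -l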
Check iostat to see if some application is still producing files like crazy.
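For example, something along these lines shows per-device activity refreshed every few seconds (the exact columns depend on your sysstat version); sustained heavy writes suggest a process is still churning out files:
# Extended device statistics, updated every 5 seconds
iostat -x 5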
find / -xdev -type d -size +100k
will tell you if there's a directory that uses more than 100kB of disk space. That would be a directory that contains a lot of files, or contained a lot of files in the past. You may want to adjust the size figure.
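For instance, raising the threshold narrows the output to only the most bloated directory inodes (1M here is an arbitrary choice; pick whatever suits your filesystem):
# Only report directories whose own inode takes more than 1MB
find / -xdev -type d -size +1M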
I don't think there's a combination of options to GNU du to make it count 1 per directory entry. You can do this by producing the list of files with find and doing a little bit of counting in awk. Here is a du for inodes. Minimally tested; it doesn't try to cope with file names containing newlines.
#!/bin/sh
find "$@" -xdev -depth | awk '{
    # depth = number of slashes in the path
    depth = $0; gsub(/[^\/]/, "", depth); depth = length(depth);
    if (depth < previous_depth) {
        # A non-empty directory: its predecessor was one of its files
        total[depth] += total[previous_depth];
        print total[previous_depth] + 1, $0;
        total[previous_depth] = 0;
    }
    ++total[depth];
    previous_depth = depth;
}
END { print total[0], "total"; }'
Usage: du-inodes /. It prints a list of non-empty directories with the total count of entries in them and their subdirectories, recursively. Redirect the output to a file and review it at your leisure.
sort -k1nr <root.du-inodes | head
will tell you the biggest offenders.
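For example (assuming the script above was saved as du-inodes and made executable; root.du-inodes is just the output file name used above):
# Build the per-directory counts once, then pick out the worst offenders
./du-inodes / >root.du-inodes 2>/dev/null
sort -k1nr <root.du-inodes | head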
You can check with this script:
#!/bin/bash
if [ $# -ne 1 ]; then
    echo "Usage: `basename $0` DIRECTORY"
    exit 1
fi
echo "Wait a moment if you want a good top of the bushy folders..."
find "$@" -type d -print0 2>/dev/null | while IFS= read -r -d '' file; do
    echo -e `ls -A "$file" 2>/dev/null | wc -l` "files in:\t $file"
done | sort -nr | head | awk '{print NR".", "\t", $0}'
exit 0
This prints the top 10 subdirectories by file count. If you want a top x, replace head with head -n x, where x is a natural number greater than 0.
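For example, to get a top 25 instead (25 is just an arbitrary choice), the last line of the pipeline becomes:
done | sort -nr | head -n 25 | awk '{print NR".", "\t", $0}'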
For fully reliable results, run this script with root privileges; otherwise directories your user cannot read are silently skipped because of the 2>/dev/null redirections.
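For instance (count-files.sh is a hypothetical name; use whatever you saved the script as):
# Run as root so no directory is skipped for lack of permissions
sudo ./count-files.sh /var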
Often faster than find, if your locate database is up to date:
# locate '' | sed 's|/[^/]*$|/|g' | sort | uniq -c | sort -n | tee filesperdirectory.txt | tail
This dumps the entire locate database, strips off everything past the last '/' in each path, and then sort and uniq -c give you the number of files/directories per directory. sort -n piped into tail gets you the ten directories with the most things in them.
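If the database might be stale, refresh it first; and since tee already saved the sorted counts to filesperdirectory.txt, you can dig deeper than the top ten later (25 here is an arbitrary choice):
# Refresh the locate database (needs root), then look at the 25 busiest directories
sudo updatedb
tail -n 25 filesperdirectory.txt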