Find directories that DON'T contain a file
Case 1: You know the exact file name to look for
Use find with test -e your_file to check whether a file exists. For example, to look for directories which have no cover.jpg in them:
find base_dir -mindepth 2 -maxdepth 2 -type d '!' -exec test -e "{}/cover.jpg" ';' -print
It's case sensitive though.
Case 2: You want to be more flexible
You're not sure of the case, and the extension might be jPg, png ...
find base_dir -mindepth 2 -maxdepth 2 -type d '!' -exec sh -c 'ls -1 "{}"|egrep -i -q "^cover\.(jpg|png)$"' ';' -print
Explanation:
- You need to spawn a shell (sh) for each directory, since piping isn't possible when using find.
- ls -1 "{}" outputs just the filenames of the directory find is currently traversing.
- egrep (instead of grep) uses extended regular expressions; -i makes the search case insensitive, and -q makes it omit any output.
- "^cover\.(jpg|png)$" is the search pattern. In this example, it matches e.g. cOver.png, Cover.JPG or cover.png. The . must be escaped, otherwise it matches any character. ^ marks the start of the line, $ its end.
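One caveat with the command above: the directory name is pasted into the sh -c script text, so a name containing quotes or $(...) can break it. A minimal sketch of the same check that passes the directories as positional parameters instead is below. The function name dirs_without_cover is made up for illustration, and grep -E is used as the equivalent of egrep.

```shell
# Hypothetical helper: list second-level directories under $1 that contain
# no cover.jpg / cover.png (any letter case).
# Usage: dirs_without_cover base_dir
dirs_without_cover() {
  find "$1" -mindepth 2 -maxdepth 2 -type d -exec sh -c '
    # The directories arrive as "$@", so odd names are never parsed as shell code.
    for dir in "$@"; do
      ls -1 "$dir" | grep -Eiq "^cover\.(jpg|png)$" || printf "%s\n" "$dir"
    done' sh {} +
}
```

The {} + form hands many directories to one sh process, so this also spawns far fewer shells than one per directory.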
Other search pattern examples for egrep:
Substitute the egrep -i -q "^cover\.(jpg|png)$" part with:
- egrep -i -q "cover\.(jpg|png)$": also matches cd_cover.png, album_cover.JPG ...
- egrep -q "^cover\.(jpg|png)$": matches cover.png, cover.jpg, but NOT Cover.jpg (case sensitivity is not turned off)
- egrep -iq "^(cover|front)\.jpg$": matches e.g. front.jpg, Cover.JPG but not Cover.PNG
For more info on this, check out Regular Expressions.
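If you want to try a pattern before running it over a whole collection, a small sketch like this feeds a few sample names through grep. try_pattern is a hypothetical helper name; it uses grep -E, which is equivalent to egrep, and always passes -i.

```shell
# Hypothetical helper: try_pattern PATTERN NAME...
# Reports, for each sample file name, whether the case-insensitive
# extended regex PATTERN matches it.
try_pattern() {
  pattern=$1
  shift
  for name in "$@"; do
    if printf '%s\n' "$name" | grep -Eiq -- "$pattern"; then
      printf '%s: match\n' "$name"
    else
      printf '%s: no match\n' "$name"
    fi
  done
}

try_pattern '^cover\.(jpg|png)$' cOver.png Cover.JPG cd_cover.png coverXjpg
# cOver.png: match
# Cover.JPG: match
# cd_cover.png: no match
# coverXjpg: no match
```

The last sample shows why the . is escaped: unescaped, it would also accept coverXjpg.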
Simple, it transpires. The following gets a list of directories with the cover and compares that with a list of all the second-level directories. Lines that appear in both "files" are suppressed, leaving a list of directories that need covers.
comm -3 \
<(find ~/Music/ -iname 'cover.*' -printf '%h\n' | sort -u) \
<(find ~/Music/ -maxdepth 2 -mindepth 2 -type d | sort) \
| sed 's/^.*Music\///'
Hooray.
Notes:
- comm's arguments are as follows:
  - -1 suppress lines unique to file1
  - -2 suppress lines unique to file2
  - -3 suppress lines that appear in both files
- comm only takes files, hence the kooky <(...) input method. Bash hands comm a file name (typically a /dev/fd entry or a named pipe) that reads from the command's output; no regular temporary file is written.
- comm needs sorted input or it doesn't work, and find by no means guarantees an order. The input also needs to be unique. The first find operation could find multiple files for cover.*, so there could be duplicate entries; sort -u quickly whittles those down to one. The second find is always going to be unique.
- dirname is a handy tool for getting a file's dir without resorting to sed (et al). The command above gets the same result from find's -printf '%h\n'.
- find and comm are both a bit messy with their output. The final sed is there to clean things up so you're left with Artist/Album. This may or may not be desirable for you.
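As a variation, here is a rough Bash sketch that wraps the same idea in a function taking the music directory as an argument. It is not the exact command above: the name albums_missing_cover is made up, comm -13 replaces comm -3 so that only directories lacking a cover are printed (and without the leading tab comm puts on its second column), and LC_ALL=C is set on the assumption that sort and comm should agree on ordering. Like the original, it relies on GNU find's -printf.

```shell
# Hypothetical helper (Bash, GNU find): print Artist/Album for every
# second-level directory under $1 that has no cover.* file in it.
albums_missing_cover() (
  cd "$1" || exit 1      # subshell body: the caller's directory is untouched
  export LC_ALL=C        # sort and comm must use the same collation
  comm -13 \
    <(find . -iname 'cover.*' -printf '%h\n' | sort -u) \
    <(find . -mindepth 2 -maxdepth 2 -type d | sort) |
    sed 's|^\./||'       # drop the leading ./ that find adds
)
```

Called as albums_missing_cover ~/Music, it should give the same Artist/Album lines as the pipeline above, minus any entries for covers found in unexpected places.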
This is much nicer to solve with globbing than with find.
$ cd ... # to the directory one level above the album/artist structure
$ echo */*/cover.* # lists all the covers
$ printf "%s\n" */*/cover.* # lists all the covers, one per line
Now suppose you have no stray files in this nice structure. The current directory contains only artist subdirectories, and those contain only album subdirectories. Then we can do something like this:
$ diff <(for x in */*/cover.jpg; do echo "$(dirname "$x")" ; done) <(printf "%s\n" */*)
The <(...) syntax is Bash process substitution: it lets you use a command in place of a file argument, treating the command's output as a file. So we can run two programs and take their diff without saving their output in temporary files. The diff program thinks it is working with two files, but in fact it's reading from two pipes.
The command that produces the right-hand input to diff, printf "%s\n" */*, just lists the album directories. The left-hand command iterates through the */*/cover.jpg paths and prints their directory names.
Test run:
$ find . # let's see what we have here
.
./a
./a/b
./foo
./foo/bar
./foo/baz
./foo/baz/cover.jpg
$ diff <(for x in */*/cover.jpg; do echo "$(dirname "$x")" ; done) <(printf "%s\n" */*)
0a1,2
> a/b
> foo/bar
Aha, the a/b and foo/bar directories have no cover.jpg.
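If you would rather have a bare list than raw diff output, one possible sketch is to keep only the lines diff marks with ">". The function name missing_from_diff is hypothetical, and it assumes the same tidy artist/album layout described above.

```shell
# Hypothetical helper (Bash): run the diff from above inside directory $1
# and print only the album directories that appear in the right-hand list.
missing_from_diff() (
  cd "$1" || exit 1      # subshell body: the caller's directory is untouched
  diff <(for x in */*/cover.jpg; do dirname "$x"; done) \
       <(printf '%s\n' */*) |
    sed -n 's/^> //p'    # "> " lines exist only in the list of all albums
)
```

On the test tree above, missing_from_diff . should print a/b and foo/bar, one per line.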
There are some broken corner cases: for example, by default a pattern containing * expands to itself if it matches nothing. This can be addressed with Bash's shopt -s nullglob.
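A minimal sketch of a nullglob-based variant follows, assuming the same artist/album layout. It drops diff entirely and tests each album directory directly; albums_without_cover is a made-up name.

```shell
# Hypothetical helper (Bash): print each artist/album directory under $1
# that has no cover.jpg.
albums_without_cover() (
  cd "$1" || exit 1      # subshell body: cd and shopt do not leak to the caller
  shopt -s nullglob      # unmatched globs now expand to nothing
  for dir in */*/; do    # trailing slash: match directories only
    [ -e "${dir}cover.jpg" ] || printf '%s\n' "${dir%/}"
  done
)
```

With nullglob set, an empty collection simply produces no output, where the unexpanded pattern would otherwise be reported as a directory without a cover.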