Find directories that DON'T contain a file

Case 1: You know the exact file name to look for

Use find with test -e your_file to check whether a file exists. For example, to list the directories that have no cover.jpg in them:

find base_dir -mindepth 2 -maxdepth 2 -type d '!' -exec test -e "{}/cover.jpg" ';' -print

It's case sensitive though.

Case 2: You want to be more flexible

You're not sure of the case, and the extension might be jPg, png...

find base_dir -mindepth 2 -maxdepth 2 -type d '!' -exec sh -c 'ls -1 "{}"|egrep -i -q "^cover\.(jpg|png)$"' ';' -print

Explanation:

  • find's -exec runs a command directly, without a shell, so you need to spawn a shell (sh) for each directory in order to use a pipe
  • ls -1 "{}" outputs just the filenames of the directory find is currently traversing
  • egrep (instead of grep) uses extended regular expressions; it is equivalent to grep -E, and recent GNU grep releases warn that egrep is obsolescent. -i makes the search case insensitive, -q makes it omit any output
  • "^cover\.(jpg|png)$" is the search pattern. In this example, it matches e.g. cOver.png, Cover.JPG or cover.png. The . must be escaped otherwise it means that it matches any character. ^ marks the start of the line, $ its end

Other search pattern examples for egrep:

Substitute the egrep -i -q "^cover\.(jpg|png)$" part with:

  • egrep -i -q "cover\.(jpg|png)$" : Also matches cd_cover.png, album_cover.JPG ...
  • egrep -q "^cover\.(jpg|png)$" : Matches cover.png, cover.jpg, but NOT Cover.jpg (case sensitivity is not turned off)
  • egrep -iq "^(cover|front)\.jpg$" : matches e.g. front.jpg, Cover.JPG but not Cover.PNG

For more info on this, check out Regular Expressions.
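
One thing to watch with the Case 2 command: "{}" is pasted into the text of the sh -c script, so a directory name containing quotes or a $ sign could be interpreted as shell code. A minimal sketch of a more defensive variant follows, assuming a throwaway demo tree (the directory names are made up) and using grep -E in place of egrep. The directory is handed to sh as the positional parameter $1 instead.

```shell
#!/usr/bin/env bash
# Hypothetical demo tree; the names below are made up for illustration.
base_dir=$(mktemp -d)
mkdir -p "$base_dir/artist/with_cover" \
         "$base_dir/artist/no_cover" \
         "$base_dir/artist/odd \"name\""
touch "$base_dir/artist/with_cover/Cover.JPG"

# The directory is passed to sh as $1 rather than spliced into the script
# text, so special characters in its name are never parsed as shell code.
# The extra "sh" after the script fills $0.
missing=$(find "$base_dir" -mindepth 2 -maxdepth 2 -type d \
    ! -exec sh -c 'ls -1 "$1" | grep -Eiq "^cover\.(jpg|png)$"' sh {} \; \
    -print | sort)

# Prints the two directories that have no cover file.
printf '%s\n' "$missing"
rm -rf "$base_dir"
```

The search patterns listed above can be dropped into the grep -Eiq part unchanged.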


Simple, it transpires. The following gets a list of directories with the cover and compares that with a list of all the second-level directories. Lines that appear in both "files" are suppressed, leaving a list of directories that need covers.

comm -3 \
    <(find ~/Music/ -iname 'cover.*' -printf '%h\n' | sort -u) \
    <(find ~/Music/ -maxdepth 2 -mindepth 2 -type d | sort) \
| sed 's/^.*Music\///'

Hooray.

Notes:

  • comm's arguments are as follows:

    • -1 suppress lines unique to file1
    • -2 suppress lines unique to file2
    • -3 suppress lines that appear in both files
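    • comm only takes file names, hence the kooky <(...) input method (Bash process substitution). Bash hands comm a path such as /dev/fd/63 that is connected to the command's output through a pipe (or a named pipe on systems without /dev/fd); no temporary file is written to disk.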

  • comm needs sorted input or it doesn't work, and find by no means guarantees an order. The input also needs to be free of duplicates. The first find operation could find multiple files for cover.* so there could be duplicate entries. sort -u quickly whittles those down to one. The second find is always going to be unique.

  • dirname is a handy tool for getting a file's dir without resorting to sed (et al); here find's -printf '%h\n' does the same job directly (-printf is a GNU find extension).

  • find and comm are both a bit messy with their output. The final sed is there to clean things up so you're left with Artist/Album. This may or may not be desirable for you.
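
To make the column flags in the notes concrete, here is a small sketch with two hand-written sorted lists standing in for the two find outputs (the album names are made up). It uses comm -13 rather than -3: with -3 alone, comm still prints lines unique to either input, with those from the second input indented by a tab, whereas -13 keeps only the lines unique to the second input. Whether that is preferable depends on whether you also want to see cover files sitting outside the second level.

```shell
#!/usr/bin/env bash
# Two small, already sorted lists standing in for the find outputs
# (hypothetical album names).
with_cover=$(printf '%s\n' 'A/one' 'B/two')
all_albums=$(printf '%s\n' 'A/one' 'A/three' 'B/two' 'C/four')

# -1 drops lines found only in the first input, -3 drops lines found in both,
# so what remains is unique to the second input: albums without a cover.
# LC_ALL=C makes comm compare in plain byte order, matching the lists above.
need_cover=$(LC_ALL=C comm -13 \
    <(printf '%s\n' "$with_cover") \
    <(printf '%s\n' "$all_albums"))

printf '%s\n' "$need_cover"
```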


This is much nicer to solve with globbing than with find.

$ cd ... # to the directory one level above the album/artist structure

$ echo */*/cover.*   # lists all the covers

$ printf "%s\n" */*/cover.*   # lists all the covers, one per line

Now suppose you have no stray files in this nice structure. The current directory contains only artist subdirectories, and those contain only album subdirectories. Then we can do something like this:

$ diff  <(for x in */*/cover.jpg; do echo "$(dirname "$x")" ; done) <(printf "%s\n" */*)

The <(...) syntax is Bash process substitution: it lets you pass the output of a command where a file argument is expected. So we can run two programs and take their diff without saving their output in temporary files. The diff program thinks it is working with two files, but in fact it's reading from two pipes.
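
A quick way to see what diff actually receives is to echo a process substitution instead of reading from it. This is only an illustrative sketch; the exact path Bash prints varies by system.

```shell
#!/usr/bin/env bash
# A process substitution expands to a file name that the command can open.
# On Linux this is typically something like /dev/fd/63.
path=$(echo <(true))
echo "$path"

# diff reads both inputs through those paths; nothing is saved to disk.
# diff exits with status 1 when the inputs differ, hence the "|| true".
out=$(diff <(printf '%s\n' a b) <(printf '%s\n' a b c)) || true
echo "$out"
```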

The command that produces the right hand input to diff, printf "%s\n" */*, just lists the album directories. The left hand command iterates through the */*/cover.jpg paths and prints their directory names.

Test run:

$ find .   # let's see what we have here
.
./a
./a/b
./foo
./foo/bar
./foo/baz
./foo/baz/cover.jpg

$ diff  <(for x in */*/cover.jpg; do echo "$(dirname "$x")" ; done) <(printf "%s\n" */*)
0a1,2
> a/b
> foo/bar

Aha, the a/b and foo/bar directories have no cover.jpg.
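
If stray files might be present, a plain loop is one alternative to the diff. This is a sketch rather than part of the approach above: it rebuilds the same test tree in a temporary directory and relies on the trailing slash in */*/ so that the glob only matches directories.

```shell
#!/usr/bin/env bash
# Recreate the test tree from the run above in a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/a/b" "$demo/foo/bar" "$demo/foo/baz"
touch "$demo/foo/baz/cover.jpg"
cd "$demo" || exit 1

# */*/ matches second-level directories only; each match ends in a slash,
# which is stripped again with ${d%/} when printing.
no_cover=$(for d in */*/; do
    [ -e "${d}cover.jpg" ] || printf '%s\n' "${d%/}"
done)

printf '%s\n' "$no_cover"
cd - >/dev/null
rm -rf "$demo"
```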

There are some broken corner cases, such as the fact that by default a glob pattern expands to itself if it matches nothing. This can be addressed with Bash's shopt -s nullglob.
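
The difference is easy to check in an empty directory. The sketch below assumes Bash with default options (in particular, failglob is off).

```shell
#!/usr/bin/env bash
# Work in an empty throwaway directory so the pattern cannot match anything.
empty_dir=$(mktemp -d)
cd "$empty_dir" || exit 1

# Default behaviour: the unmatched pattern is passed through literally.
without=$(printf '%s\n' */*/cover.jpg)

# With nullglob, an unmatched pattern expands to nothing at all.
shopt -s nullglob
matches=(*/*/cover.jpg)
with_count=${#matches[@]}
shopt -u nullglob

echo "$without"
echo "$with_count"
cd - >/dev/null
rm -rf "$empty_dir"
```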
