How do you diff a directory for only files of a specific type?

You can also use find with -exec to call diff:

cd /destination/dir/1
find . -name '*.xml' -exec diff {} /destination/dir/2/{} \;

The lack of a complementary --include ... .

One workaround is an exclude file that lists every file except the ones we want to include. We create file1 by using find to list all files whose extensions we do not want to include and sed to keep just the file name, and then the command is simply:

diff --exclude-from=file1  PATH1/ PATH2/

For example:

find PATH1/ -type f | grep --text -vP '\.(php|html)$' | sed 's/.*\///' | sort -u > file1
diff -rq -X file1 PATH1/ PATH2/
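The same workaround can be sketched as a reusable function. This is only an illustration under a few assumptions: the name diff_only is hypothetical, the exclude list is built from both trees so that unwanted files present only in the second directory are skipped too, and each base name ends up in the file as a shell pattern, so names containing *, ? or [ may match more than intended.

```shell
#!/bin/sh
# diff_only DIR1 DIR2 EXT...   (hypothetical helper, for illustration only)
# Recursively compare DIR1 and DIR2, reporting only files whose names end
# in one of the given extensions.
diff_only() {
    dir1=$1 dir2=$2
    shift 2
    # Build an extended regex such as '\.(php|html)$' from the extensions.
    alternation=$(printf '%s|' "$@")
    regex="\\.(${alternation%|})\$"
    exclude=$(mktemp) || return 2
    # List the files of both trees, drop the ones we want to keep, and
    # reduce the rest to unique base names for diff to exclude.
    find "$dir1" "$dir2" -type f |
        grep -Ev "$regex" |
        sed 's|.*/||' |
        sort -u > "$exclude"
    diff -rq -X "$exclude" "$dir1" "$dir2"
    rc=$?
    rm -f "$exclude"
    return "$rc"
}
```

For example, diff_only PATH1/ PATH2/ php html reports only differing or missing .php and .html files. A directory that happens to share its name with an excluded file would be skipped as well.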

Taken from (a version of) the man page:

-x PAT  --exclude=PAT
  Exclude files that match PAT.

-X FILE    --exclude-from=FILE
  Exclude files that match any pattern in FILE.

So it looks like -x accepts only one pattern, as you report, but if you put all the patterns you want to exclude in a file (presumably one per line), you can use the second flag like so:

$ diff /destination/dir/1 /destination/dir/2 -r -X exclude.pats

where exclude.pats is:

*.jpg
*.JPG
*.xml
*.XML
*.png
*.gif
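Because the entries are shell patterns, a bracket expression can cover the upper- and lower-case spellings of an extension in one line. A small sketch, assuming a hypothetical helper named diff_skip_media that writes the pattern file to a temporary location and then runs the same diff -r -X command:

```shell
#!/bin/sh
# diff_skip_media DIR1 DIR2   (hypothetical helper, for illustration only)
# Recursively compare two directories while skipping image and XML files.
diff_skip_media() {
    pats=$(mktemp) || return 2
    # One shell pattern per line; [jJ] matches either case of the letter.
    cat > "$pats" <<'EOF'
*.[jJ][pP][gG]
*.[xX][mM][lL]
*.png
*.gif
EOF
    diff -r -X "$pats" "$1" "$2"
    rc=$?
    rm -f "$pats"
    return "$rc"
}
```

Unlike the explicit list, *.[jJ][pP][gG] also matches mixed-case names such as photo.Jpg, which may or may not be what you want.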

You can specify -x more than once.

diff -x '*.foo' -x '*.bar' -x '*.baz' /destination/dir/1 /destination/dir/2

From the Comparing Directories section of info diff (on my system, I have to do info -f /usr/share/info/diff.info.gz):

To ignore some files while comparing directories, use the '-x PATTERN' or '--exclude=PATTERN' option. This option ignores any files or subdirectories whose base names match the shell pattern PATTERN. Unlike in the shell, a period at the start of the base of a file name matches a wildcard at the start of a pattern. You should enclose PATTERN in quotes so that the shell does not expand it. For example, the option -x '*.[ao]' ignores any file whose name ends with '.a' or '.o'.

This option accumulates if you specify it more than once. For example, using the options -x 'RCS' -x '*,v' ignores any file or subdirectory whose base name is 'RCS' or ends with ',v'.
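Since the option accumulates, a list of patterns can be turned into repeated -x options programmatically. A minimal sketch in POSIX sh; the function name diff_x and its argument order are assumptions for illustration:

```shell
#!/bin/sh
# diff_x DIR1 DIR2 PATTERN...   (hypothetical helper, for illustration only)
# Recursively compare DIR1 and DIR2, passing each PATTERN to diff as its
# own -x option.
diff_x() {
    dir1=$1 dir2=$2
    shift 2
    # Rewrite the argument list in place: each pass appends "-x PATTERN"
    # and drops the original pattern from the front.
    for pat in "$@"; do
        set -- "$@" -x "$pat"
        shift
    done
    diff -r "$@" "$dir1" "$dir2"
}
```

With this, diff_x /destination/dir/1 /destination/dir/2 'RCS' '*,v' runs the equivalent of diff -r -x 'RCS' -x '*,v' on the two directories. The patterns must still be quoted at the call site so the shell does not expand them.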
