Reformatting a large number of XML files
This can be done from find directly using -exec:
find . -name "*.xml" -type f -exec xmllint --output '{}' --format '{}' \;
The command passed to -exec is invoked once per file found, with each {} placeholder replaced by the current file name. The \; at the end terminates the -exec command; the backslash keeps the shell from interpreting the semicolon itself.
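As a rough illustration of the once-per-file behaviour, the sketch below counts invocations in a scratch directory. Here echo stands in for xmllint and the file names are made up. The + terminator is not used in the answer above; it is shown only to contrast with \;, since it batches file names into as few invocations as possible.

```shell
#!/bin/sh
# Scratch directory with three XML files (hypothetical names for the demo).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.xml" "$demo_dir/b.xml" "$demo_dir/c.xml"

# Terminated by \; the command runs once per file: one output line per file.
per_file=$(find "$demo_dir" -name "*.xml" -type f -exec echo run: '{}' \; | wc -l | tr -d ' ')

# Terminated by + the file names are batched: one output line for all three.
batched=$(find "$demo_dir" -name "*.xml" -type f -exec echo run: '{}' + | wc -l | tr -d ' ')

echo "invocations when terminated by semicolon: $per_file"
echo "invocations when terminated by plus: $batched"
rm -rf "$demo_dir"
```

Because xmllint needs the same single file name for both --output and --format, the once-per-file form is the one that fits this task.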
Using xargs isn't really necessary in this case, because xmllint has to be invoked once per file: both the input and the output file name must be specified within the same call. xargs is useful when the command being piped to from find can work on multiple files at a time and the list is long; without it you could end up with an "Argument list too long" error when processing a lot of files. That doesn't apply here, as you need to pass a single file name to the --output option of xmllint. xargs also supports replacement strings with the -I option:
find . -name "*.xml" -type f | xargs -I'{}' xmllint --output '{}' --format '{}'
This does the same as the find -exec command above. If any of your folders contain odd characters such as spaces, you will need to use the -print0 option of find together with the -0 option of xargs. However, using xargs with -I implies the option -L 1, which means it processes only one file at a time anyway, so you may as well use find with -exec directly.
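To make the point about odd characters concrete, here is a minimal sketch that compares a plain pipe with a NUL-delimited one. It assumes a find and xargs that support -print0 and -0 (GNU and BSD versions do); the file names and the small test -f stand-in for xmllint are made up for the demo.

```shell
#!/bin/sh
# Scratch directory with one awkward file name (names are made up for the demo).
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.xml" "$demo_dir/with space.xml"

# Stand-in for xmllint: print "ok" only if the argument is an existing file.
check='test -f "$1" && echo ok'

# Whitespace-delimited input: "with space.xml" is split into two arguments,
# neither of which names a real file.
split=$(find "$demo_dir" -name "*.xml" -type f | xargs -n 1 sh -c "$check" sh | wc -l | tr -d ' ')

# NUL-delimited input: every file name arrives intact as one argument.
safe=$(find "$demo_dir" -name "*.xml" -type f -print0 | xargs -0 -n 1 sh -c "$check" sh | wc -l | tr -d ' ')

echo "files recognised without -print0/-0: $split"
echo "files recognised with -print0/-0: $safe"
rm -rf "$demo_dir"

# Applied to the xargs command above, the same idea would read:
# find . -name "*.xml" -type f -print0 | xargs -0 -I'{}' xmllint --output '{}' --format '{}'
```

The find -exec form avoids the issue entirely, since find hands each file name to the command as a single argument without going through a pipe.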
I typically attack these problems with a layer of indirection. Write a shell script that does what you want, and call that. I'd suggest as a start:
#! /bin/sh
for file
do
    # Format into a temporary file; replace the original only if xmllint succeeds
    xmllint --format "$file" > "$file.tmp" && mv "$file.tmp" "$file"
done
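In case the bare "for file" looks odd: with no "in ..." list, the loop iterates over the script's positional parameters, so the script accepts any number of file names. A small sketch of that behaviour, using a hypothetical function list_args and made-up file names:

```shell
#!/bin/sh
# "for file" with no "in ..." list loops over the positional parameters,
# exactly as "for file in "$@"" would.
list_args() {
    for file
    do
        printf '<%s>\n' "$file"
    done
}

# Two arguments, one of them containing a space; each is visited once, intact.
list_args "a.xml" "dir with space/b.xml"
```

This is why xargs can pass the script many file names in one call, as long as each name reaches it as a single argument.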
Then try it out on a file or two by hand; once it works, you can use it as the command for xargs:
find . -name "*.xml" -type f | xargs -- xmltidy.sh