How can I diff two XML files?
One approach would be to first turn both XML files into Canonical XML, and compare the results using diff
. For example, xmllint can be used to canonicalize XML.
$ xmllint --c14n one.xml > 1.xml
$ xmllint --c14n two.xml > 2.xml
$ diff 1.xml 2.xml
Or as a one-liner.
$ diff <(xmllint --c14n one.xml) <(xmllint --c14n two.xml)
Jukka's answer did not work for me, but it did point to Canonical XML. Neither --c14n nor --c14n11 sorted the attributes, but i did find the --exc-c14n switch did sort the attributes. --exc-c14n is not listed in the man page, but described on the command line as "W3C exclusive canonical format".
$ xmllint --exc-c14n one.xml > 1.xml
$ xmllint --exc-c14n two.xml > 2.xml
$ diff 1.xml 2.xml
$ xmllint | grep c14
--c14n : save in W3C canonical format v1.0 (with comments)
--c14n11 : save in W3C canonical format v1.1 (with comments)
--exc-c14n : save in W3C exclusive canonical format (with comments)
$ rpm -qf /usr/bin/xmllint
libxml2-2.7.6-14.el6.x86_64
libxml2-2.7.6-14.el6.i686
$ cat /etc/system-release
CentOS release 6.5 (Final)
Warning --exc-c14n strips out the xml header whereas the --c14n prepends the xml header if not there.
Tried to use @Jukka Matilainen's answer but had problems with white-space (one of the files was a huge one-liner).
Using --format
helps to skip white-space differences.
xmllint --format one.xml > 1.xml
xmllint --format two.xml > 2.xml
diff 1.xml 2.xml
Note: Use vimdiff
command for side-by-side comparison of the xmls.