dd vs cat -- is dd still relevant these days?
On the surface, dd is a tool from an IBM operating system that has retained its foreign look (its parameter passing) and performs some very rarely used functions (such as EBCDIC to ASCII conversion or endianness reversal… not a common need nowadays).
I used to think that dd was faster for copying large blocks of data on the same disk (due to more efficient use of buffering), but this isn't true, at least on today's Linux systems.
I think some of dd's options are useful when dealing with tapes, where reading is really performed in blocks (tape drivers don't hide the blocks on the storage medium the way disk drivers do). But I don't know the specifics.
One thing dd can do that can't (easily) be done by any other POSIX tool is taking the first N bytes of a stream. Many systems can do it with head -c 42, but head -c, while common, isn't in POSIX (and isn't available today on e.g. OpenBSD). (tail -c is POSIX.) Also, even where head -c exists, it might read too many bytes from the source (because it uses stdio buffering internally), which is a problem if you're reading from a special file where just reading has an effect. (Current GNU coreutils read the exact count with head -c, but FreeBSD and NetBSD use stdio.)
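To illustrate the point about taking the first N bytes without over-reading, here is a minimal sketch. It assumes a POSIX shell and uses bs=1 so that dd issues one-byte reads; the variable names and sample input are made up for the example.

```shell
# Take exactly the first 4 bytes of a stream.
# bs=1 makes dd read one byte at a time, so it never consumes
# more than count bytes from the pipe.
first=$(printf 'abcdefghij' | dd bs=1 count=4 2>/dev/null)
printf 'first: %s\n' "$first"

# Because dd stopped after 4 bytes, a later reader on the same
# pipe still sees everything that follows.
rest=$(printf 'abcdefghij' | { dd bs=1 count=4 >/dev/null 2>&1; cat; })
printf 'rest:  %s\n' "$rest"
```

With a larger bs and count=1, dd may return fewer bytes than requested when reading from a pipe, which is why this sketch sticks to bs=1 at the cost of speed.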
More generally, dd gives an interface to the underlying file API that is unique amongst Unix tools: only dd can overwrite or truncate a file at any point or seek in a file. (This is dd's unique ability, and it's a big one; oddly enough dd is best known for things that other tools can do.)
- Most Unix tools overwrite their output file, i.e. erase its contents and start over from scratch. This is also what happens when you use > redirection in the shell.
- You can append to a file's contents with >> redirection in the shell, or with tee -a.
- If you want to shorten a file by removing all data after a certain point, this is supported by the underlying kernel and C API through the truncate function, but not exposed by any command line tool except dd:
    dd if=/dev/null of=/file/to/truncate seek=1 bs=123456 # truncate file to 123456 bytes
- If you want to overwrite data in the middle of a file, again, this is possible in the underlying API by opening the file for writing without truncating (and calling lseek to move to the desired position if necessary), but only dd can open a file without truncating or appending, or seek from the shell (more complex example).
    # zero out the second kB block in the file (i.e. bytes 1024 to 2047)
    dd if=/dev/zero of=/path/to/file bs=1024 seek=1 count=1 conv=notrunc
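The truncate and overwrite-in-place cases above can be tried safely on a scratch file. This is only a sketch: the file name, contents and offsets are invented for the demonstration, and mktemp, while widely available, is not itself in POSIX.

```shell
tmp=$(mktemp -d)
f="$tmp/demo"
printf 'abcdefghij' > "$f"

# Truncate to 6 bytes: seek past one 6-byte block, then copy nothing.
dd if=/dev/null of="$f" bs=6 seek=1 2>/dev/null
after_truncate=$(cat "$f")
printf 'after truncate:  %s\n' "$after_truncate"

# Overwrite the bytes at offsets 2 and 3 in place;
# conv=notrunc keeps the data after the written range.
printf 'XY' | dd of="$f" bs=1 seek=2 conv=notrunc 2>/dev/null
after_overwrite=$(cat "$f")
printf 'after overwrite: %s\n' "$after_overwrite"

rm -rf "$tmp"
```

Leaving out conv=notrunc in the second command would cut the file off right after the written bytes, which is the same mechanism the truncation command relies on.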
So… As a system tool, dd is pretty much useless. As a text (or binary file) processing tool, it's quite valuable!
No one has yet mentioned that you can use dd to create sparse files, though truncate can also be used for the same purpose.
dd if=/dev/zero of=sparse-file bs=1 count=1 seek=10GB
This is almost instant and creates an arbitrarily large file that can be used as a loopback file, for instance:
# run as root
loop=$(losetup --show -f sparse-file)
mkfs.ext4 "$loop"
mkdir myloop
mount "$loop" myloop
The nice thing is that it initially only uses a single block of disk space, and thereafter grows only as needed (ext4 formatting of a 10GB file consumes 291 MB on my system). Use du to see how much disk space is actually used; ls reports only the apparent size, i.e. the maximum size the file may grow to.
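The difference between apparent size and allocated space can be checked on a small scratch file. This is a sketch under assumptions: the 1 MiB offset and file name are arbitrary, and the allocated figure depends on the filesystem (one without sparse-file support would allocate the whole range).

```shell
tmp=$(mktemp -d)
f="$tmp/sparse"

# Seek 1 MiB into a new file and write a single zero byte there.
dd if=/dev/zero of="$f" bs=1 count=1 seek=1048576 2>/dev/null

# Apparent size: what ls -l would report.
apparent=$(wc -c < "$f" | tr -d ' ')
# Allocated space in KiB: what du reports.
used_kb=$(du -k "$f" | cut -f1)

printf 'apparent size: %s bytes\n' "$apparent"
printf 'allocated:     %s KiB\n' "$used_kb"

rm -rf "$tmp"
```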
The dd command includes LOTS of options that cat is not able to accommodate. Perhaps in your use cases cat is a workable substitute, but it is not a dd replacement.
One example would be using dd to copy part of something but not the whole thing. Perhaps you want to rip out some of the bits from the middle of an ISO image, or the partition table from a hard drive, based on a known location on the device. With dd you can specify where to start reading (skip=), where to start writing (seek=) and how much to copy (count=, in units of bs=), which makes these actions possible.
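As a rough illustration of pulling a range out of the middle of something, the sketch below uses a ten-byte scratch file instead of a real device. The file name and offsets are made up; the commented-out line shows the same idea for the 64-byte MBR partition table at offset 446, with /dev/sdX as a placeholder device name.

```shell
tmp=$(mktemp -d)
printf '0123456789' > "$tmp/data"

# Skip the first 3 bytes of the input, then copy the next 4.
middle=$(dd if="$tmp/data" bs=1 skip=3 count=4 2>/dev/null)
printf 'middle: %s\n' "$middle"

# Same pattern on a disk (placeholder device, needs root):
# dd if=/dev/sdX of=partition-table.bin bs=1 skip=446 count=64

rm -rf "$tmp"
```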
These options of dd make it indispensable for fine-grained data manipulation, whereas cat* can only operate on whole file objects, devices or streams.
*As noted by Gilles in the comments, it is possible to combine cat with other tools to isolate parts of something, but cat still operates on the whole object.