Why does dd take so long?

dd has many (weird) options, see dd(1).

You should state the buffer size explicitly, so try:

dd if=/dev/sda of=/dev/sdb bs=16M

IIRC, the default buffer size is only 512 bytes. The command above sets it to 16 megabytes. You could try something smaller (e.g. bs=1M), but you should use more than the default, especially on recent disk hardware with 4 KB sectors (i.e. Advanced Format). I naively recommend a power of two that is at least a megabyte.
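
To see the effect of bs without touching a real disk, a rough sketch like the following times dd on a scratch file. The 64 MiB file size and the list of block sizes are arbitrary choices of mine, and since the file will most likely be served from the page cache, the numbers mainly reflect per-call overhead rather than real disk speed.

```shell
#!/bin/sh
# Rough sketch: read one scratch file with several block sizes and print
# dd's own summary line for each. No real disk is touched.
src=$(mktemp)

# Create 64 MiB of zeros (size chosen arbitrarily for illustration).
dd if=/dev/zero of="$src" bs=1048576 count=64 2>/dev/null

for bs in 512 4096 65536 1048576 16777216; do
    printf 'bs=%s: ' "$bs"
    # dd writes its statistics to stderr; the last line is the summary.
    dd if="$src" of=/dev/null bs="$bs" 2>&1 | tail -n 1
done

# Record the file size as a sanity check, then clean up.
src_size=$(wc -c < "$src" | tr -d ' ')
rm -f "$src"
```

Block sizes are given in plain bytes here so the same script works with dd implementations that spell the size suffixes differently.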

With the default 512-byte buffer size, I guess (but I could be very wrong) that the hardware requires the kernel to transfer 4K for each 512-byte block.

Regarding rdsk, the sd(4) man page says:

At this time, only block devices are provided. Raw devices have not yet been implemented.

Increasing dd's buffer size improves performance for both read and write operations. All current disks have a hardware read/write buffer. But if you make dd's buffer larger than the hardware buffer, performance drops again, because dd only reads the next block from the first disk once the second disk has written out everything in its own hardware buffer. You need to pick the bs value separately for each set of devices.


Years back in Unix-land dd was the required way to copy a block device. That has carried forward as cargo-cult knowledge even though (on Linux-based systems, at least) cat is almost always faster than dd.

However, even back then a decent block size helped reduce the number of (slow) system calls, given that each system call triggered an I/O operation. The default block size is 512 bytes (one disk sector). Collecting multiple disk blocks together into a single read was, and is, also acceptable. This example uses a 32 MiB block size:

dd bs=$((512*2048*32)) if=/dev/source of=/dev/target
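
For reference, the arithmetic in that command works out as follows. The 1 GiB device size is a made-up figure, used only to show how many read calls each block size would need.

```shell
#!/bin/sh
# 512 bytes per sector * 2048 sectors = 1 MiB, times 32 = 32 MiB.
bs=$((512*2048*32))
echo "$bs bytes = $((bs / 1024 / 1024)) MiB"    # 33554432 bytes = 32 MiB

# Hypothetical 1 GiB device, purely for illustration.
size=$((1024*1024*1024))
reads_default=$((size / 512))
reads_large=$((size / bs))
echo "reads at 512 B: $reads_default"           # 2097152
echo "reads at $bs B: $reads_large"             # 32
```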

On current Linux-based systems, though, disks can be copied most efficiently with a simple cat:

cat /dev/source >/dev/target

(As noted in the comments on your question, pv can be substituted for cat and will give you an indication of progress and throughput.)
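
If you want to convince yourself that cat really produces a byte-for-byte copy, here is a small sketch that uses scratch files as stand-ins for /dev/source and /dev/target and checks the result with cmp. The file names and the 8 MiB size are my own placeholders.

```shell
#!/bin/sh
# Sketch: copy with cat and verify the copy. Scratch files stand in for
# /dev/source and /dev/target, so nothing on a real disk is overwritten.
workdir=$(mktemp -d)
source_img="$workdir/source.img"   # placeholder for /dev/source
target_img="$workdir/target.img"   # placeholder for /dev/target

# Fill the stand-in source with 8 MiB of random data.
dd if=/dev/urandom of="$source_img" bs=1048576 count=8 2>/dev/null

# The copy itself: there is no block size to choose.
cat "$source_img" > "$target_img"

# cmp -s prints nothing and exits 0 only when both files are identical.
if cmp -s "$source_img" "$target_img"; then
    copy_status=identical
else
    copy_status=different
fi
echo "copy is $copy_status"

rm -rf "$workdir"
```

On real devices you would need root privileges, and the target must be at least as large as the source.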


Generally, dd can be avoided in favor of some alternatives. There are several good reasons to use GNU ddrescue instead. In Ubuntu, you can install it with:

sudo apt-get install gddrescue

and then run it as plain ddrescue. Note that, unlike the package name, the executable does not have the initial g.

Using it is as simple as:

ddrescue inputFile outputFile logFile

The log file (named whatever you choose) lets you pause/stop and restart, without redoing the previous work, which is useful when doing large clones or recovery of disks. By default, it displays progress, current copy speed, average copy speed and number of bad blocks found.
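
To illustrate the restart behaviour, here is a sketch that runs ddrescue twice against scratch files, reusing the same log file. The paths are hypothetical stand-ins, and the script skips the demo when ddrescue is not installed. On the second run ddrescue should find the work already recorded in the log and have nothing left to copy.

```shell
#!/bin/sh
# Sketch: run ddrescue twice with the same log file. Scratch files stand
# in for real devices; all names below are placeholders.
workdir=$(mktemp -d)
in_img="$workdir/input.img"
out_img="$workdir/output.img"
log="$workdir/rescue.log"

# 8 MiB of random data as the stand-in source.
dd if=/dev/urandom of="$in_img" bs=1048576 count=8 2>/dev/null

if command -v ddrescue >/dev/null 2>&1; then
    # First run: copies the data and records progress in the log file.
    ddrescue "$in_img" "$out_img" "$log" >/dev/null
    # Second run with the same log file: picks up from the recorded state.
    ddrescue "$in_img" "$out_img" "$log" >/dev/null
    if cmp -s "$in_img" "$out_img"; then
        rescue_status=ok
    else
        rescue_status=mismatch
    fi
else
    rescue_status=skipped   # ddrescue is not installed on this machine
fi
echo "ddrescue demo: $rescue_status"

rm -rf "$workdir"
```

When the output is a device or partition rather than a regular file, ddrescue additionally expects the -f (--force) option before it will overwrite it.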

It uses sensible defaults for block size, so copy speed is always as fast as the device can handle, in my experience at least (I've cloned many hundreds of drives with it, all sizes and types).

Often, drives that are starting to fail have speed issues such as occasional patches of slowness, low average speed, sudden long pauses (bad sectors) or complete resets (severe surface errors). ddrescue can help you identify all of the above and restart your clone (provided you specified a log file) even if your drive is resetting itself.
