Why does rsync over SSH give me 10x the throughput of SCP?
RSYNC vs SCP
SCP basically does a plain old copy from source to destination, locally or across a network, using SSH. You may be able to use the -C switch to enable SSH compression, which can speed up the copy of data across the network.
RSYNC transfers just the differences between two sets of files across the network connection, using an efficient checksum-search algorithm, so it sends far less data when the destination already holds an earlier version of the files.
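To make the "transfer just the differences" idea concrete, here is a minimal Python sketch of a block-matching delta. It is an illustration only, not rsync's actual implementation: the names `signature`, `delta` and `patch`, the tiny `BLOCK` size, the simplified weak checksum and the use of SHA-256 as the strong hash are all assumptions made for the example, and the weak checksum is recomputed at every offset instead of being rolled forward.

```python
import hashlib

BLOCK = 4  # tiny block size so the example stays readable (assumption)


def weak_sum(data: bytes) -> int:
    # Cheap Adler-style checksum; illustrative, not rsync's exact formula.
    a = sum(data) % 65536
    b = sum((len(data) - i) * byte for i, byte in enumerate(data)) % 65536
    return (b << 16) | a


def signature(old: bytes) -> dict:
    # Receiver side: weak checksum -> list of (strong hash, block index).
    sig = {}
    for start in range(0, len(old) - BLOCK + 1, BLOCK):
        block = old[start:start + BLOCK]
        entry = (hashlib.sha256(block).digest(), start // BLOCK)
        sig.setdefault(weak_sum(block), []).append(entry)
    return sig


def delta(new: bytes, sig: dict) -> list:
    # Sender side: ("copy", block_index) for blocks the receiver already
    # has, ("data", raw_bytes) for everything else.
    ops, literal, pos = [], bytearray(), 0
    while pos < len(new):
        window = new[pos:pos + BLOCK]
        match = None
        if len(window) == BLOCK:
            candidates = sig.get(weak_sum(window), [])
            if candidates:
                strong = hashlib.sha256(window).digest()
                for cand_strong, idx in candidates:
                    if cand_strong == strong:
                        match = idx
                        break
        if match is not None:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", match))
            pos += BLOCK
        else:
            literal.append(new[pos])
            pos += 1
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(old: bytes, ops: list) -> bytes:
    # Receiver side: rebuild the new file from old blocks plus literal data.
    out = bytearray()
    for kind, arg in ops:
        if kind == "copy":
            out += old[arg * BLOCK:(arg + 1) * BLOCK]
        else:
            out += arg
    return bytes(out)


old = b"the quick brown fox jumps"
new = b"the quick red fox jumps"
ops = delta(new, signature(old))
sent = sum(len(arg) for kind, arg in ops if kind == "data")
print(f"literal bytes sent: {sent} of {len(new)}")
```

Only the "data" operations carry file contents over the wire; the "copy" operations are small references to blocks the destination already has, which is why an update of a mostly unchanged file is cheap.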
RSYNC
DESCRIPTION
rsync is a program that behaves in much the same way that rcp does, but has many more options and uses the rsync remote-update protocol to greatly speed up file transfers when the destination file is being updated. The rsync remote-update protocol allows rsync to transfer just the differences between two sets of files across the network connection, using an efficient checksum-search algorithm described in the technical report that accompanies this package.
source
SCP
DESCRIPTION
scp copies files between hosts on a network. It uses ssh(1) for data transfer, and uses the same authentication and provides the same security as ssh(1). scp will ask for passwords or passphrases if they are needed for authentication. File names may contain a user and host specification to indicate that the file is to be copied to/from that host. Local file names can be made explicit using absolute or relative pathnames to avoid scp treating file names containing ‘:’ as host specifiers. Copies between two remote hosts are also permitted.
source
In your setup both tools run over SSH, and SSH itself adds some overhead.
SCP is a really naive protocol with a really naive algorithm, meant for transferring a few small files. It needs a lot of synchronization (each step costs an RTT, Round Trip Time) and uses small buffers (basically 2048 B -- source).
Rsync is made for performance, so it gives much better results and has more features.
The 10x speedup is specific to your case. If you transferred files across the world over high-latency links, scp would perform much worse, but on a local network the performance can be almost the same.
And no, compression (-C for scp) will not help. The biggest problems are the latency and the buffer size.
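As a rough illustration of why latency and buffer size dominate, consider a deliberately simplified stop-and-wait model in which the sender pushes one buffer and then waits a full round trip before sending the next. This is an assumption for the sake of arithmetic, not a description of what scp actually does on the wire; the function name, the 1 Gbit/s link speed and the RTT values are all hypothetical.

```python
def stop_and_wait_throughput(buffer_bytes: int, rtt_s: float,
                             link_bytes_per_s: float) -> float:
    # Time per buffer = time to put the bytes on the link
    #                 + one round trip waiting before the next buffer.
    time_per_buffer = buffer_bytes / link_bytes_per_s + rtt_s
    return buffer_bytes / time_per_buffer


LINK = 125_000_000  # assumed 1 Gbit/s link, expressed in bytes per second
BUFFER = 2048       # the buffer size mentioned above

# Hypothetical round trip times, from a local network to a long-haul link.
for rtt_ms in (0.2, 20, 150):
    rate = stop_and_wait_throughput(BUFFER, rtt_ms / 1000, LINK)
    print(f"RTT {rtt_ms} ms: about {rate / 1000:.1f} kB/s")
```

In this model the throughput can never exceed buffer size divided by RTT, no matter how fast the link is, so a small fixed buffer hurts more the further apart the two hosts are.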