Moving files in a local network - would compressing improve speed?
Apart from the type of file, this depends heavily on the number of files. While bulk data can in theory be transferred at network speed, file system operations such as enumerating files and their properties, creating them and deleting them add a lot of overhead.
If you have a large number of small files, this overhead can even take longer than transmitting the data itself.
In such cases, archiving the data before transmission can be a huge benefit. If the data compresses poorly (encrypted and/or already compressed data), I recommend not compressing the archive, which saves a lot of time: just use tar.
If the files are compressible (uncompressed bitmaps, text), compressing the archive as well might make sense.
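One way to decide between a plain tar and a compressed one is to gzip a representative file and compare sizes. The following is a minimal sketch, assuming gzip is installed; the function name gzip_ratio and the file name sample.dat are made up for illustration.

```shell
# Print the gzip-compressed size of a file as a percentage of its original size.
# A result near 100 means the data barely shrinks, so a plain tar is enough.
gzip_ratio() {
    # tr strips the padding some wc implementations add around the count
    orig=$(wc -c < "$1" | tr -d ' ')
    if [ "$orig" -eq 0 ]; then
        echo 100
        return
    fi
    comp=$(gzip -c "$1" | wc -c | tr -d ' ')
    echo $(( comp * 100 / orig ))
}

# Example usage (sample.dat is a hypothetical file):
#   gzip_ratio sample.dat
```

A low number (text and bitmaps often land well under 50) suggests tar -czf is worth it; a number close to 100 suggests tar -cf.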
It largely depends on the type of files you are moving.
- If your files are things like PDFs, JPEGs, movies, installation files, etc., they are likely to be already compressed, and compressing them will not give you a great advantage.
- If they are source files, compressing will be quite useful.
- If there are lots of small files, at least a tar archive will be useful.
Finally, if your source machine has lots of processing power and memory, compression will finish in a reasonable time; otherwise a plain tar (based on the above points) would suffice.
Since your network is just 100 Mbps, you should lean towards compression if it helps.
But if you are transferring files that cannot be compressed much,
you should start accounting for the transfer time.
Alternatively, you could consider other media for the transfer (like USB/DVD).
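To put a number on the transfer time, a rough estimate is size in megabits divided by link speed. This is only a sketch: the function name transfer_seconds and the optional compression-ratio argument are my own additions, and the result ignores protocol and per-file overhead, so real transfers will be slower.

```shell
# Estimate transfer time in seconds (integer arithmetic, rounds down).
#   $1 = payload size in megabytes (MB)
#   $2 = link speed in megabits per second (Mbps)
#   $3 = optional compressed size as a percentage of the original (default 100)
transfer_seconds() {
    ratio=${3:-100}
    echo $(( $1 * ratio / 100 * 8 / $2 ))
}

transfer_seconds 10000 100      # 10 GB uncompressed at 100 Mbps: prints 800
transfer_seconds 10000 100 40   # same data shrunk to 40%: prints 320
```

So 10 GB takes a bit over 13 minutes at best on a 100 Mbps link, which is the point where compression or a USB drive starts to look attractive.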
Probably the fastest technique is tarring the data up, running it through a pipe, and then untarring it at the other end.
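Something like this:
$ tar -czf - root_dir | ssh remote_machine 'cd parent_dir && tar -xzf -'
The remote command is quoted so that it runs on remote_machine rather than being parsed by the local shell. Older versions of this recipe often added -c blowfish to ssh to pick a cheaper cipher; OpenSSH 7.6 and later have removed Blowfish, so the default cipher is used here.
The -z flag tells tar to compress, which should be very similar to a separate gzip step, which you can run separately in the pipe if you prefer.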
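If you want to try the pipe before pointing it at a remote machine, the same shape works between two local directories. This is a sketch only: tar_pipe_copy is a made-up helper name, and compression is left out because no network sits between the two tar processes.

```shell
# Copy a directory tree through a tar pipe, without ssh in the middle.
#   $1 = source directory
#   $2 = destination directory (must already exist)
tar_pipe_copy() {
    # -C changes into the given directory before archiving or extracting
    tar -cf - -C "$1" . | tar -xf - -C "$2"
}

# Example usage (both paths are hypothetical):
#   mkdir -p /tmp/copy_of_root_dir
#   tar_pipe_copy root_dir /tmp/copy_of_root_dir
```

Once that behaves as expected, putting ssh between the two tar commands gives the remote version.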
If you need to copy or synchronize the data again later, you can use rsync (-z enables compression). In particular, if the above command is interrupted, rsync will check what has already arrived and send anything that is missing.
It is much cleaner if ssh does not prompt you for a password, but I think it will work even with password prompts.
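For the rsync follow-up, a minimal sketch might look like the following. The wrapper name resync is made up; -a, -z and --partial are standard rsync options, and remote_machine and parent_dir are the same placeholders as in the tar command.

```shell
# Repeat or resume a copy with rsync.
#   $1 = source directory
#   $2 = destination (a local path or host:path)
resync() {
    # -a preserves permissions and timestamps and recurses into directories
    # -z compresses data in transit
    # --partial keeps partially transferred files so a rerun can pick them up
    rsync -az --partial "$1" "$2"
}

# Example usage:
#   resync root_dir remote_machine:parent_dir/
```

Without a trailing slash on the source, rsync creates root_dir inside the destination, which matches where the tar pipe puts it.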