cp vs. cat to copy a file
One more issue comes to my mind where cat vs. cp makes a significant difference:
By definition, cat will expand sparse files, filling in the gaps with "real" zero bytes, while cp at least can be told to preserve the holes.
Sparse files are files where sequences of zero bytes have been replaced by metadata to save space. You can test this by creating one with dd and duplicating it with the tools of your choice.
Create a sparse file (changing to /tmp beforehand to avoid trouble - see final note):
    15> cd /tmp
    16> dd if=/dev/null of=sparsetest bs=512b seek=5
    0+0 records in
    0+0 records out
    0 bytes (0 B) copied, 5.9256e-05 s, 0.0 kB/s
size it - it should not take any space.
    17> du -sh sparsetest
    0       sparsetest
copy it with cp and check size
    18> cp sparsetest sparsecp
    19> du -sh sparsecp
    0       sparsecp
now copy it with cat and check size
    20> cat sparsetest > sparsecat
    21> du -sh sparsecat
    1.3M    sparsecat
try your preferred tools to check on their behaviour
don't forget to clean up.
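If you'd rather run the whole experiment in one go, here is a minimal sketch. It assumes GNU coreutils (truncate, du --apparent-size and cp --sparse are GNU options and may be missing elsewhere), and the file names are made up for illustration. It works in a throwaway directory, compares the apparent size with the space actually allocated, and cleans up on exit. What du reports for the allocated size depends on your filesystem, so no expected numbers are given.

```shell
#!/bin/sh
# Sketch only: assumes GNU coreutils; all file names are hypothetical.
set -e
workdir=$(mktemp -d)             # scratch directory, nothing important is touched
trap 'rm -rf "$workdir"' EXIT    # clean up when the script ends
cd "$workdir"

# Create a 5 MiB file that consists of a single hole.
truncate -s 5M sparsetest

# Copy it three ways.
cp sparsetest cp_default                 # GNU cp default: --sparse=auto
cp --sparse=always sparsetest cp_sparse  # punch holes wherever zero runs are found
cat sparsetest > cat_copy                # cat writes every zero byte explicitly

# Apparent size (what ls shows) versus space actually allocated on disk.
for f in sparsetest cp_default cp_sparse cat_copy; do
    printf '%-12s apparent=%sK allocated=%sK\n' "$f" \
        "$(du --apparent-size -k "$f" | cut -f1)" \
        "$(du -k "$f" | cut -f1)"
done
```

All copies have identical contents and the same apparent size; only the allocated size should differ, and only on filesystems that support holes.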
Final note of caution: Experiments like these have the inherent chance of raising your fame with your local sysadmin if you're doing them on a filesystem that's part of his backup plan, or critical for the well-being of the system. Depending on his choice of tool for backup, he might end up needing more tape media than he ever considered possible to back up that one file which occupies no disk space but gets expanded to terabytes of zeroes.
Other files which cannot be copied with either cat or cp include device-special files, etc. It depends on your implementation of the copying tool whether it is able to duplicate the device node, or whether it would merrily copy its contents instead.
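A rough way to see the node-versus-contents difference without root: the sketch below uses a FIFO as a stand-in for a device node (my substitution, since creating device nodes needs privileges). It relies on cp -R recreating special files rather than reading them, which is how POSIX describes -R and how GNU cp behaves unless --copy-contents is given. The file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: a FIFO stands in for a device node; names are hypothetical.
set -e
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkfifo pipe_src

# With -R, cp creates a new special file of the same type
# instead of opening the source and reading from it.
cp -R pipe_src pipe_copy

# Without -R, cp (like cat) would open the FIFO and block until a
# writer shows up, so that variant is left commented out:
# cp pipe_src pipe_contents

ls -l pipe_src pipe_copy    # both entries should be listed with type 'p'
```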
According to Keith's comment, cp preserves some permissions, while cat creates the new file as the umask indicates. So the permissions of $2 are not carried over, which keeps $4/vmlinuz pretty clean, whereas any strange permission set on $3 will be kept on $4/System.map.
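The permission difference is easy to reproduce. In this small sketch the file names are invented stand-ins for $2/$3 and their targets, and the "strange permission" is simply an execute bit on a data file.

```shell
#!/bin/sh
# Sketch only: file names are hypothetical stand-ins for $2/$3 and their targets.
set -e
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

umask 022                   # new files default to rw-r--r--
printf 'payload\n' > source
chmod 755 source            # the "strange" permission: execute bits on a data file

cp source via_cp            # mode is taken from the source, then filtered by the umask
cat source > via_cat        # the shell creates the target: 0666 filtered by the umask

ls -l source via_cp via_cat # via_cp keeps the execute bits, via_cat does not
```

The contents are identical either way; only the mode of the new file differs.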
Both have equivalent functionality in those two cases, but cp is purely a file operation. "Take this file and make a copy of it over there".
cat, on the other hand, is intended to dump the contents of a file out to the console. "Take this file and display it on the screen" and then have a ninja attack the screen and redirect the output elsewhere.
cp would generally be at least as efficient: there's no redirection going on, merely a direct copying of bytes from location A to location B.
cat would be read bytes -> write to standard output -> land in the new file that the shell opened for the redirection.