Copy a file in a sane, safe and efficient way
As of C++17, the standard way to copy a file is to include the <filesystem>
header and use std::filesystem::copy_file:
    bool copy_file( const std::filesystem::path& from,
                    const std::filesystem::path& to );
    bool copy_file( const std::filesystem::path& from,
                    const std::filesystem::path& to,
                    std::filesystem::copy_options options );
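The first form is equivalent to the second one with copy_options::none
used as options (see also copy_file). With copy_options::none, an already existing destination file is reported as an error rather than overwritten.
The filesystem library was originally developed as boost.filesystem and was finally merged into ISO C++ as of C++17.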
I want to make the very important note that the Linux method using sendfile() has a major problem: a single call transfers at most about 2 GB, so code that calls it only once leaves larger files incomplete. I had implemented it following this question and was hitting problems because I was using it to copy HDF5 files that were many GB in size. From the man page:
http://man7.org/linux/man-pages/man2/sendfile.2.html
sendfile() will transfer at most 0x7ffff000 (2,147,479,552) bytes, returning the number of bytes actually transferred. (This is true on both 32-bit and 64-bit systems.)
Too many!
In the "ANSI C" way, the extra buffer is redundant, since a FILE is already buffered. (The size of this internal buffer is what BUFSIZ actually defines.)
The "OWN-BUFFER-C++-WAY" will be slow as it goes through fstream, which does a lot of virtual dispatching, and again maintains internal buffers for each stream object. (The "COPY-ALGORITHM-C++-WAY" does not suffer from this, as the streambuf_iterator classes, std::istreambuf_iterator and std::ostreambuf_iterator, bypass the stream layer.)
I prefer the "COPY-ALGORITHM-C++-WAY", but without constructing an fstream: just create bare std::filebuf instances when no actual formatting is needed.
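A rough sketch of that preference, assuming plain byte-for-byte copying: two bare std::filebuf objects are connected with std::copy and the streambuf iterators, so no fstream is constructed. The function name `filebuf_copy` is made up for this example.

```cpp
#include <algorithm>
#include <fstream>   // std::filebuf
#include <ios>
#include <iterator>  // std::istreambuf_iterator, std::ostreambuf_iterator

// Hypothetical helper: copy `from` to `to` through bare file buffers.
bool filebuf_copy(const char* from, const char* to)
{
    std::filebuf in, out;
    if (!in.open(from, std::ios::in | std::ios::binary)) return false;
    if (!out.open(to, std::ios::out | std::ios::trunc | std::ios::binary))
        return false;

    // The iterators read and write the buffers directly, skipping the
    // formatting layer of istream/ostream.
    const auto end = std::copy(std::istreambuf_iterator<char>(&in),
                               std::istreambuf_iterator<char>(),
                               std::ostreambuf_iterator<char>(&out));

    const bool wrote = !end.failed();            // a write hit an error?
    const bool closed = out.close() != nullptr;  // close() flushes the buffer
    return wrote && closed;
}
```

The input buffer is closed by its destructor; the output buffer is closed explicitly so that a failed final flush can be reported.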
For raw performance, you can't beat POSIX file descriptors. It's ugly but portable and fast on any platform.
The Linux way appears to be incredibly fast; perhaps the OS lets the function return before the I/O has finished? In any case, that's not portable enough for many applications.
EDIT: Ah, "native Linux" may be improving performance by interleaving reads and writes with asynchronous I/O. Letting commands pile up can help the disk driver decide when is best to seek. You might try Boost Asio or pthreads for comparison. As for "can't beat POSIX file descriptors"… well that's true if you're doing anything with the data, not just blindly copying.
Copy a file in a sane way:
    #include <fstream>

    int main()
    {
        std::ifstream src("from.ogv", std::ios::binary);
        std::ofstream dst("to.ogv", std::ios::binary);
        dst << src.rdbuf();
    }
This is so simple and intuitive to read that it is worth the extra cost. If we were doing it a lot, it would be better to fall back on OS calls to the file system. Boost also provides a copy_file function in its filesystem library (boost::filesystem::copy_file).
On macOS, there is a C function for copying files, declared in <copyfile.h>:
    #include <copyfile.h>

    int copyfile(const char *from, const char *to,
                 copyfile_state_t state, copyfile_flags_t flags);