What is the difference between the 'COPY' and 'ADD' commands in a Dockerfile?
There is some official documentation on that point: Best Practices for Writing Dockerfiles
Because image size matters, using ADD to fetch packages from remote URLs is strongly discouraged; you should use curl or wget instead. That way you can delete the files you no longer need after they've been extracted and you won't have to add another layer in your image.
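# Download, unpack and build in a single layer; the archive is piped straight into tar and never stored in the image.
RUN mkdir -p /usr/src/things \
    && curl -SL http://example.com/big.tar.gz \
    | tar -xzC /usr/src/things \
    && make -C /usr/src/things all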
For other items (files, directories) that do not require ADD’s tar auto-extraction capability, you should always use COPY.
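A minimal sketch of that plain-copy case, assuming a hypothetical build context containing a requirements.txt file and a src/ directory (both names are made up for illustration, as is the choice of base image):

```dockerfile
FROM python:3.12-slim
WORKDIR /app

# COPY takes local files from the build context as they are:
# no archive extraction and no remote downloads.
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

# Copy the contents of the local src/ directory into /app/src/.
COPY src/ ./src/
```

Nothing here needs ADD, so COPY keeps the intent obvious to whoever reads the Dockerfile next.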
From Docker docs:
ADD or COPY
Although ADD and COPY are functionally similar, generally speaking, COPY is preferred. That’s because it’s more transparent than ADD. COPY only supports the basic copying of local files into the container, while ADD has some features (like local-only tar extraction and remote URL support) that are not immediately obvious. Consequently, the best use for ADD is local tar file auto-extraction into the image, as in ADD rootfs.tar.xz /.
More: Best practices for writing Dockerfiles
You should check the ADD and COPY documentation for a more detailed description of their behaviors, but in a nutshell, the major difference is that ADD can do more than COPY:
- ADD allows <src> to be a URL
- Referring to comments below, the ADD documentation states that:
  If <src> is a local tar archive in a recognized compression format (identity, gzip, bzip2 or xz) then it is unpacked as a directory. Resources from remote URLs are not decompressed.
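The two behaviors can be sketched side by side. This assumes a hypothetical app.tar.gz in the build context; the URL is the placeholder used earlier on this page, not a real download:

```dockerfile
FROM debian:bookworm-slim

# Local archive from the build context: ADD unpacks it,
# so /opt/app/ receives the archive's contents, not the .tar.gz itself.
ADD app.tar.gz /opt/app/

# Remote URL: ADD downloads the file but does not unpack it.
# Because the destination ends with a slash, the file name is taken
# from the URL and the archive is saved as /opt/downloads/big.tar.gz.
ADD http://example.com/big.tar.gz /opt/downloads/
```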
Note that the Best practices for writing Dockerfiles suggests using COPY where the magic of ADD is not required. Otherwise, you (since you had to look up this answer) are likely to get surprised someday when you mean to copy keep_this_archive_intact.tar.gz into your container, but instead, you spray the contents onto your filesystem.
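To make the surprise concrete, here is a small sketch assuming keep_this_archive_intact.tar.gz sits in the build context (the /data paths are made up for illustration):

```dockerfile
FROM debian:bookworm-slim

# ADD recognizes the local archive and unpacks it:
# /data/unpacked/ ends up holding the extracted files.
ADD keep_this_archive_intact.tar.gz /data/unpacked/

# COPY leaves the archive untouched:
# the result is /data/keep_this_archive_intact.tar.gz.
COPY keep_this_archive_intact.tar.gz /data/
```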
COPY is: "Same as 'ADD', but without the tar and remote URL handling."
Reference straight from the source code.