Docker allows you to create “images” for your application that contain everything needed for the app to run on another machine. The process of creating this image is called “building,” and during the build you specify all the files and folders your application requires. A Dockerfile offers two instructions for adding them: “add” and “copy”. The two commands are similar but come with some important distinctions.
Here’s the difference, along with an example of a docker image that uses add and copy to build the image.
Add vs Copy in Docker
To build an image, docker needs a file that lists every instruction required to recreate the environment in which your app will run. This file is called a “Dockerfile”, and it’s where instructions like “add” and “copy” live.
The “Copy” Command in Docker
The copy command copies files and directories from the build context to the docker image. It’s a very basic command with no additional functionality. The command looks like this:
COPY app/ /usr/src/app
In the above example, you’re instructing docker via the Dockerfile to copy the contents of the app/ directory in your build context (usually the folder on your local machine from which you run the build) to the /usr/src/app directory inside the image. The command copies all files and subdirectories under app/ to the image. If the /usr/src/app folder doesn’t exist in the image hierarchy, it will be created.
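To see the instruction in context, here is a minimal Dockerfile sketch. The base image (alpine:3.19), the app/ directory next to the Dockerfile, and the ls command are assumptions chosen for illustration and are not part of the example above.

```dockerfile
# Hypothetical base image, chosen only for illustration
FROM alpine:3.19

# Copy the contents of app/ (relative to the build context) into the image.
# /usr/src/app is created automatically if it does not exist yet.
COPY app/ /usr/src/app

# Make the copied directory the working directory for later instructions
WORKDIR /usr/src/app

# List the copied files when a container starts, to confirm they arrived
CMD ["ls", "-la"]
```

Running `docker build -t copy-demo .` from the folder that contains app/, followed by `docker run --rm copy-demo`, should list the copied files. The tag name copy-demo is also just a placeholder.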
The “Add” Command in Docker
The “Add” command is similar to the “copy” command but has a couple of additional features.
First, it’s capable of including and extracting archive files like “.tar” files into the image. So let’s say we have the following command:
ADD app.tar.gz /usr/src/app
This command will decompress and unpack the contents of the “app.tar.gz” file from the build context into /usr/src/app inside the image. Automatic extraction applies only to local archives in a compression format docker recognizes, such as gzip, bzip2, or xz. This is in contrast to the “copy” command, which will simply copy the archive file and not extract anything.
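The contrast can be sketched by using both instructions on the same archive. This assumes an app.tar.gz file sits next to the Dockerfile. The base image and the /opt/archives path are hypothetical choices for the demonstration.

```dockerfile
# Hypothetical base image, chosen only for illustration
FROM alpine:3.19

# ADD with a local archive: the contents of app.tar.gz are unpacked into /usr/src/app
ADD app.tar.gz /usr/src/app

# COPY with the same archive: the file lands unchanged at /opt/archives/app.tar.gz
COPY app.tar.gz /opt/archives/

# Show both results side by side when a container starts
CMD ["sh", "-c", "ls -R /usr/src/app /opt/archives"]
```

In the listing, /usr/src/app should contain the files that were inside the archive, while /opt/archives should contain only the archive itself.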
The second difference between “add” and “copy” in docker, is that you can use the former to include remote resources as well as local files. For example, the following command:
ADD https://example.com/sample-data.csv /data/sample-data.csv
Let’s say you have a project and you want to include some sample data that’s located on some remote server. Using “copy”, there’s no way you can instruct docker to fetch the resource from a remote server. But the above command will download the “sample-data.csv” file from the remote server while the image is being built and store it in the image at /data/sample-data.csv. Note that the download happens at build time, not when a container starts on the target machine, so the image contains whatever version of the file existed when it was last built. To pick up newer data, you need to rebuild the image.
These two distinctions of the ADD command make it more powerful than COPY. But does that mean you should always use “add” instead of “copy”? Not necessarily.
Use “Copy” Wherever Possible
You might think that if “ADD” is more powerful than “COPY”, then we might as well use the former all the time. But the opposite is true. A common principle of good software design is to choose the least powerful tool that gets the job done. The reason is that “ADD” can trigger behavior you didn’t intend.
For example, many projects require a tar archive to exist “as is” without extracting its contents. The application will use the archive in a specific way, and unpacking what’s inside would be a big mistake. If you use “ADD” instead of “COPY” in this scenario, you’ll end up extracting the archive into the image, which breaks the application.
Dangers of Remote Resources
While it might appear that the ability to download remote resources is useful, you should use this with caution.
Danger of the Remote Resource Changing
Fetching remote resources raises security concerns. Importing external resources carries the risk of malware or a misconfigured file being downloaded. Remember that the Dockerfile isn’t just used once on your system. The download is repeated whenever the image is rebuilt, whether on your machine, in a CI pipeline, or on a server, which means you’re at the mercy of whoever maintains those external resources. While it might seem safe at any given moment, you never know when an open-source project might be closed down, or sold to a private corporation that can then modify the resource at any time. Not to mention that the remote server itself might get compromised.
No Integrity Checks
By default, the ADD command in a Dockerfile downloads remote resources “as is” with no integrity checks. This means the file content can change at any time and it’ll be downloaded nonetheless.
In addition, if the resource is fetched over an unencrypted connection such as plain HTTP, a man-in-the-middle attack could swap in an entirely different, malicious file during the build.
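Newer versions of the Dockerfile syntax let ADD verify a remote download through an optional --checksum flag. The sketch below assumes a BuildKit-based builder with Dockerfile syntax 1.6 or later. The base image is a hypothetical choice, and the digest is a placeholder of zeros that you would replace with the real SHA-256 of the file.

```dockerfile
# syntax=docker/dockerfile:1
# Hypothetical base image, chosen only for illustration
FROM alpine:3.19

# Placeholder digest: replace the zeros with the real SHA-256 of the file.
# The build fails if the downloaded content does not match this digest.
ADD --checksum=sha256:0000000000000000000000000000000000000000000000000000000000000000 \
    https://example.com/sample-data.csv /data/sample-data.csv
```

With the placeholder left in, the build is expected to fail on the checksum mismatch, which is exactly the protection the flag provides once the real digest is filled in.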
Dependency Drift
Dependency drift can occur when a remote resource introduces new versions, with changed features, APIs, or even different vulnerabilities. Unlike the previous two points, this doesn’t indicate malicious intent, but merely the realities of software development. Unless the URL in the ADD command points to a specific version, you’re opening yourself to the danger of the remote resource slowly changing over time, while your image keeps assuming an older version. Eventually, something will break, unless the maintainer takes special care to keep everything the same.
Moreover, debugging a problem caused by dependency drift can be very hard. The issue is difficult to reproduce on a local machine when you don’t know which version of the remote resource an earlier build used.
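One way to limit drift is to put the version in the URL itself. The sketch below assumes the remote server publishes versioned paths. Both URLs, the version number, and the base image are hypothetical.

```dockerfile
# Hypothetical base image, chosen only for illustration
FROM alpine:3.19

# Unpinned (hypothetical URL): the file behind "latest" can change between builds
# ADD https://example.com/releases/latest/sample-data.csv /data/sample-data.csv

# Pinned (hypothetical URL): the version is part of the path and recorded in the Dockerfile
ADD https://example.com/releases/v1.2.0/sample-data.csv /data/sample-data.csv
```

A versioned URL only helps if the maintainer never overwrites a published version, so it works best alongside a checksum.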
Best Practices for Downloading Remote Resources
To avoid these problems, follow these practices.
Only Use Trusted Repositories
To whatever extent possible, only use the ADD command to fetch remote resources from trusted, well-maintained repos. Even these carry some risk, so treat this as a first line of defense rather than a guarantee.
Use Checksums to Verify the Download
While the ADD command won’t perform any checksum verification by default, you can implement your own verification for a specific file by downloading it with curl and checking it with sha256sum. For example, the following command:
# Download the archive, then check it against the expected SHA-256 digest (two spaces separate digest and file name)
RUN curl -fsSL https://example.com/resource.tar.gz -o resource.tar.gz && \
    echo "expected_checksum  resource.tar.gz" | sha256sum -c -
This downloads the resource and compares its SHA-256 checksum with the one you expect. If the two don’t match, sha256sum exits with an error and the build stops, so you keep full control over whether the remote download makes it into the image. If you need the contents of the archive, you’ll have to extract it yourself in a further step. Here’s a comprehensive guide to “tar” files, if you need a reference.
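The extra extraction work can be sketched as a single RUN step. The base image, the RESOURCE_SHA256 build argument, and the /opt/resource target directory are assumptions for illustration. The default value expected_checksum is a placeholder that must be replaced with the real digest.

```dockerfile
# Hypothetical base image, chosen only for illustration
FROM debian:bookworm-slim

# Placeholder: pass the real digest with --build-arg RESOURCE_SHA256=<digest>
ARG RESOURCE_SHA256=expected_checksum

# Install curl and CA certificates, which the slim base image does not ship
RUN apt-get update && \
    apt-get install -y --no-install-recommends ca-certificates curl && \
    rm -rf /var/lib/apt/lists/*

# Download, verify, extract, then delete the archive in a single layer.
# If sha256sum reports a mismatch, the && chain stops and the build fails.
RUN curl -fsSL https://example.com/resource.tar.gz -o /tmp/resource.tar.gz && \
    echo "${RESOURCE_SHA256}  /tmp/resource.tar.gz" | sha256sum -c - && \
    mkdir -p /opt/resource && \
    tar -xzf /tmp/resource.tar.gz -C /opt/resource && \
    rm /tmp/resource.tar.gz
```

Keeping all of this in one RUN instruction means the unverified archive never ends up in a layer of its own.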
Use Package Managers if Possible
It’s best to use a package manager instead of the “ADD” command wherever possible, as packages from official repositories are typically vetted and ready for use. As an extra precaution, pin the version of each package you install.
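As a rough illustration of version pinning, the sketch below uses pip. The Python base image and the requests library are assumptions picked as familiar examples. Any package manager that supports exact versions works the same way.

```dockerfile
# Hypothetical base image, chosen only for illustration
FROM python:3.12-slim

# Pin an exact release so every rebuild installs the same version
RUN pip install --no-cache-dir requests==2.31.0
```

Without the `==2.31.0` suffix, each rebuild would install whatever release is current at that moment, which reintroduces dependency drift.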
Include the File in the Build Context
The best solution, if possible, is to download the files you need ahead of time and include them in the build context, unless constraints such as file size make this impractical. This way, nothing is fetched from the network during the build, and you avoid the problems described above.
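A minimal sketch of this approach, assuming the sample data was downloaded and reviewed once, then stored at data/sample-data.csv next to the Dockerfile. The path and the base image are hypothetical.

```dockerfile
# Hypothetical base image, chosen only for illustration
FROM alpine:3.19

# The file comes from the build context, so no network access is needed during the build
COPY data/sample-data.csv /data/sample-data.csv
```

Because the file is kept alongside the Dockerfile, any change to it shows up in your own version history instead of happening silently on a remote server.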
Conclusion
As you can see, the difference between ADD and COPY is slight, but significant. Particularly from a security standpoint, using ADD instead of COPY can be very dangerous, if not properly implemented. Whenever possible, use COPY, and if you really need to download remote resources, consider using either a package manager or a tool like curl, where you can verify the checksum of the downloaded resource.

I’m a NameHero team member, and an expert on WordPress and web hosting. I’ve been in this industry since 2008. I’ve also developed apps on Android and have written extensive tutorials on managing Linux servers. You can contact me on my website WP-Tweaks.com!