Why do you need a base image with Docker?
An image is a snapshot of a file system: a copy of just the files and dependencies needed to run a particular piece of software (for example mysql or redis) with a basic configuration in a container environment. When you create a container from an image, a small portion of your system's resources is isolated with the help of namespaces and cgroups, and the files in the image are made available inside this isolated environment.
Let us look at what a base image is:
A base image is a starting point or an initial step for the image that we finally want to create.
Suppose you want an image that runs redis. (This is a contrived example, since a ready-made redis image exists, but for the sake of explanation assume you cannot find one on Docker Hub.) You need a starting point for that image, so let us take the Alpine image as the base image. Alpine is a very small image that contains just enough files to run basic commands inside the container (for example ls, cd, and apk add).
Create a Dockerfile with the following instructions:
FROM alpine
RUN apk add --update redis
CMD ["redis-server"]
Now when you run the command
docker build .
it produces output like the following:
Sending build context to Docker daemon 2.048kB
Step 1/3 : FROM alpine
---> a24bb4013296
Step 2/3 : RUN apk add --update redis
---> Running in 535bfd2d1ff1
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/community/x86_64/APKINDEX.tar.gz
(1/1) Installing redis (5.0.9-r0)
Executing redis-5.0.9-r0.pre-install
Executing redis-5.0.9-r0.post-install
Executing busybox-1.31.1-r16.trigger
OK: 7 MiB in 15 packages
Removing intermediate container 535bfd2d1ff1
---> 4c288890433b
Step 3/3 : CMD ["redis-server"]
---> Running in 7f01a4da3209
Removing intermediate container 7f01a4da3209
---> fc26d7967402
Successfully built fc26d7967402
This output shows that Step 1/3 takes the alpine base image, Step 2/3 adds a layer containing redis on top of it, and Step 3/3 sets redis-server as the command to execute whenever a container is started from the image. The RUN instruction, by contrast, is executed only while the image is being built.
Further explanation of output is out of the scope of this question.
So when you pull an image from Docker Hub, it contains only the configuration needed to cover the basic requirements. When you need to add your own requirements and configuration to an image, you create a Dockerfile and add dependencies layer by layer on top of a base image until it runs according to your needs.
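The layer-by-layer build described above can be pictured with a toy model in Python. This is only an illustrative sketch, not how Docker is implemented: the Image class, the from_base, run and cmd helpers, and the file contents are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    """Toy image: an ordered tuple of read-only layers plus a default command."""
    layers: tuple = ()
    cmd: tuple = ()


def from_base(base: Image) -> Image:
    # FROM: start from the base image's layers and default command, unchanged.
    return Image(layers=base.layers, cmd=base.cmd)


def run(image: Image, added_files: dict) -> Image:
    # RUN: happens at build time; its file changes become one new layer on top.
    return Image(layers=image.layers + (dict(added_files),), cmd=image.cmd)


def cmd(image: Image, command: list) -> Image:
    # CMD: nothing is executed at build time; only the default command is recorded.
    return Image(layers=image.layers, cmd=tuple(command))


# Hypothetical file contents standing in for the real alpine and redis files.
alpine = Image(layers=({"/bin/ls": "busybox", "/sbin/apk": "apk-tools"},))

image = from_base(alpine)                                # FROM alpine
image = run(image, {"/usr/bin/redis-server": "redis"})   # RUN apk add --update redis
image = cmd(image, ["redis-server"])                     # CMD ["redis-server"]

print(len(image.layers))   # 2
print(image.cmd)           # ('redis-server',)
print(len(alpine.layers))  # 1
```

In this model the base image still has a single layer after the build, which mirrors the idea that building on top of a base image never modifies the base itself.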
Put simply: just as we reuse existing libraries and Node packages in an application, we can reuse base images that others have already built and that can be found with a simple search. You can also define your own base image and build on it.
In fact, Docker works by applying layers on top of the base image. Because all of these layers must stay consistent with one another, you cannot base your first image on a moving target (i.e. your writable file system). So you need a read-only image that will always stay the same.
Here is an excerpt from the Docker documentation about images:
Since Docker uses a Union File System, the processes think the whole file system is mounted read-write. But all the changes go to the top-most writable layer, and underneath, the original file in the read-only image is unchanged. Since images don’t change, images do not have state.
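The behaviour in that excerpt can be sketched with a small Python model. It is a simplification under stated assumptions, not Docker's real union file system: the Container class, its read and write methods, and the file paths and contents are hypothetical and exist only for illustration.

```python
class Container:
    """Toy container: read-only image layers under one writable top layer."""

    def __init__(self, image_layers):
        self._image_layers = image_layers  # shared between containers, never modified
        self._writable = {}                # changes made by this container only

    def read(self, path):
        # Check the writable layer first, then the image layers from top to bottom.
        if path in self._writable:
            return self._writable[path]
        for layer in reversed(self._image_layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, content):
        # Every change lands in the writable layer; the image stays untouched.
        self._writable[path] = content


image_layers = (
    {"/etc/redis.conf": "port 6379"},    # hypothetical lower layer
    {"/usr/bin/redis-server": "redis"},  # hypothetical upper layer
)

a = Container(image_layers)
b = Container(image_layers)
a.write("/etc/redis.conf", "port 7000")

print(a.read("/etc/redis.conf"))           # port 7000
print(b.read("/etc/redis.conf"))           # port 6379
print(image_layers[0]["/etc/redis.conf"])  # port 6379
```

Container a sees its own change, while container b and the image itself still see the original file, which is the sense in which images do not have state.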