How to find out the base image for a docker image

An easy way is to use:

docker image history deno

This command prints one row per layer, newest first.

Look at the IMAGE column and take the image ID that appears just above the first <missing> entry (a24bb4013296 in this example).

Then search your local images for that ID.

For Linux

docker image ls | grep a24bb4013296

For Windows

docker image ls | findstr a24bb4013296

This gives you the base image's name, provided that image is still present locally.


The information doesn't really exist, exactly. An image will contain the layers of its parent(s) but there's no easy way to reverse layer digests back to a FROM statement, unless you happen to have (or are able to figure out) the image that contains those layers.

If you have the parent image(s) on-hand (or can find them), you can infer which image(s) your image used for its FROM statement (or ancestry) by cross-referencing the layers.

Theoretical example

Suppose your image, FOO, contains the layers 1 2 3 4 5 6. If you have another image, BAR, on your system containing layers 1 2 3, you could infer that image BAR is an ancestor of image FOO, i.e. that FROM BAR would have been used at some point in its hierarchy.

Suppose further that you have another image, BAZ which contains the layers 1 2 3 4 5. You could infer that image BAZ has image BAR in its ancestry and that image FOO inherits from image BAZ (and therefore indirectly from BAR).

From this information, you could infer that the Dockerfiles for these images might have looked something like this:

# Dockerfile of image BAR
FROM scratch
# layers 1, 2, and 3
COPY ./one /
COPY ./two /
COPY ./three /

# Dockerfile of image BAZ
# (a real FROM line needs a lowercase image name)
FROM bar
RUN echo "this makes layer 4" > /four
RUN echo "this makes layer 5" > /five

# Dockerfile of image FOO
FROM baz
RUN echo "this makes layer 6" > /six

You could get the exact commands by looking at docker image history for each image.
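The cross-referencing in this example boils down to a prefix check on two ordered lists of layer digests. Here is a minimal sketch of that check, assuming each image's layers are stored one per line in a text file. The function name is_ancestor and the file names are made up for illustration.

```shell
# is_ancestor BASE_FILE CHILD_FILE
# Succeeds when the lines of BASE_FILE match the leading lines of CHILD_FILE,
# in order, and CHILD_FILE has at least one additional layer.
is_ancestor() {
    base_count=$(( $(wc -l < "$1") ))
    child_count=$(( $(wc -l < "$2") ))
    [ "$base_count" -gt 0 ] || return 1
    [ "$base_count" -lt "$child_count" ] || return 1
    head -n "$base_count" "$2" | cmp -s - "$1"
}

# Usage with the made-up layers of BAR, BAZ and FOO.
demo=$(mktemp -d)
printf '%s\n' 1 2 3 > "$demo/bar.layers"
printf '%s\n' 1 2 3 4 5 > "$demo/baz.layers"
printf '%s\n' 1 2 3 4 5 6 > "$demo/foo.layers"
is_ancestor "$demo/bar.layers" "$demo/foo.layers" && echo "BAR is an ancestor of FOO"
is_ancestor "$demo/baz.layers" "$demo/foo.layers" && echo "BAZ is an ancestor of FOO"
is_ancestor "$demo/foo.layers" "$demo/bar.layers" || echo "FOO is not an ancestor of BAR"
```

Comparing whole lines in order matters here: an image that merely shares a few layers somewhere in the middle does not count as an ancestor.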

One important thing to keep in mind here, however, is that Docker tags are mutable; maintainers build new images and move the tags to those images. So if you build an image with FROM python:3.8.1 today, it may not contain the same layers as an image built with that same FROM line a few weeks ago. You'll need the SHA256 digest to be sure you're using the exact same image.
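To record which exact image a tag pointed to when you pulled it, you can ask Docker for the repository digests stored with the local image. This is a minimal sketch; repo_digests is a name made up here, and the output is usually empty for images that were built locally and never pushed or pulled.

```shell
# repo_digests IMAGE
# Prints every repository digest recorded for a local image, one per line,
# in the form name@sha256:...
repo_digests() {
    docker image inspect --format '{{range .RepoDigests}}{{println .}}{{end}}' "$1"
}
```

For example, repo_digests python:3.8.1 prints a reference that you can put in a FROM line or pass to docker pull in place of the tag.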

Practical Example, local images

Now that we understand the theory behind identifying images and their bases, let's put it to practice with a real-world example.

Note: because the tags I use will change over time (see above RE: tag mutability), I'll be using the SHA256 digest to pull the images in this example so it can be reproduced by viewers of this answer.

Let's say we have a particular image and we want to find its base(s). We'll use the official maven image here.

First, we'll take a look at its layers.

# maven:3.6-jdk-11-slim at time of writing, on my platform
IMAGE="docker.io/maven@sha256:55f1c145a04e01706233d68fe0b6b20bf76f765ab32f3fe6e29c8ef933917af6"
docker pull "$IMAGE"
docker image inspect "$IMAGE" | jq -r '.[].RootFS.Layers[]'

This will output the layers:

sha256:6e06900bc10223217b4c78081a857866f674c462e4f90593b01894da56df336d
sha256:eda2f4da9b1e70500ac340d40ee039ef3877e8be13b9a24cd345406bf6693412
sha256:6bdb7b3c3e226bdfaa911ba72a95fca13c3979cd150061d570cf569e93037ce6
sha256:ce217e530345060ca0973807a3288560e1e15cf1a4eeec44d6aa594a926c92dc
sha256:f256c980a7d17a00f57fd42a19f6323fcc2341fa46eba128def04824cafa5afa
sha256:446b1af848de2dcb92bbd229ca6ecaabf2f48dab323c19f90d02622e09a8fa67
sha256:10652cf89eaeb5b5d8e0875a6b1867b5cf92c509a9555d3f57d87fab605115a3
sha256:d9a4cf86bf01eb170242ca3b0ce456159fd3fddc9c4d4256208a9d19bae096ca

From here, we can try to find other images whose layers are a (strict) subset of these layers, starting from the first layer and in the same order. Assuming you have the images on-hand, you can find them by cross-referencing the layers of the images you have on disk, for example, using docker image inspect.
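If you do not already know which images to check, you can loop over everything on disk. The following is a rough sketch, not a polished tool: image_layers and find_ancestors are names invented for this example, and it uses a Go template instead of jq to list the layers.

```shell
# Print the layer digests of one image, one per line.
image_layers() {
    docker image inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' "$1"
}

# find_ancestors TARGET
# Prints the ID of every local image whose layers are a leading,
# strictly shorter run of TARGET's layers.
find_ancestors() {
    target_layers=$(image_layers "$1") || return 1
    target_count=$(printf '%s\n' "$target_layers" | awk 'NF { n++ } END { print n + 0 }')
    docker image ls --quiet --no-trunc | sort -u | while read -r id; do
        cand_layers=$(image_layers "$id") || continue
        cand_count=$(printf '%s\n' "$cand_layers" | awk 'NF { n++ } END { print n + 0 }')
        if [ "$cand_count" -gt 0 ] && [ "$cand_count" -lt "$target_count" ]; then
            prefix=$(printf '%s\n' "$target_layers" | head -n "$cand_count")
            if [ "$prefix" = "$cand_layers" ]; then
                printf '%s\n' "$id"
            fi
        fi
    done
}
```

Calling find_ancestors "$IMAGE" prints image IDs only; docker image ls --no-trunc can map them back to repository names and tags.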

In this case, I just happen to know what these images are and have them on-hand (I'll discuss later what you might do if you don't have the images on-hand) so we'll go ahead and pull those images and take a look at the layers.

If you want to follow along:

# openjdk:11.0.10-jdk-slim at time of writing, on my platform
OPENJDK='docker.io/openjdk@sha256:fe6a46a26ff7d6c31b258e07b3d53f0c42fe68f55f646cc39d60d0b17cbc827b'

# debian:buster-20210329-slim at time of writing on my platform
DEBIAN='docker.io/debian@sha256:088be7d6017ad3ae98325f47707112e1f61687c371be1865e55d5e5531ca97fd'

docker pull "$OPENJDK"
docker pull "$DEBIAN"

If we inspect these images and compare them against the layers we saw in the output of docker image inspect for the maven image, we can confirm that the layers from openjdk and debian are present in our original maven image.

$ docker image inspect $DEBIAN | jq -r '.[].RootFS.Layers[]'
sha256:6e06900bc10223217b4c78081a857866f674c462e4f90593b01894da56df336d

$ docker image inspect $OPENJDK | jq -r '.[].RootFS.Layers[]'
sha256:6e06900bc10223217b4c78081a857866f674c462e4f90593b01894da56df336d
sha256:eda2f4da9b1e70500ac340d40ee039ef3877e8be13b9a24cd345406bf6693412
sha256:6bdb7b3c3e226bdfaa911ba72a95fca13c3979cd150061d570cf569e93037ce6
sha256:ce217e530345060ca0973807a3288560e1e15cf1a4eeec44d6aa594a926c92dc

As stated, because the single debian layer and the 4 openjdk layers are each a strict subset of the 8 layers from the maven image, we can conclude the openjdk and debian images are, at least, both in the ancestry path of the maven image.

We can further infer that the last 4 layers most likely come from the maven image itself (or, potentially, some unknown image).
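Once an ancestor has been confirmed, listing the layers that sit on top of it is a matter of skipping the shared prefix. This is a small sketch, assuming the two layer lists were saved one digest per line; own_layers and the file names are hypothetical.

```shell
# own_layers BASE_FILE CHILD_FILE
# Prints the lines of CHILD_FILE that come after the first N lines,
# where N is the number of lines in BASE_FILE.
# It does not check that BASE_FILE really is a prefix of CHILD_FILE.
own_layers() {
    base_count=$(( $(wc -l < "$1") ))
    tail -n "+$(( base_count + 1 ))" "$2"
}
```

With the output of the two docker image inspect commands saved as openjdk.layers and maven.layers, own_layers openjdk.layers maven.layers prints only the layers added after the openjdk ones.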

Caveats, when you don't have images locally

Now, of course the above only works because I happen to have all the images on-hand. So, you'd either need to have the images or be able to locate them by the layer digests.

You might still be able to figure this out using information that may be available from registries like Docker Hub or your own private repositories.

For official images, the docker-library/repo-info repository contains historical information about the official images, including the layer digests for the various tags cataloged over the last several years. You could use this, for example, as a source of layer information.

If you treat this as a database of layer digests, you could infer the ancestry of at least these official images.
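To make the database idea concrete, here is a sketch of a lookup against a flat index file. The index format is an assumption made up for this example (one line per image: a name followed by its layer digests, separated by spaces); repo-info does not ship such a file, so you would have to build it yourself. The function name known_ancestors is also hypothetical.

```shell
# known_ancestors INDEX_FILE TARGET_FILE
# INDEX_FILE:  lines of the form "name digest1 digest2 ..."
# TARGET_FILE: the target image's layer digests, one per line
# Prints the name of every indexed image whose layers are a leading,
# strictly shorter run of the target's layers.
known_ancestors() {
    # Join the target layers with trailing spaces so that a match
    # always ends on a whole digest.
    target=$(tr '\n' ' ' < "$2")
    awk -v target="$target" '
        {
            layers = ""
            for (i = 2; i <= NF; i++) layers = layers $i " "
            if (layers != "" && length(layers) < length(target) && index(target, layers) == 1)
                print $1
        }
    ' "$1"
}
```

The index and the target list must use the same kind of digest, otherwise nothing will match.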

"Distribution" (remote) digests vs "Content" (local) digests

An important caveat to note is that, when you inspect an image for its layer digests locally, you are getting the content digests of the layers. If you are looking at layer digests in a registry manifest (like what appears in the docker-library/repo-info project), you get the distribution digests of the compressed layers, and you won't be able to compare them with local content digests.

So you can only compare local digests with local digests, or remote digests with remote digests.
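The reason the two kinds of digest never match is that they hash different bytes: the local one covers the uncompressed layer archive, the remote one covers the compressed blob. The following illustration uses a tiny hand-made tar file as a stand-in for a layer. It assumes tar, gzip and sha256sum are installed, and it uses gzip only as an example of a compression format.

```shell
# Build a tiny stand-in for a layer: a tar archive holding one file.
work=$(mktemp -d)
printf 'hello\n' > "$work/file.txt"
tar -C "$work" -cf "$work/layer.tar" file.txt

# Compress it, as would happen before upload to a registry.
gzip -c "$work/layer.tar" > "$work/layer.tar.gz"

# Hash of the uncompressed tar: plays the role of the local (content) digest.
content_digest=$(sha256sum "$work/layer.tar" | cut -d ' ' -f 1)
# Hash of the compressed file: plays the role of the remote (distribution) digest.
distribution_digest=$(sha256sum "$work/layer.tar.gz" | cut -d ' ' -f 1)

echo "content:      sha256:$content_digest"
echo "distribution: sha256:$distribution_digest"
rm -r "$work"
```

The two printed values differ even though both files hold the same data, which is why a local layer list cannot be matched against a registry manifest directly.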

Example, using remote images

Suppose I want to do the same thing for an image in a remote repository and find its base(s) without pulling it. We can do that by looking at the layers in the remote manifest.

How to fetch the manifest depends on your particular registry; this answer describes how to do it for Docker Hub.

Using the same images from the example above, we would find that the distribution layer digests also match in the same way (get-remote-layers stands in for whatever command fetches the layer digests from your registry).

$ get-remote-layers $IMAGE
sha256:6fcf2156bc23db75595b822b865fbc962ed6f4521dec8cae509e66742a6a5ad3
sha256:96fde6667c188c81fcddee021ccbb3e054ebe83350fd4609e17a3d37f0ec7f9d
sha256:74d17759dd2a1b51afc740fadd96f655260689a2087308e40d1865a0098c5fae
sha256:bbe8ebb5d0a64d265558901c7c6c66e1d09f664da57cdb2e5f69ba52a7109d31
sha256:b2edaadd7dd62cfe7f551b902244ee67b84bc5c0b6538b9480ac9ca97a0a4986
sha256:0fca65d33e353bdfdd5edd8d4c8ab5efde52c078bd25e2dcf454f995e5420725
sha256:d6d771d0512387eee1e419a965b929a9a3b0365cf1935b3719d60bf9feffcf63
sha256:dee8cd26669373102db07820072127c46bbfdad340a586ee9dfe60ae933eac2b

$ get-remote-layers $DEBIAN
sha256:6fcf2156bc23db75595b822b865fbc962ed6f4521dec8cae509e66742a6a5ad3

$ get-remote-layers $OPENJDK
sha256:6fcf2156bc23db75595b822b865fbc962ed6f4521dec8cae509e66742a6a5ad3
sha256:96fde6667c188c81fcddee021ccbb3e054ebe83350fd4609e17a3d37f0ec7f9d
sha256:74d17759dd2a1b51afc740fadd96f655260689a2087308e40d1865a0098c5fae
sha256:bbe8ebb5d0a64d265558901c7c6c66e1d09f664da57cdb2e5f69ba52a7109d31

One other caveat with distribution digests in repositories is that you can only compare digests of the same manifest schema version. So, if an image was pushed with manifest v1, it won't have the same digests when pushed again with manifest v2.

TL;DR

Images contain the layers of their ancestor image(s). Therefore, if the layers of image A are a strict subset of image B's layers, starting from the first layer, you can infer that image B is a descendant of image A.

You can use this property of Docker images to determine the base images from which your images were derived.
