Tags: docker, hash, docker-registry, docker-image

Why do image digests differ depending on the registry?


AFAIK, an image digest is a hash of the image's manifest body.

When I pull the busybox image from Docker Hub and push it to my private registry, the digests are different.

$ docker pull busybox
...
Digest: sha256:2605a2c4875ce5eb27a9f7403263190cd1af31e48a2044d400320548356251c4
Status: Downloaded newer image for busybox:latest

$ docker tag busybox myregistry/busybox
$ docker push myregistry/busybox
...
08c2295a7fa5: Pushed
latest: digest: sha256:8573b4a813d7b90ef3876c6bec33db1272c02f0f90c406b25a5f9729169548ac size: 527

$ docker images --digests
myregistry/busybox    latest      sha256:8573b4a813d7b90ef3876c6bec33db1272c02f0f90c406b25a5f9729169548ac   efe10ee6727f        2 weeks ago         1.13MB
busybox               latest      sha256:2605a2c4875ce5eb27a9f7403263190cd1af31e48a2044d400320548356251c4   efe10ee6727f        2 weeks ago         1.13MB

The image is not changed at all, and the image IDs are identical.

So why are the image digests different?


Updated:

Interestingly, the digest from another private registry is exactly the same as the digest from my private registry.

$ docker image inspect efe10ee6727f
...
"RepoDigests": [
            "myregistry/busybox@sha256:8573b4a813d7b90ef3876c6bec33db1272c02f0f90c406b25a5f9729169548ac",
            "busybox@sha256:2605a2c4875ce5eb27a9f7403263190cd1af31e48a2044d400320548356251c4",
            "anotherregistry/busybox@sha256:8573b4a813d7b90ef3876c6bec33db1272c02f0f90c406b25a5f9729169548ac"
        ],

Solution

  • First, there are multiple digests. Everything pushed to a registry is content addressable and referenced by a digest, and an image consists of multiple parts: the various filesystem layers, the image config, a manifest that combines those for a single-platform image, and an index (or manifest list) that combines multiple image manifest digests for things like multi-platform images.

    In a container engine like Docker, the image ID is the digest of the config JSON. When the image is pushed to a registry, the digest you see there, and in pinned references to an image, is the digest of the manifest or index. You can compare the image ID (digest of the config JSON) and the RepoDigests (digest of the manifest) in the docker inspect output:

    $ docker inspect busybox --format 'Id: {{.Id}}
    Repo Digest: {{index .RepoDigests 0}}'
    Id: sha256:efe10ee6727fe52d2db2eb5045518fe98d8e31fdad1cbdd5e1f737018c349ebb
    Repo Digest: busybox@sha256:2605a2c4875ce5eb27a9f7403263190cd1af31e48a2044d400320548356251c4
    

    For OCI manifests and Docker's v2 manifests, the registry/repository is not included in the manifest, so it is possible for them to be identical when pushed to different locations:

    {
        "schemaVersion": 2,
        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
        "config": {
            "mediaType": "application/vnd.docker.container.image.v1+json",
            "size": 7023,
            "digest": "sha256:b5b2b2c507a0944348e0303114d8d93aaaa081732b86451d9bce1f432a537bc7"
        },
        "layers": [
            {
                "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                "size": 32654,
                "digest": "sha256:e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f"
            },
            {
                "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                "size": 16724,
                "digest": "sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b"
            },
            {
                "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                "size": 73109,
                "digest": "sha256:ec4b8955958665577945c89419d1af06b5f7636b4ac3da7f12184802ad867736"
            }
        ]
    }
    

    However, when an image is pulled into a container engine with docker pull, the multi-platform image is dereferenced to a single-platform image, the layers are decompressed and extracted, and the manifest is discarded. Pushing that image reverses the process: the manifest has to be generated again, and the digest can change if a different tool originally pushed the image. Different tools may write the manifest JSON slightly differently, including differences as small as white space and field ordering.
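
    To illustrate how sensitive the digest is to serialization, here is a minimal sketch that hashes two byte-level variants of the same small JSON document. The two strings are made-up stand-ins for a manifest, not real registry content, and the sketch assumes sha256sum from GNU coreutils is available (on macOS, shasum -a 256 would be the rough equivalent):

```shell
# Two hypothetical serializations of the same fields: one compact,
# one with extra white space and the keys in a different order.
compact='{"schemaVersion":2,"mediaType":"application/vnd.docker.distribution.manifest.v2+json"}'
spaced='{ "mediaType": "application/vnd.docker.distribution.manifest.v2+json", "schemaVersion": 2 }'

digest() {
  # Hash the exact bytes; printf '%s' avoids appending a trailing newline.
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

digest_compact="sha256:$(digest "$compact")"
digest_spaced="sha256:$(digest "$spaced")"

echo "compact: $digest_compact"
echo "spaced:  $digest_spaced"
```

    Both strings describe the same data, but the two printed digests differ because the hash covers the raw bytes rather than the parsed JSON.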

    To preserve the digest between registries, it's best not to pull the image into a container engine, and instead to use a tool designed to copy the content without extracting it. I know of several tools for this, including go-containerregistry/crane from Google, ORAS from Microsoft, skopeo from Red Hat, and my own regclient/regctl. In addition to preserving the digest, each of these tools is intelligent about only pulling layers that do not exist on the destination registry, and none of them need privileged access to run.
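
    As a rough sketch of what such a copy could look like, the script below uses whichever of crane, skopeo, or regctl it finds on the PATH. The source and destination references are placeholders taken from the question, and authentication to the registries is assumed to be configured already:

```shell
# Placeholder references; replace with your own source and destination.
SRC="busybox:latest"
DST="myregistry/busybox:latest"

# Pick the first registry-to-registry copy tool found on PATH.
tool=""
for candidate in crane skopeo regctl; do
  if command -v "$candidate" >/dev/null 2>&1; then
    tool=$candidate
    break
  fi
done

case "$tool" in
  crane)  crane copy "$SRC" "$DST" ;;
  # --all copies every platform in a manifest list, not just the local one.
  skopeo) skopeo copy --all "docker://$SRC" "docker://$DST" ;;
  regctl) regctl image copy "$SRC" "$DST" ;;
  *)      echo "install crane, skopeo, or regctl to copy without pulling" >&2 ;;
esac
```

    After a copy like this, the digest reported for the destination should match the source, since the manifest bytes are transferred as-is rather than regenerated.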