
How are Docker image names parsed?


When doing a docker push or when pulling an image, how does Docker determine if there is a registry server in the image name or if it is a path/username on the default registry (e.g. Docker Hub)?

I'm seeing the following from the 1.1 image specification:

Tag

A tag serves to map a descriptive, user-given name to any single image ID. Tag values are limited to the set of characters [a-zA-Z_0-9].

Repository

A collection of tags grouped under a common prefix (the name component before :). For example, in an image tagged with the name my-app:3.1.4, my-app is the Repository component of the name. A repository name is made up of slash-separated name components, optionally prefixed by a DNS hostname. The hostname must comply with standard DNS rules, but may not contain _ characters. If a hostname is present, it may optionally be followed by a port number in the format :8080. Name components may contain lowercase characters, digits, and separators. A separator is defined as a period, one or two underscores, or one or more dashes. A name component may not start or end with a separator.

For the DNS host name, does it need to be fully qualified with dots, or is "my-local-server" a valid registry hostname? For the name components, I'm seeing periods as valid, which implies "team.user/appserver" is a valid image name. If the registry server is running on port 80, and therefore no port number is needed on the hostname in the image name, it seems like there would be ambiguity between the hostname and the path on the registry server. I'm curious how Docker resolves that ambiguity.


Solution

  • TL;DR: For the text before the first / to be treated as a registry hostname, it must contain a . (DNS separator), contain a : (port separator), or be exactly "localhost". Otherwise the code assumes you want the default registry, Docker Hub.


    After some digging through the code, I came across distribution/distribution/reference/reference.go with the following:

    // Grammar
    //
    //  reference                       := name [ ":" tag ] [ "@" digest ]
    //  name                            := [hostname '/'] component ['/' component]*
    //  hostname                        := hostcomponent ['.' hostcomponent]* [':' port-number]
    //  hostcomponent                   := /([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9-]*[a-zA-Z0-9])/
    //  port-number                     := /[0-9]+/
    //  component                       := alpha-numeric [separator alpha-numeric]*
    //  alpha-numeric                   := /[a-z0-9]+/
    //  separator                       := /[_.]|__|[-]*/
    //
    //  tag                             := /[\w][\w.-]{0,127}/
    //
    //  digest                          := digest-algorithm ":" digest-hex
    //  digest-algorithm                := digest-algorithm-component [ digest-algorithm-separator digest-algorithm-component ]
    //  digest-algorithm-separator      := /[+.-_]/
    //  digest-algorithm-component      := /[A-Za-z][A-Za-z0-9]*/
    //  digest-hex                      := /[0-9a-fA-F]{32,}/ ; At least 128 bit digest value
    

    The actual implementation of that is via a regex in distribution/distribution/reference/regexp.go.
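To make the grammar concrete, here is a minimal sketch that checks a few strings against the hostname and component rules. The patterns are hand-transcribed from the grammar comment above, not copied from regexp.go, and the names componentRe and hostnameRe are made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// Illustrative patterns transcribed from the grammar comment; these are
// NOT the identifiers or exact expressions used in regexp.go.
var (
	// component := alpha-numeric [separator alpha-numeric]*
	componentRe = regexp.MustCompile(`^[a-z0-9]+(?:(?:[_.]|__|[-]*)[a-z0-9]+)*$`)
	// hostname := hostcomponent ['.' hostcomponent]* [':' port-number]
	hostnameRe = regexp.MustCompile(`^(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9-]*[a-zA-Z0-9])` +
		`(?:\.(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9-]*[a-zA-Z0-9]))*(?::[0-9]+)?$`)
)

func main() {
	for _, s := range []string{
		"team.user",
		"my-local-server",
		"registry.example.com:5000",
		"MyServer",
	} {
		fmt.Printf("%-28s hostname=%-5t component=%t\n",
			s, hostnameRe.MatchString(s), componentRe.MatchString(s))
	}
}
```

Under these patterns, team.user and my-local-server satisfy both the hostname rule and the component rule, which is exactly the ambiguity the question raises. registry.example.com:5000 fits only the hostname rule, and MyServer fits the hostname rule but not the component rule because components are lowercase only.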

    But with some digging and poking, I found that there's another check beyond that regex (e.g. you'll get errors with an uppercase hostname if you don't include a . or :). I tracked down the actual split of the name to the following in distribution/distribution/reference/normalize.go:

    // splitDockerDomain splits a repository name to domain and remotename string.
    // If no valid domain is found, the default domain is used. Repository name
    // needs to be already validated before.
    func splitDockerDomain(name string) (domain, remainder string) {
        i := strings.IndexRune(name, '/')
        if i == -1 || (!strings.ContainsAny(name[:i], ".:") && name[:i] != "localhost") {
            domain, remainder = defaultDomain, name
        } else {
            domain, remainder = name[:i], name[i+1:]
        }
        if domain == legacyDefaultDomain {
            domain = defaultDomain
        }
        if domain == defaultDomain && !strings.ContainsRune(remainder, '/') {
            remainder = officialRepoName + "/" + remainder
        }
        return
    }
    

    The important part for me is the first if statement, which checks the text before the first / for a ., a :, or the exact value localhost. If that check passes, the text before the first / is split off as the registry hostname; if it fails, the entire name is treated as a repository path on the default registry.