
Kubernetes image pull from a private registry fails to respect /etc/hosts of the server


I am running a small 3-node test Kubernetes cluster (set up with kubeadm) on Ubuntu Server 22.04, with Flannel as the network fabric. I also have a separate private GitLab server, with its container registry set up and working.

The problem I am running into is that when I apply a simple test deployment YAML, it fails to pull the image from the private GitLab server.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: platform-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: platform-service
  template:
    metadata:
      labels:
        app: platform-service
    spec:
      containers:
        - name: platform-service
          image: registry.example.com/demo/platform-service:latest

Ubuntu Server: /etc/hosts (the relevant line)

192.168.1.30 registry.example.com

The Error

Failed to pull image "registry.example.com/demo/platform-service:latest": 
rpc error: code = Unknown desc = failed to pull and unpack image 
"registry.example.com/demo/platform-service:latest": failed to resolve reference 
"registry.example.com/demo/platform-service:latest": failed to do request: Head 
"https://registry.example.com/v2/demo/platform-service/manifests/latest": dial tcp 
xxx.xxx.xxx.xxx:443: i/o timeout

The 'xxx.xxx.xxx.xxx' address belongs to my external network, and the public DNS has a record pointing the domain name at it. However, all of my internal machines are set up to resolve the name to its internal address, and 'registry.example.com' stands in for my own domain.

Also to note:

docker pull registry.example.com/demo/platform-service:latest

Run from the command line of the server, this works perfectly fine. It only fails when the image is pulled through the Kubernetes deployment YAML.

The problem

The network and the hosts file on the server are configured correctly, but the image pull fails because applying the deployment does not use the IP configured in /etc/hosts. It uses a public IP that belongs to a different server, and the pull times out because that public-facing server is not set up the same way.

When I run kubectl apply -f platform-service.yaml, why does it not respect the hosts file of the server, and is there a way to configure hosts inside Kubernetes?

(If this problem is not clear, I apologize. I am quite new and still learning the terminology, which may be why Google is not helping me with this problem.)

The closest Stack Overflow question I could find is:

Kubernetes not able pull image from private registry having private domain pointed via /etc/hosts

(SO Answer #1): suggests hostAliases, but that applies to the pod itself, not to pulling the image. Also, my installation came from apt (the package manager) rather than snap. The rest of the answer suggests changing the distribution, and I would rather keep my current setup than change it.

Update

Attempts to add hosts to CoreDNS did not work either (following: How to change host name resolve like host file in coredns):

kubectl -n kube-system edit configmap/coredns
...
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        hosts custom.hosts registry.example.com {
            192.168.1.30 registry.example.com
            fallthrough
        }
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
           max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
...

I deleted the coredns pods (so they are recreated), and the image pull for the deployment still fails against the external IP address instead of the internal address.


Solution

  • After going through many different solutions and a lot of research and testing, the answer turned out to be very simple.

    Solution in my case

    The /etc/hosts file on EVERY node of the cluster, including the master node, MUST contain the entry for the registry (and possibly the entry for the GitLab instance as well).

    192.168.1.30 registry.example.com
    192.168.1.30 gitlab.example.com    # Necessary in my case, not sure it is always required
    

    Once I added that on each of the 2 worker nodes, the cluster attempted to pull the image and failed with a credentials error (which I was expecting to see once the hosts issue was resolved). From there I was able to add the credentials, and now the image pulls fine from the private registry rather than from the public-facing server.
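To avoid editing the file by hand on each node, the /etc/hosts change above could be scripted roughly as follows. This is only a sketch: the function name ensure_hosts_entry is made up for illustration, it assumes a POSIX shell with awk available, and it has to be run as root on every node (for example over ssh).

```shell
# Append "IP HOSTNAME" to a hosts file unless HOSTNAME is already mapped there.
# Usage: ensure_hosts_entry IP HOSTNAME [FILE]   (FILE defaults to /etc/hosts)
# Note: ensure_hosts_entry is a hypothetical helper, not part of any tool.
ensure_hosts_entry() {
    ip="$1"
    name="$2"
    file="${3:-/etc/hosts}"

    # Look for the hostname in any non-comment line (fields after the IP).
    if awk -v n="$name" '
        /^[[:space:]]*#/ { next }
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break      # rest of the line is a comment
                if ($i == n) found = 1
            }
        }
        END { exit found ? 0 : 1 }
    ' "$file" 2>/dev/null; then
        return 0                          # already present, nothing to do
    fi

    # Make sure the new entry starts on its own line.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Example, as root on each node:
#   ensure_hosts_entry 192.168.1.30 registry.example.com
#   ensure_hosts_entry 192.168.1.30 gitlab.example.com
```

Running it twice leaves a single entry, so it should be safe to repeat on nodes that were already fixed manually.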

    Bonus: Fix for the credentials error when connecting to the private registry (not part of the original question, but part of the setup process for connecting)

    After fixing the /etc/hosts issue, you will probably need to set up 'regcred' credentials to access the private registry. The Kubernetes documentation provides the steps for that part:

    https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
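For the setup in the question, the steps from that page might look roughly like the sketch below. The wrapper function setup_regcred and its two arguments are hypothetical; the username and password (or GitLab access token) are placeholders you supply, and the deployment is assumed to live in the current namespace.

```shell
# Create the 'regcred' pull secret and attach it to the Deployment.
# Usage: setup_regcred REGISTRY_USER REGISTRY_PASSWORD_OR_TOKEN
# Note: setup_regcred is a hypothetical wrapper around two kubectl commands.
setup_regcred() {
    # Store the registry login as a docker-registry type secret.
    kubectl create secret docker-registry regcred \
        --docker-server=registry.example.com \
        --docker-username="$1" \
        --docker-password="$2" &&
    # Reference the secret from the pod template so image pulls can use it.
    kubectl patch deployment platform-deployment --patch \
        '{"spec":{"template":{"spec":{"imagePullSecrets":[{"name":"regcred"}]}}}}'
}

# Example:
#   setup_regcred my-user my-access-token
```

Instead of patching, the same effect can be had by adding an imagePullSecrets entry with name regcred under spec.template.spec in the deployment YAML and applying it again.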