Tags: dns, kubernetes, skydns, kubernetes-health-check

SkyDNS MissingClusterDNS in pods


I installed Kubernetes 1.2.4 on three RHEL 7 servers (no internet access; everything is pushed by Ansible).

EDIT: See the end of the question

I've got everything working except the kube-dns example given in the documentation. I've run several tests with several configurations and recreated the pods from scratch... and I always get this "MissingClusterDNS" error:

20m     20m     2   {kubelet k8s-minion-1.XXXXXX}                   Warning     MissingClusterDNS   kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy.
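
This event shows up in the kubectl describe pod output of affected pods; the busybox pod listed further down is one way to see it:

kubectl describe pod busybox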

As you can see, kube-dns is running:

kubectl get svc kube-dns --namespace=kube-system
NAME       CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
kube-dns   172.16.0.99    <none>        53/UDP,53/TCP   15m

And kubelet has the correct options:

KUBELET_ARGS=" --cluster-dns=172.16.0.99 --cluster-domain=kubernetes.local "
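
Note that the kubelet must be restarted for new flags to take effect; on these RHEL 7 hosts that means something like (the systemd unit name is an assumption):

systemctl restart kubelet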

Proof:

ps ax | grep kubelet
 6077 ?        Ssl    0:07 /opt/kubernetes/bin/kubelet --logtostderr=true --v=0 --address=0.0.0.0 --port=10250 --hostname-override=k8s-minion-1.XXXXXX --api-servers=http://k8s-master.XXXXXX:8080 --allow-privileged=false  --cluster-dns=172.16.0.99 --cluster-domain=kubernetes.local

But the DNS pod has one container that is not running:

kubectl get pods  --namespace=kube-system
NAME                 READY     STATUS             RESTARTS   AGE
kube-dns-v11-f2f4a   3/4       CrashLoopBackOff   7          18m

And the events are explicit:

Warning Unhealthy   Readiness probe failed: Get http://172.16.23.2:8081/readiness: dial tcp 172.16.23.2:8081: connection refused
...
Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "kube2sky" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube2sky pod=kube-dns-v11-f2f4a_kube-system(27d70b7c-36f9-11e6-b4fe-fa163ee85c45)"
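
To see why kube2sky keeps crashing, its own log can be pulled directly (pod and container names from the output above):

kubectl logs kube-dns-v11-f2f4a kube2sky --namespace=kube-system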

If you need more information:


$ kubectl describe rc  --namespace=kube-system
Name:       kube-dns-v11
Namespace:  kube-system
Image(s):   our.registry/gcr.io/google_containers/etcd-amd64:2.2.1,our.registry/gcr.io/google_containers/kube2sky:1.14,our.registry/gcr.io/google_containers/skydns:2015-10-13-8c72f8c,our.registry/gcr.io/google_containers/exechealthz:1.0
Selector:   k8s-app=kube-dns,version=v11
Labels:     k8s-app=kube-dns,kubernetes.io/cluster-service=true,version=v11
Replicas:   1 current / 1 desired
Pods Status:    1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Volumes:
  etcd-storage:
    Type:   EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium: 
Events:
  FirstSeen LastSeen    Count   From                SubobjectPath   Type        Reason          Message
  --------- --------    -----   ----                -------------   --------    ------          -------
  19m       19m     1   {replication-controller }           Normal      SuccessfulCreate    Created pod: kube-dns-v11-f2f4a

-------------------------------------------------------

$ kubectl get all --all-namespaces 
NAMESPACE     NAME                 DESIRED        CURRENT            AGE
kube-system   kube-dns-v11         1              1                  24m
NAMESPACE     NAME                 CLUSTER-IP     EXTERNAL-IP        PORT(S)         AGE
default       kubernetes           172.16.0.1     <none>             443/TCP         27m
kube-system   kube-dns             172.16.0.99    <none>             53/UDP,53/TCP   24m
NAMESPACE     NAME                 READY          STATUS             RESTARTS        AGE
default       busybox              1/1            Running            0               23m
kube-system   kube-dns-v11-f2f4a   3/4            CrashLoopBackOff   9               24m

If someone can help me understand the problem...

Note: I'm using the rc and svc definitions from https://github.com/kubernetes/kubernetes/tree/release-1.2/cluster/addons/dns, where I only changed (see the sketch after this list):

  • clusterIP set to a valid IP in my Kubernetes service IP range
  • cluster domain: kubernetes.local
  • cluster dns: 172.16.0.99
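
Roughly, the changes look like this (a sketch assuming the stock skydns-svc.yaml.in and skydns-rc.yaml.in templates from that directory, with the pillar placeholders substituted by hand):

# skydns-svc.yaml: pin the service to a fixed IP in the service range
spec:
  clusterIP: 172.16.0.99

# skydns-rc.yaml: domain flags of the kube2sky and skydns containers
- --domain=kubernetes.local    # kube2sky args
- -domain=kubernetes.local.    # skydns args (note the trailing dot)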

EDIT: The problem with only 3/4 kube-dns containers becoming ready came from certificates. So I can confirm that SkyDNS is running now.

NAME                 READY          STATUS        RESTARTS        AGE
kube-dns-v11-c96d5   4/4            Running       0               9m
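
To confirm the server actually answers, it can be queried directly (IP and domain from above; assumes nslookup is available on the node):

nslookup kubernetes.default.svc.kubernetes.local 172.16.0.99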

Using cluster-api-tester:

kubectl logs --tail=80 kube-dns-v11-c96d5 kube2sky --namespace=kube-system
I0621 13:27:52.070730       1 kube2sky.go:462] Etcd server found: http://127.0.0.1:4001
I0621 13:27:53.073614       1 kube2sky.go:529] Using https://192.168.0.1:443 for kubernetes master
I0621 13:27:53.073632       1 kube2sky.go:530] Using kubernetes API <nil>
I0621 13:27:53.074020       1 kube2sky.go:598] Waiting for service: default/kubernetes
I0621 13:27:53.166188       1 kube2sky.go:660] Successfully added DNS record for Kubernetes service.

But other problems appear:

  • now it logs "Using kubernetes API <nil>" instead of the expected version
  • the busybox example from the Kubernetes documentation still won't resolve kubernetes.local names (the exact check is shown below)
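
The check, for reference (busybox pod from the listing above, following the documentation's DNS example):

kubectl exec busybox -- nslookup kubernetes.default
kubectl exec busybox -- nslookup kubernetes.default.svc.kubernetes.local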

I will investigate further. But the SkyDNS startup issue is resolved. Thanks.


Solution

  • One of your DNS containers isn't ready. That's what "READY 3/4" means.

    Your best bet is to use the kubectl logs <pod> <container> command to get the logs of the container that is failing. You can add the --previous flag if you need the logs from a container instance that has already crashed and restarted.
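
    For example, with the pod from the question, the logs of the previous (crashed) kube2sky instance would be something like:

    kubectl logs --previous kube-dns-v11-f2f4a kube2sky --namespace=kube-system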

    Hopefully that will give you the information necessary to debug why that container isn't coming up.