Tags: dns, kubernetes, skydns, kubernetes-health-check

SkyDNS MissingClusterDNS in pods


I installed Kubernetes 1.2.4 on three RHEL 7 servers (no internet access; everything is pushed by Ansible).

EDIT: See the end of the question

I've got everything working except the kube-dns example given in the documentation. I've run several tests with several configurations and recreated the pods from scratch... and I always get this "MissingClusterDNS" error:

20m     20m     2   {kubelet k8s-minion-1.XXXXXX}                   Warning     MissingClusterDNS   kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy.
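
This event shows up in the kubectl describe pod output of affected pods; the busybox pod listed further down is one way to see it:

kubectl describe pod busybox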

As you can see, kube-dns is running:

kubectl get svc kube-dns --namespace=kube-system
NAME       CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
kube-dns   172.16.0.99    <none>        53/UDP,53/TCP   15m

And kubelet has the correct options:

KUBELET_ARGS=" --cluster-dns=172.16.0.99 --cluster-domain=kubernetes.local "
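
Note that the kubelet must be restarted for new flags to take effect; on these RHEL 7 hosts that means something like (the systemd unit name is an assumption):

systemctl restart kubelet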

Proof:

ps ax | grep kubelet
 6077 ?        Ssl    0:07 /opt/kubernetes/bin/kubelet --logtostderr=true --v=0 --address=0.0.0.0 --port=10250 --hostname-override=k8s-minion-1.XXXXXX --api-servers=http://k8s-master.XXXXXX:8080 --allow-privileged=false  --cluster-dns=172.16.0.99 --cluster-domain=kubernetes.local

But the DNS pod has one container that is not running:

kubectl get pods  --namespace=kube-system
NAME                 READY     STATUS             RESTARTS   AGE
kube-dns-v11-f2f4a   3/4       CrashLoopBackOff   7          18m

And the events are explicit:

Warning Unhealthy   Readiness probe failed: Get http://172.16.23.2:8081/readiness: dial tcp 172.16.23.2:8081: connection refused
...
Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "kube2sky" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube2sky pod=kube-dns-v11-f2f4a_kube-system(27d70b7c-36f9-11e6-b4fe-fa163ee85c45)"
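
To see why kube2sky keeps crashing, its own log can be pulled directly (pod and container names from the output above):

kubectl logs kube-dns-v11-f2f4a kube2sky --namespace=kube-system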

If you need more information:


$ kubectl describe rc  --namespace=kube-system
Name:       kube-dns-v11
Namespace:  kube-system
Image(s):   our.registry/gcr.io/google_containers/etcd-amd64:2.2.1,our.registry/gcr.io/google_containers/kube2sky:1.14,our.registry/gcr.io/google_containers/skydns:2015-10-13-8c72f8c,our.registry/gcr.io/google_containers/exechealthz:1.0
Selector:   k8s-app=kube-dns,version=v11
Labels:     k8s-app=kube-dns,kubernetes.io/cluster-service=true,version=v11
Replicas:   1 current / 1 desired
Pods Status:    1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Volumes:
  etcd-storage:
    Type:   EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium: 
Events:
  FirstSeen LastSeen    Count   From                SubobjectPath   Type        Reason          Message
  --------- --------    -----   ----                -------------   --------    ------          -------
  19m       19m     1   {replication-controller }           Normal      SuccessfulCreate    Created pod: kube-dns-v11-f2f4a

-------------------------------------------------------

$ kubectl get all --all-namespaces 
NAMESPACE     NAME                 DESIRED        CURRENT            AGE
kube-system   kube-dns-v11         1              1                  24m
NAMESPACE     NAME                 CLUSTER-IP     EXTERNAL-IP        PORT(S)         AGE
default       kubernetes           172.16.0.1     <none>             443/TCP         27m
kube-system   kube-dns             172.16.0.99    <none>             53/UDP,53/TCP   24m
NAMESPACE     NAME                 READY          STATUS             RESTARTS        AGE
default       busybox              1/1            Running            0               23m
kube-system   kube-dns-v11-f2f4a   3/4            CrashLoopBackOff   9               24m

If someone can help me understand the problem...

Note: I'm using the rc and svc definitions from https://github.com/kubernetes/kubernetes/tree/release-1.2/cluster/addons/dns, where I only changed (see the sketch after this list):

  • clusterIP set to a valid IP in my Kubernetes service IP range
  • cluster domain: kubernetes.local
  • cluster dns: 172.16.0.99
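
Roughly, the changes look like this (a sketch assuming the stock skydns-svc.yaml.in and skydns-rc.yaml.in templates from that directory, with the pillar placeholders substituted by hand):

# skydns-svc.yaml: pin the service to a fixed IP in the service range
spec:
  clusterIP: 172.16.0.99

# skydns-rc.yaml: domain flags of the kube2sky and skydns containers
- --domain=kubernetes.local    # kube2sky args
- -domain=kubernetes.local.    # skydns args (note the trailing dot)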

EDIT: The problem with only 3/4 kube-dns containers becoming ready came from certificates. So I can confirm that SkyDNS is running now.

NAME                 READY          STATUS        RESTARTS        AGE
kube-dns-v11-c96d5   4/4            Running       0               9m
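
To confirm the server actually answers, it can be queried directly (IP and domain from above; assumes nslookup is available on the node):

nslookup kubernetes.default.svc.kubernetes.local 172.16.0.99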

Using cluster-api-tester:

kubectl logs --tail=80 kube-dns-v11-c96d5 kube2sky --namespace=kube-system
I0621 13:27:52.070730       1 kube2sky.go:462] Etcd server found: http://127.0.0.1:4001
I0621 13:27:53.073614       1 kube2sky.go:529] Using https://192.168.0.1:443 for kubernetes master
I0621 13:27:53.073632       1 kube2sky.go:530] Using kubernetes API <nil>
I0621 13:27:53.074020       1 kube2sky.go:598] Waiting for service: default/kubernetes
I0621 13:27:53.166188       1 kube2sky.go:660] Successfully added DNS record for Kubernetes service.

But other problems appear:

  • now it logs "Using kubernetes API <nil>" instead of the expected version
  • the busybox example from the Kubernetes documentation still won't resolve kubernetes.local names (the exact check is shown below)
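
The check, for reference (busybox pod from the listing above, following the documentation's DNS example):

kubectl exec busybox -- nslookup kubernetes.default
kubectl exec busybox -- nslookup kubernetes.default.svc.kubernetes.local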

I will investigate further. But the SkyDNS startup issue is resolved. Thanks.


Solution

  • One of your DNS containers isn't ready. That's what "READY 3/4" means.

    Your best bet is to use the kubectl logs <pod> <container> command to get the logs of the container that is failing. You can add the --previous flag if you need the logs from a container instance that has already crashed and restarted.
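
    For example, with the pod from the question, the logs of the previous (crashed) kube2sky instance would be something like:

    kubectl logs --previous kube-dns-v11-f2f4a kube2sky --namespace=kube-system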

    Hopefully that will give you the information necessary to debug why that container isn't coming up.