
Kubernetes PetSet DNS not working


I have a Kubernetes PetSet with name == elasticsearch and serviceName == es. It does create pods and, as expected, they have names like elasticsearch-0 and elasticsearch-1. However, DNS does not seem to be working: elasticsearch-0.es does not resolve (nor does elasticsearch-0.default, etc.). And if you look at the generated SRV records, the pet hostnames look like random hashes instead of the predictable pet names:

# nslookup -type=srv elasticsearch
Server:        10.1.0.2
Address:    10.1.0.2#53

elasticsearch.default.svc.cluster.local    service = 10 100 0 9627d60e.elasticsearch.default.svc.cluster.local.

Anyone have any ideas?


Details

Here's the actual PetSet and Service definition:

---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  labels:
    app: elasticsearch
spec:
  ports:
  - name: rest
    port: 9200
  - name: native
    port: 9300
  clusterIP: None
  selector:
    app: elasticsearch
---
apiVersion: apps/v1alpha1
kind: PetSet
metadata:
  name: elasticsearch
spec:
  serviceName: "es"
  replicas: 2
  template:
    metadata:
      labels:
        app: elasticsearch
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
    spec:
      terminationGracePeriodSeconds: 0
      containers:
      - name: elasticsearch
        image: 672129611065.dkr.ecr.us-west-2.amazonaws.com/elasticsearch:v1
        ports:
          - containerPort: 9200
          - containerPort: 9300
        volumeMounts:
        - name: es-data
          mountPath: /usr/share/elasticsearch/data
        env:
          - name: POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: ES_CLUSTER_NAME
            value: EsEvents
  volumeClaimTemplates:
  - metadata:
      name: es-data
      annotations:
        volume.alpha.kubernetes.io/storage-class: anything
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi

Solution

  • This was an issue of me mis-reading the documentation. The docs say:

    The network identity has 2 parts. First, we created a headless Service that controls the domain within which we create Pets. The domain managed by this Service takes the form: $(service name).$(namespace).svc.cluster.local, where “cluster.local” is the cluster domain. As each pet is created, it gets a matching DNS subdomain, taking the form: $(petname).$(governing service domain), where the governing service is defined by the serviceName field on the Pet Set.

    I took this to mean that the value of the serviceName field is itself the "governing service domain", but that's not what it means. It means that the value of serviceName must match the name of an existing headless Service, and that Service will be used as the governing service domain. If no such Service exists you don't get an error - you just get random DNS names for your pets.
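
    So the fix is either to rename the headless Service to es, or to point serviceName at the Service that actually exists. Here is a minimal sketch of the second option (only the governing-service field changes; the pod template and volume claims stay as in the manifest above):

    ---
    apiVersion: apps/v1alpha1
    kind: PetSet
    metadata:
      name: elasticsearch
    spec:
      # serviceName must match the metadata.name of an existing
      # headless Service (clusterIP: None); that Service then
      # becomes the governing service domain for the pets.
      serviceName: "elasticsearch"
      replicas: 2
      # template and volumeClaimTemplates are unchanged from above

    With that in place, each pet gets a name of the form $(petname).$(service name).$(namespace).svc.cluster.local, so elasticsearch-0 should resolve at elasticsearch-0.elasticsearch.default.svc.cluster.local.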