I have a GKE cluster of 5 nodes in the same zone. I'm trying to deploy an Elasticsearch StatefulSet of 3 nodes in the kube-system namespace, but every time I do, the StatefulSet gets deleted and the pods go into the Terminating state immediately after the second pod is created.
I tried to check the pod logs and to describe the pod for any information but found nothing useful.
I even checked the GKE cluster logs, where I found the deletion request, but with no extra information about who is initiating it or why it is happening.
When I changed the namespace to default everything was fine and the pods were in the ready state.
Below is the manifest file I'm using for this deployment.
# RBAC authn and authz
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    # addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    # addonmanager.kubernetes.io/mode: Reconcile
rules:
  - apiGroups:
      - ""
    resources:
      - "services"
      - "namespaces"
      - "endpoints"
    verbs:
      - "get"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: kube-system
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    # addonmanager.kubernetes.io/mode: Reconcile
subjects:
  - kind: ServiceAccount
    name: elasticsearch-logging
    namespace: kube-system
    apiGroup: ""
roleRef:
  kind: ClusterRole
  name: elasticsearch-logging
  apiGroup: ""
---
# Elasticsearch deployment itself
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    version: 7.16.2
    kubernetes.io/cluster-service: "true"
    # addonmanager.kubernetes.io/mode: Reconcile
spec:
  serviceName: elasticsearch-logging
  replicas: 2
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      k8s-app: elasticsearch-logging
      version: 7.16.2
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: 7.16.2
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccountName: elasticsearch-logging
      containers:
        - image: docker.elastic.co/elasticsearch/elasticsearch:7.16.2
          name: elasticsearch-logging
          resources:
            # need more cpu upon initialization, therefore burstable class
            limits:
              cpu: 1000m
            requests:
              cpu: 100m
          ports:
            - containerPort: 9200
              name: db
              protocol: TCP
            - containerPort: 9300
              name: transport
              protocol: TCP
          volumeMounts:
            - name: elasticsearch-logging
              mountPath: /data
          env:
            #Added by Nour
            - name: discovery.seed_hosts
              value: elasticsearch-master-headless
            - name: "NAMESPACE"
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
      volumes:
        - name: elasticsearch-logging
          # emptyDir: {}
      # Elasticsearch requires vm.max_map_count to be at least 262144.
      # If your OS already sets up this number to a higher value, feel free
      # to remove this init container.
      initContainers:
        - image: alpine:3.6
          command: ["/sbin/sysctl", "-w", "vm.max_map_count=262144"]
          name: elasticsearch-logging-init
          securityContext:
            privileged: true
  volumeClaimTemplates:
    - metadata:
        name: elasticsearch-logging
      spec:
        storageClassName: "standard"
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 30Gi
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    # addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Elasticsearch"
spec:
  type: NodePort
  ports:
    - port: 9200
      protocol: TCP
      targetPort: db
      nodePort: 31335
  selector:
    k8s-app: elasticsearch-logging
#Added by Nour
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: elasticsearch-master
  name: elasticsearch-master
  namespace: kube-system
spec:
  ports:
    - name: http
      port: 9200
      protocol: TCP
      targetPort: 9200
    - name: transport
      port: 9300
      protocol: TCP
      targetPort: 9300
  selector:
    app: elasticsearch-master
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: elasticsearch-master
  name: elasticsearch-master-headless
  namespace: kube-system
spec:
  ports:
    - name: http
      port: 9200
      protocol: TCP
      targetPort: 9200
    - name: transport
      port: 9300
      protocol: TCP
      targetPort: 9300
  clusterIP: None
  selector:
    app: elasticsearch-master
Below are the available namespaces:
$ kubectl get ns
NAME              STATUS   AGE
default           Active   4d15h
kube-node-lease   Active   4d15h
kube-public       Active   4d15h
kube-system       Active   4d15h
Am I using an outdated API version that might be causing the issue?
Thank you.
To close this out, I think it makes sense to paste the final answer here.
I understand your curiosity. My guess is that GCP has started preventing people from deploying workloads to the kube-system namespace, since doing so risks interfering with GKE. I have never tried to deploy anything to kube-system before, so I'm not sure whether it was always like this or whether we just changed it.
Overall, I recommend avoiding deploying anything into the kube-system namespace in GKE.
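As a rough sketch of that recommendation, the manifest could target a dedicated namespace instead of kube-system. The namespace name `logging` below is a hypothetical choice, not something from the original post. Dropping the `kubernetes.io/cluster-service: "true"` label is also an assumption on my part: that label is used by the Kubernetes addon manager to select objects it manages, which may be related to the deletions, but nothing in this thread confirms it.

```yaml
# Hypothetical dedicated namespace (the name "logging" is an assumption).
apiVersion: v1
kind: Namespace
metadata:
  name: logging
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch-logging
  namespace: logging            # was: kube-system
  labels:
    k8s-app: elasticsearch-logging
    # kubernetes.io/cluster-service label removed here (assumption, see above)
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: elasticsearch-logging   # cluster-scoped, so no namespace in metadata
subjects:
  - kind: ServiceAccount
    name: elasticsearch-logging
    namespace: logging          # must match the ServiceAccount's namespace
    apiGroup: ""
roleRef:
  kind: ClusterRole
  name: elasticsearch-logging
  apiGroup: ""
```

The same `namespace` change would apply to the StatefulSet and to each of the three Services, so that every namespaced object in the manifest lives in the same place. This matches what the question already observed: the deployment works once it is moved out of kube-system.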