django, google-kubernetes-engine, gke-networking

Ingress health check failing on Django GKE Deployment


I am deploying a frontend (Nuxt) and a backend (Django) on GKE Autopilot, with a CI/CD pipeline set up from GitHub to GKE. The frontend deploys and is reachable at xpresstracker.app, but the backend returns a 501 server error at backend.xpresstracker.app. The relevant configuration and the error are below.

Error message

[Image: error message]

prodservice.yaml

apiVersion: v1
kind: Service
metadata:
  name: prod-xpressbackend-service
spec:
  selector:
    app: prod-xpressbackend
  ports:
    - protocol: TCP
      port: 8000
      targetPort: 8000
  type: NodePort

prodmanagedcert.yaml

apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: prod-managed-cert
spec:
  domains:
    - xpresstracker.app
    - backend.xpresstracker.app

prodingress.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prod-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: xpresstrackerip
    networking.gke.io/managed-certificates: prod-managed-cert
    kubernetes.io/ingress.class: "gce"
  labels:
    app: prod-xpressbackend
spec:
  rules:
    - host: xpresstracker.app
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: prod-xpressfrontend-service
                port:
                  number: 3000
    - host: backend.xpresstracker.app
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: prod-xpressbackend-service
                port:
                  number: 8000

Deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prod-xpressbackend-deployment
  labels:
    app: prod-xpressbackend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prod-xpressbackend
  template:
    metadata:
      labels:
        app: prod-xpressbackend
    spec:
      containers:
        - name: prod-xpressbackend-container
          image: someimage:production
          imagePullPolicy: Always
          resources:
            requests:
              memory: "200Mi"
              cpu:    "0.2"
            limits:
              memory: "512Mi"
              cpu:    "0.5"
          ports:
            - containerPort: 8000
          env:
            - name: SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: proddatabasecredentials
                  key: SECRET_KEY
            - name: DB_ENGINE
              valueFrom:
                secretKeyRef:
                  name: proddatabasecredentials
                  key: DB_ENGINE
            - name: DB_NAME
              valueFrom:
                secretKeyRef:
                  name: proddatabasecredentials
                  key: DB_NAME
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: proddatabasecredentials
                  key: DB_USER
            - name: DB_PASS
              valueFrom:
                secretKeyRef:
                  name: proddatabasecredentials
                  key: DB_PASS
            - name: DB_HOST
              valueFrom:
                secretKeyRef:
                  name: proddatabasecredentials
                  key: DB_HOST
            - name: DB_PORT
              valueFrom:
                secretKeyRef:
                  name: proddatabasecredentials
                  key: DB_PORT
            - name: DEBUG
              valueFrom:
                secretKeyRef:
                  name: proddatabasecredentials
                  key: DEBUG
        - name: cloud-sql-proxy
          image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:latest
          args:
            - "--port=5432"
            - "some database"
            - "--credentials-file=/secrets/service_account.json"
          securityContext:
            runAsNonRoot: true
          volumeMounts:
          - name: sqlconnect
            mountPath: /secrets/
            readOnly: true
          resources:
            requests:
              memory: "200Mi"
              cpu:    "0.2"
            limits:
              memory: "200Mi"
              cpu:    "0.2"
      volumes:
      - name: sqlconnect
        secret:
          secretName: sqlcredentials

What I tried:

Firewall Rule

I added a firewall rule called fw-allow-health-checks with:

  • a priority of 100
  • IP ranges 35.191.0.0/16, 130.211.0.0/22 and 0.0.0.0/0
  • all protocols and ports allowed

This did not work.
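
Since 0.0.0.0/0 matches every IPv4 address, it already includes the two Google health check ranges, so a rule with these ranges admits traffic from anywhere and the firewall is unlikely to be what blocks the probes. A small sketch with Python's standard `ipaddress` module that checks this containment (the helper name `redundant_ranges` is made up for illustration):

```python
import ipaddress

# Ranges taken from the firewall rule described above.
rule_ranges = ["35.191.0.0/16", "130.211.0.0/22", "0.0.0.0/0"]
networks = [ipaddress.ip_network(r) for r in rule_ranges]


def redundant_ranges(nets):
    """Return the networks that are fully contained in another network of the list."""
    return [
        n for n in nets
        if any(n != other and n.subnet_of(other) for other in nets)
    ]


covered = redundant_ranges(networks)
print([str(n) for n in covered])
```

Both health check ranges show up as covered by 0.0.0.0/0, which suggests looking at the application rather than at the firewall.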

200 response from the backend

I made sure that the backend root URL returns a 200 response when requested.

Snippet from views.py

from django.http import HttpResponse

def root_view(request):
    return HttpResponse(status=200)

Snippet from urls.py

from django.urls import path

from .views import root_view

urlpatterns = [
    path('', root_view, name='root'),
]
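
A 200 from a client that sends the domain name as the Host header does not guarantee the same result for a load balancer health check, which may send a different Host value. A minimal sketch, using only Python's standard library, of requesting the root view with an explicit Host header; the function name `probe` is hypothetical, and which Host value the GKE probe actually sends is an assumption to verify rather than something shown in the setup above:

```python
import http.client


def probe(address, port, host_header, path="/", timeout=5.0):
    """Send a GET request with an explicit Host header and return the status code."""
    conn = http.client.HTTPConnection(address, port, timeout=timeout)
    try:
        # Passing "Host" explicitly stops http.client from deriving it from the address.
        conn.request("GET", path, headers={"Host": host_header})
        return conn.getresponse().status
    finally:
        conn.close()
```

For example, with the backend port forwarded to 127.0.0.1:8000 (an assumed local setup), `probe("127.0.0.1", 8000, "backend.xpresstracker.app")` and `probe("127.0.0.1", 8000, "10.0.0.5")` (a made-up IP) can be compared. If the two status codes differ, the response depends on the Host header and not only on the view.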

Deleting resources

  • I have deleted and redeployed all resources, but nothing changed.

Solution

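  • The issue was Django rejecting the Host header of the health check request (strictly speaking a host validation issue rather than a CORS one). It was solved after I set

    ALLOWED_HOSTS = ['*']

    I suspect this was the cause because the health check requests the backend root URL and expects a 200 OK. If the Host header of that request is not listed in ALLOWED_HOSTS, Django answers with a 400 Bad Request before the view runs, so the health check fails even though the view itself returns 200.

    Up until this point I had only been testing with Postman, and everything seemed fine.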
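
`ALLOWED_HOSTS = ['*']` accepts any Host header. A narrower variant is sketched below for settings.py, assuming the allowed names are passed in through environment variables. The variable names `DJANGO_ALLOWED_HOSTS` and `POD_IP` and the helper `build_allowed_hosts` are hypothetical; `POD_IP` could, for instance, be filled from the pod's `status.podIP` field. Whether the health check really uses the pod IP as its Host header is an assumption that should be checked against the actual probe requests.

```python
import os


def build_allowed_hosts(env):
    """Build the ALLOWED_HOSTS list from environment-style key/value pairs."""
    # DJANGO_ALLOWED_HOSTS (hypothetical name): comma-separated host names.
    raw = env.get("DJANGO_ALLOWED_HOSTS", "backend.xpresstracker.app")
    hosts = [h.strip() for h in raw.split(",") if h.strip()]

    # POD_IP (hypothetical name): extra entry for probes that address the pod by IP.
    pod_ip = env.get("POD_IP", "").strip()
    if pod_ip and pod_ip not in hosts:
        hosts.append(pod_ip)
    return hosts


# In settings.py this replaces the wildcard entry.
ALLOWED_HOSTS = build_allowed_hosts(os.environ)
```

With neither variable set, the list contains only backend.xpresstracker.app, so requests with any other Host header are still rejected.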