
Kubernetes - force restarting on specific memory usage


Our server runs on Kubernetes with auto-scaling, and we use New Relic for observability, but we face some issues:

1- We need pods to restart when memory usage reaches 1G; currently they only restart automatically once usage reaches 1.2G, and by that point everything is running slowly.

2- We want pods to be terminated when there are no requests to the server.

My configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
  labels:
    app: {{ .Release.Name }}
spec:
  revisionHistoryLimit: 2
  replicas: {{ .Values.replicas }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}
    spec:
      containers:
        - name: {{ .Release.Name }}
          image: "{{ .Values.imageRepository }}:{{ .Values.tag }}"
          env:
            {{- include "api.env" . | nindent 12 }}
          resources:
            limits:
              memory: {{ .Values.memoryLimit }}
              cpu: {{ .Values.cpuLimit }}
            requests:
              memory: {{ .Values.memoryRequest }}
              cpu: {{ .Values.cpuRequest }}
      imagePullSecrets:
        - name: {{ .Values.imagePullSecret }}      
      {{- if .Values.tolerations }}
      tolerations:
{{ toYaml .Values.tolerations | indent 8 }}
      {{- end }}
      {{- if .Values.nodeSelector }}
      nodeSelector:
{{ toYaml .Values.nodeSelector | indent 8 }}
      {{- end }}

My values file:

memoryLimit: "2Gi"
cpuLimit: "1.0"
memoryRequest: "1.0Gi"
cpuRequest: "0.75"

That's what I am trying to achieve.



Solution

  • If you want to be sure your pod/deployment won't consume more than 1.0Gi of memory, then setting memoryLimit to 1Gi will do the job just fine.

    Once you set that limit and your container exceeds it, it becomes a potential candidate for termination. If it keeps consuming memory beyond its limit, the container will be terminated. If a terminated container can be restarted, the kubelet restarts it, as with any other type of runtime container failure.

    For further reading, please visit the section Exceeding a container's memory limit. With your Helm chart, that only means changing the values file, as sketched below.
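
    A minimal sketch of that change (the 750Mi request is an assumption; tune it for your workload, keeping the request at or below the limit):

    memoryLimit: "1Gi"      # the container becomes an OOM-kill candidate once it crosses 1Gi
    cpuLimit: "1.0"
    memoryRequest: "750Mi"  # assumption: leave some headroom between request and limit
    cpuRequest: "0.75"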

    Moving on, if you wish to scale your deployment based on requests, you will need custom metrics provided by an external adapter such as the Prometheus Adapter. The Horizontal Pod Autoscaler natively supports scaling based only on CPU and memory (using the metrics from the Metrics Server).
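
    For the native case, a minimal memory-based HPA could look like the sketch below (assuming Kubernetes 1.23+ for autoscaling/v2; the names, the 80% target, and the replica bounds are placeholders):

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: api-memory-hpa            # hypothetical name
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: my-api                  # hypothetical: your Release name
      minReplicas: 1
      maxReplicas: 10
      metrics:
        - type: Resource
          resource:
            name: memory
            target:
              type: Utilization
              averageUtilization: 80  # scale out when average usage passes 80% of the memory request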

    The adapter documentation provides a walkthrough of how to configure it with the Kubernetes API and the HPA. The list of other adapters can be found here.
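
    As an illustration, a prometheus-adapter rule that turns a request counter into a per-second rate could look roughly like this (a sketch; the http_requests_total series name is an assumption about your app's instrumentation):

    rules:
      - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'   # assumes your app exports this counter
        resources:
          overrides:
            namespace: {resource: "namespace"}
            pod: {resource: "pod"}
        name:
          matches: "^(.*)_total$"
          as: "${1}_per_second"                                     # exposed as http_requests_per_second
        metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'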

    Then you can scale your deployment based on the http_requests metric as shown here, or on requests-per-second as described here; a sketch follows below.
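
    A custom-metrics HPA consuming that metric might then look like this (again a sketch; the 100 requests-per-second target is an assumption):

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: api-requests-hpa                 # hypothetical name
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: my-api                         # hypothetical: your Release name
      minReplicas: 1                         # the stock HPA cannot scale to zero
      maxReplicas: 10
      metrics:
        - type: Pods
          pods:
            metric:
              name: http_requests_per_second   # the metric exposed by the adapter rule above
            target:
              type: AverageValue
              averageValue: "100"              # assumption: scale out above 100 req/s per pod

    Note that the stock HPA cannot scale a deployment to zero, so "terminate pods when there are no requests" effectively means scaling down to one replica; going all the way to zero requires something like KEDA or the alpha HPAScaleToZero feature gate.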