We have a service hosted in AKS which uses RWO volumes, with the Deployment strategy set to Recreate. We recently went live with this new service and we have many features/issues to deliver every day. Since the deployment strategy is Recreate, the business team is experiencing some downtime (2 min max), but it is annoying. Is there a better approach to managing RWO volumes with a rolling update strategy?
You have two strategies to choose from when specifying how your Deployments are updated:
Recreate Deployment: All existing Pods are killed before new ones are created.
Rolling Update Deployment: The Deployment updates Pods in a rolling update fashion.
The default and recommended one is .spec.strategy.type==RollingUpdate. See the examples below:
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
In this example there would be at most one additional Pod (maxSurge: 1) above the desired number of 2, and the number of available Pods cannot go lower than that desired number (maxUnavailable: 0).
With this config, Kubernetes will spin up an additional Pod, then stop an “old” one. If there’s another Node available to schedule this Pod on, the system will be able to handle the same workload during the deployment. If not, the Pod will be deployed on an already used Node, at the cost of resources for the other Pods hosted on that Node.
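To put that strategy in context with your setup, below is a minimal sketch of a full Deployment that mounts an RWO volume. The app name my-service, the image, the mount path and the PVC name my-service-pvc are hypothetical placeholders, not taken from your environment. Also keep in mind that a ReadWriteOnce volume can only be attached to one node at a time, so the surge Pod can only mount it if it is scheduled onto the same node as the Pod it replaces.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service                 # hypothetical name
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
        - name: my-service
          image: myregistry.azurecr.io/my-service:1.0.0   # hypothetical image
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: data
              mountPath: /var/data                        # hypothetical mount path
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: my-service-pvc                     # assumes an existing RWO PVC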
You can also try something like this:
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
With the example above there would be no additional Pods (maxSurge: 0) and only a single Pod at a time would be unavailable (maxUnavailable: 1).
In this case, Kubernetes will first stop a Pod before starting up a new one. The advantage is that the infrastructure doesn’t need to scale up, but the maximum workload the service can handle during the update will be lower.
If you choose to use percentage values for maxSurge and maxUnavailable, you need to remember that:
maxSurge - the absolute number is calculated from the percentage by rounding up
maxUnavailable - the absolute number is calculated from the percentage by rounding down
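For example, with replicas: 2 the rounding would work out like this (a quick illustration only, not taken from your manifest):

spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # 25% of 2 = 0.5, rounded up to 1 extra Pod
      maxUnavailable: 25%  # 25% of 2 = 0.5, rounded down to 0 unavailable Pods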
With the RollingUpdate defined correctly, you also have to make sure your applications provide endpoints that Kubernetes can query to determine the app’s status. Below is a /greeting endpoint that returns an HTTP 200 status when the app is ready to handle requests, and HTTP 500 when it’s not:
readinessProbe:
  httpGet:
    path: /greeting
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  successThreshold: 1
  timeoutSeconds: 1
initialDelaySeconds - Time (in seconds) before the first readiness check is performed.
periodSeconds - Time (in seconds) between two readiness checks after the first one.
successThreshold - Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness. Minimum value is 1.
timeoutSeconds - Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1.
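For completeness, the readinessProbe sits under the container definition in the Deployment’s Pod template. A minimal sketch, reusing the hypothetical my-service container from the earlier example:

template:
  spec:
    containers:
      - name: my-service
        image: myregistry.azurecr.io/my-service:1.0.0   # hypothetical image
        ports:
          - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /greeting
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 1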
More on the topic of liveness/readiness probes can be found in the Kubernetes documentation.
These are only examples, but they should give you an idea of how the RollingUpdate strategy can be configured to eliminate downtime during deployments.