Tags: kubernetes, google-kubernetes-engine, google-container-registry

Auto-scaling takes a long time to bring up new pods and gives connection errors in Google Container Engine


I used the following command to set up autoscaling:

kubectl autoscale deployment catch-node --cpu-percent=50 --min=1 --max=10
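(Side note for readers: a `--cpu-percent` target is computed relative to the CPU each pod *requests*, so the deployment must set `resources.requests.cpu` for this autoscaler to function. A minimal sketch of the relevant fragment; the `100m` value is illustrative, not taken from the question:)

```yaml
# Illustrative fragment of the catch-node deployment's pod spec; the HPA's
# --cpu-percent=50 target is measured against this requested CPU.
spec:
  containers:
  - name: catch-node
    resources:
      requests:
        cpu: 100m   # with a 50% target, scaling triggers above ~50m average usage per pod
```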

The autoscaler status during my load test was as follows.

27th minute

NAME         REFERENCE                     TARGET    CURRENT   MINPODS   MAXPODS   AGE
catch-node   Deployment/catch-node/scale   50%       20%      1         10        27m

NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
catch-node   1         1         1            1           27m

29th minute

NAME         REFERENCE                     TARGET    CURRENT   MINPODS   MAXPODS   AGE
catch-node   Deployment/catch-node/scale   50%       35%      1         10        29m

NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
catch-node   1         1         1            1           29m

31st minute

NAME         REFERENCE                     TARGET    CURRENT   MINPODS   MAXPODS   AGE
catch-node   Deployment/catch-node/scale   50%       55%      1         10        31m

NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
catch-node   1         1         1            1           31m

34th minute

NAME         REFERENCE                     TARGET    CURRENT   MINPODS   MAXPODS   AGE
catch-node   Deployment/catch-node/scale   50%       190%      1         10        34m

NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
catch-node   4         4         4            4           34m

I am getting connection refused errors during the transition from 1 pod to 4 pods. How long does it take to bring up new pods once CPU usage exceeds the percentage given during autoscale? And is there any way to reduce this time? Once all of the new pods are up, the issue goes away. Thanks in advance.


Solution

  • As documented in this doc, two factors affect the reaction time of the autoscaler:

    1. --horizontal-pod-autoscaler-sync-period, which defines how often the autoscaler checks the status of the controlled resources. The default value is 30s, and it can be changed via a flag on the controller-manager.

    2. upscaleForbiddenWindow, which defines how often the autoscaler is allowed to scale up a resource. The default value is 3 minutes, and it is currently not adjustable.

    According to the output you pasted, if the load was stable, the autoscaler should have reacted within 30s after CPU usage reached 55%. Was that the case?
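    The jump straight from 1 to 4 replicas also follows from the autoscaler's scaling rule, desired = ceil(currentReplicas × currentUtilization / targetUtilization). A quick sketch with the numbers from the 34th-minute reading:

    ```shell
    # The HPA computes desired replicas as ceil(current * currentUtil / targetUtil).
    current_replicas=1
    current_util=190   # CURRENT column at the 34th minute
    target_util=50     # from --cpu-percent=50
    # integer ceiling division: (a + b - 1) / b
    desired=$(( (current_replicas * current_util + target_util - 1) / target_util ))
    echo "$desired"    # prints 4, matching DESIRED=4 in the output above
    ```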