How does Horizontal Pod Autoscaler (HPA) evaluate metrics for pods with sidecar containers?

I'm currently navigating the intricacies of Horizontal Pod Autoscaler (HPA) behavior in the context of pods housing a sidecar container. Here's the context:

I've set up a pod serving images, paired with a sidecar container running nginx for caching. Both containers have their own resource requests defined. While the system is performing well, I'm seeking clarity on the HPA's scaling decision-making process.

Currently, the HPA triggers scaling when CPU request usage hits 80%. However, I'm unsure whether the HPA evaluates metrics for each container independently or if it considers both containers' metrics collectively.

In essence, does the HPA analyze metrics for individual containers separately? If so, how does it prioritize their evaluation? Alternatively, does it treat both containers as a single unit when assessing CPU request consumption?

I'd appreciate any insights into the HPA's metric evaluation approach in this specific setup. Thank you for your help!

Solution

Per Kubernetes official documentation.

The HorizontalPodAutoscaler API also supports a container metric source where the HPA can track the resource usage of individual containers across a set of Pods, in order to scale the target resource. This lets you configure scaling thresholds for the containers that matter most in a particular Pod. For example, if you have a web application and a logging sidecar, you can scale based on the resource use of the web application, ignoring the sidecar container and its resource use.

If you revise the target resource to have a new Pod specification with a different set of containers, you should revise the HPA spec if that newly added container should also be used for scaling. If the specified container in the metric source is not present or only present in a subset of the pods then those pods are ignored and the recommendation is recalculated.

To use container resources for autoscaling define a metric source as follows, an example where the HPA controller scales the target such that the average utilization of the cpu in the application container of all the pods is 60%.:

type: ContainerResource
containerResource:
  name: cpu
  container: application
  target:
    type: Utilization
    averageUtilization: 60

Additional info with algorithm used by HPA