Half of My Kubernetes Cluster Is Never Used, Due to the Way Pods Are Scheduled

I have a Kubernetes cluster with 4 nodes and, ideally, my application should have 4 replicas, evenly distributed to each node. However, when pods are scheduled, they almost always end up on only two of the nodes, or if I'm very lucky, on 3 of the 4. My app as quite a bit of traffic and I would really want to use all the resources that I pay for.

I suspect the reason why this happens is that Kubernetes tries to schedule the new pods on the nodes that have the most available resources, which is nice as a concept, but it would be even nicer if it would reschedule the pods once the old nodes become available again.

What options do I have? Thanks.

Solution

You have lots of options!

First and foremost: Pod Affinity and Anti-affinity to make sure your Pod prefer to be placed on a host that does not already have a Pod with the same label.

Second, you could set up Pod Topology Spread Constraints. This is newer and a bit more advanced, but usually a better solution that simple anti-affinity.

Thirdly, you can pin your Pods to a specific node using a NodeSelector.

Finally, you could write your own scheduler or modify the default scheduler settings, but that's a bit more advanced topic. Don't forget to always set your resource requests correctly, these should be set to a value that more or less encapsulates the usage during peak traffic, to make sure that a node has enough resources available to max out the Pod without interfering with other Pods.