kubernetes, google-kubernetes-engine, devops, kubernetes-pod, cost-management

How to configure K8s cluster to utilize spare CPU capacity for ML training jobs (or other low-priority CPU-intensive work)


I'd like to use spare CPU capacity in our Kubernetes cluster for low-priority jobs -- specifically ML training with TensorFlow in this case -- without depriving the higher-priority services on our cluster of CPU when they suddenly spike, akin to how one would do it with OS process priority. Currently we configure our autoscaler to add more nodes if CPU usage exceeds 60%, which means as much as 40% of our CPU capacity sits idle at all times.
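
(For reference, a node autoscaling policy like the one described above can be expressed on GKE's underlying GCE managed instance group roughly as follows; the group name, zone, and replica cap are illustrative, not from the original question:)

    gcloud compute instance-groups managed set-autoscaling my-node-group \
        --zone us-central1-a \
        --max-num-replicas 20 \
        --target-cpu-utilization 0.6   # add nodes above 60% average CPU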

Questions: (1) Is this possible with K8s? After some experimentation it seems that Pod priority is not quite the same thing, as my lower-priority deployment does not instantly yield CPU back to my higher-priority deployment. (2) If it isn't possible, is there another generally used strategy for utilizing intentionally over-provisioned CPU capacity that still yields immediately to higher-priority services?


Solution

  • According to https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/resource-qos.md#qos-classes

    In an overcommitted system (where sum of limits > machine capacity) containers might eventually have to be killed, for example if the system runs out of CPU or memory resources. Ideally, we should kill containers that are less important. For each resource, we divide containers into 3 QoS classes: Guaranteed, Burstable, and Best-Effort, in decreasing order of priority.

You can do something like this:

Set high to Guaranteed by specifying limits (when requests are omitted, they default to the limits, which gives the pod the Guaranteed class):

containers:
    - name: high
      resources:
        limits:
          cpu: 8000m
          memory: 8Gi
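
    For a self-contained test, that fragment sits inside a full Pod manifest, roughly like this minimal sketch (the image is a placeholder I've added, not from the original answer):

    apiVersion: v1
    kind: Pod
    metadata:
      name: high
    spec:
      containers:
      - name: high
        image: nginx   # placeholder image; substitute your real service image
        resources:
          limits:        # limits only: requests default to these values,
            cpu: 8000m   # so the pod gets the Guaranteed QoS class
            memory: 8Gi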
    

Set ml-job to Best-Effort by specifying no resource requests or limits at all:

containers:
    - name: ml-job
    

I'm not sure whether your ml-job can tolerate being killed: Best-Effort pods are the first to be evicted when a node runs out of resources. If it can't, then this strategy might not be suitable for you.
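
    For CPU specifically, this should behave close to the OS-level process priority you described: CPU is a compressible resource, so containers are throttled rather than killed under CPU contention, and the kubelet maps CPU requests to cgroup CPU shares, leaving a Best-Effort pod with only a tiny share that yields the CPU as soon as the Guaranteed pods spike. You can check which class Kubernetes actually assigned via the pod status (pod names here are the illustrative high and ml-job from above):

    kubectl get pod high -o jsonpath='{.status.qosClass}'     # prints Guaranteed
    kubectl get pod ml-job -o jsonpath='{.status.qosClass}'   # prints BestEffort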