kubernetes, cloud, mesos, job-scheduling, hybrid-cloud

State-of-the-art job scheduling (containers, hybrid cloud)?


We have an evaluation job that consists of several thousand invocations of a legacy binary with various inputs, each of which runs for about a minute. The individual runs are perfectly parallelizable (one instance per core).

What is the state of the art to do this in a hybrid cloud scenario?

Kubernetes itself does not seem to provide an interface for prioritizing or managing waiting jobs. Jenkins would handle these points well, but using it for this feels like a hack. Of course, we could build something ourselves, but the problem seems generic enough that an out-of-the-box solution should already exist.


Solution

  • There are many frameworks that help manage jobs in a Kubernetes cluster. The most popular are:

    • Argo, for orchestrating parallel jobs on Kubernetes. Argo Workflows is implemented as a Kubernetes CRD (Custom Resource Definition).
    • Airflow, which has a modular architecture and uses a message queue to orchestrate an arbitrary number of workers. Also take a look at its Kubernetes executor.

    I recommend watching this video, which describes each framework and can help you decide which one suits you better.
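
    To give a rough idea of how the Argo option could map onto the workload in the question, here is a minimal sketch that builds a Workflow manifest fanning out one pod per input file. It is only an illustration under assumptions: the image name, binary path, and input names are hypothetical placeholders, and build_workflow is a helper made up for this example, not part of Argo. The manifest fields used (parallelism, priority, withItems, and a CPU request of one core per pod) are Argo Workflows fields; priority only has an effect when the workflow controller is configured to limit how many workflows run at once. The script prints JSON, which is also valid YAML, so the output can be saved to a file and submitted with the Argo CLI or kubectl create.

```python
import json


def build_workflow(inputs, image, binary, parallelism=64, priority=0):
    """Return an Argo Workflow manifest (as a dict) that runs `binary` once per input."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "evaluation-"},
        "spec": {
            "entrypoint": "fan-out",
            # Upper bound on pods of this workflow running at the same time.
            "parallelism": parallelism,
            # Only relevant if the controller limits concurrently running workflows.
            "priority": priority,
            "templates": [
                {
                    # One step that is expanded into one pod per item.
                    "name": "fan-out",
                    "steps": [[
                        {
                            "name": "run",
                            "template": "run-binary",
                            "arguments": {
                                "parameters": [
                                    {"name": "input", "value": "{{item}}"}
                                ]
                            },
                            "withItems": list(inputs),
                        }
                    ]],
                },
                {
                    # Single invocation of the legacy binary on one core.
                    "name": "run-binary",
                    "inputs": {"parameters": [{"name": "input"}]},
                    "container": {
                        "image": image,
                        "command": [binary],
                        "args": ["{{inputs.parameters.input}}"],
                        "resources": {"requests": {"cpu": "1"}},
                    },
                },
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical inputs, image, and binary path; replace with real values.
    workflow = build_workflow(
        inputs=[f"case-{i:04d}.dat" for i in range(3)],
        image="registry.example.com/legacy-eval:latest",
        binary="/opt/legacy/evaluate",
        parallelism=8,
        priority=5,
    )
    print(json.dumps(workflow, indent=2))
```

    With several thousand inputs a single withItems list gets large, so in practice it may be worth splitting the inputs into batches, one workflow per batch.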