Search code examples
mesosairflow

Customize task resources on Airflow using MesosExecutor


Is it possible to specify resources (CPU, memory, GPU, disk space) for each operator of a DAG when using MesosExecutor?

I know you can specify global values for resources of a task.

For instance, I have several operators that are CPU expensive and others that not. I would like to execute one at a time of the first, but many in parallel of the non CPU expensive ones.


Solution

  • From the code (mesos_executor.py line 67), it seems that is not possible since cpu and memory values are passed to the Scheduler during initialization:

        def __init__(self,
                 task_queue,
                 result_queue,
                 task_cpu=1,
                 task_mem=256):
        self.task_queue = task_queue
        self.result_queue = result_queue
        self.task_cpu = task_cpu
        self.task_mem = task_mem
    

    and those values are used without modification:

    cpus = task.resources.add()
                cpus.name = "cpus"
                cpus.type = mesos_pb2.Value.SCALAR
                cpus.scalar.value = self.task_cpu
    
                mem = task.resources.add()
                mem.name = "mem"
                mem.type = mesos_pb2.Value.SCALAR
                mem.scalar.value = self.task_mem
    

    It requires a custom Executor implementation to achieve that