python, deep-learning, netflix-metaflow

Stop Metaflow from parallelising foreach steps


I recently started using Metaflow for my hyperparameter searches. I'm using a foreach for all my parameters as follows:

from metaflow import FlowSpec, step

class HPOFlow(FlowSpec):

    @step
    def start_hpo(self):
        # 'hpo_parameters' is an artifact holding the list of
        # hyperparameter configurations to fan out over.
        self.next(self.train_model, foreach='hpo_parameters')

    @step
    def train_model(self):
        # self.input holds the hyperparameter configuration for this split.
        # Trains model...

This works in that it starts the train_model step as intended, but unfortunately Metaflow tries to run all of the foreach tasks in parallel at once. This causes my GPU/CPU to run out of memory, instantly failing the step.

Is there a way to tell Metaflow to run these tasks linearly / one at a time instead, or is there another workaround?

Thanks


Solution

  • @BBQuercus You can limit parallelization by using the --max-workers flag.

    Currently, we run no more than 16 tasks in parallel, and you can override that with, for example, python myflow.py run --max-workers 32.
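
    For the one-at-a-time behaviour asked about here, the same flag should also work in the other direction: setting it to 1 limits the local scheduler to a single concurrent task, e.g.

        python myflow.py run --max-workers 1

    This keeps the foreach fan-out in the flow but runs only one train_model task at a time, so a single model occupies the GPU/CPU at any given moment.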