How to run a pipeline except for a few nodes?

I want to run a pipeline for different files, but some of them don't need all of the defined nodes. How can I skip those nodes?


  • To filter out a few nodes of a pipeline, you can filter the pipeline's node list from inside Python. My favorite way is to use a list comprehension.

    by name

    nodes_to_run = [node for node in pipeline.nodes if 'dont_run_me' not in node.name]
    run(nodes_to_run, io)

    by tag

    nodes_to_run = [node for node in pipeline.nodes if 'dont_run_tag' not in node.tags]
    run(nodes_to_run, io)

    It's possible to filter by any attribute tied to the pipeline node (name, inputs, outputs, short_name, tags).

    If you need to run your pipeline this way in production or from the command line, you can either tag the nodes in your pipeline and run by tag, or add a custom click.option to your run function and apply this filter when the flag is True.

    Note: This assumes that you have your pipeline loaded into memory as pipeline and your catalog loaded as io.
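
The two filters above can be combined into one helper. This is only a minimal sketch: StubNode is a hypothetical stand-in with just name and tags, so the snippet runs without a real pipeline, and filter_nodes is an illustrative name, not part of any library. With a real pipeline you would pass pipeline.nodes instead of the stub list.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class StubNode:
    """Hypothetical stand-in for a pipeline node (only name and tags are modelled)."""
    name: str
    tags: Set[str] = field(default_factory=set)


def filter_nodes(nodes: Iterable[StubNode],
                 skip_names: Iterable[str] = (),
                 skip_tags: Iterable[str] = ()) -> List[StubNode]:
    """Keep nodes whose name contains none of skip_names and that carry none of skip_tags."""
    name_fragments = list(skip_names)
    banned_tags = set(skip_tags)
    return [
        node for node in nodes
        # drop the node if any unwanted fragment appears in its name
        if not any(fragment in node.name for fragment in name_fragments)
        # drop the node if it shares at least one tag with the banned set
        and not (banned_tags & set(node.tags))
    ]


# Assumed example data; real code would use pipeline.nodes here.
nodes = [
    StubNode("load_data", {"io"}),
    StubNode("dont_run_me_plot", {"report"}),
    StubNode("train_model", {"dont_run_tag"}),
    StubNode("save_model", {"io"}),
]

nodes_to_run = filter_nodes(nodes, skip_names=["dont_run_me"], skip_tags=["dont_run_tag"])
print([node.name for node in nodes_to_run])  # ['load_data', 'save_model']
```

Keeping the filter in one function also makes it easy to call it only when a command-line flag is set, and to pass the unfiltered node list otherwise.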