Pipeline can't find nodes in kedro


I was following the pipelines tutorial, created all the needed files, and started Kedro with kedro run --node=preprocessing_data, but got stuck on this error message:

ValueError: Pipeline does not contain nodes named ['preprocessing_data'].

If I run kedro run without the --node parameter, I get:

kedro.context.context.KedroContextError: Pipeline contains no nodes

Contents of the files:

src/project/pipelines/data_engineering/nodes.py
def preprocess_data(data: SparkDataSet) -> None:
    print(data)
    return

src/project/pipelines/data_engineering/pipeline.py
def create_pipeline(**kwargs):
    return Pipeline(
        [
            node(
                func=preprocess_data,
                inputs="data",
                outputs="preprocessed_data",
                name="preprocessing_data",
            ),
        ]
    )

src/project/pipeline.py
def create_pipelines(**kwargs) -> Dict[str, Pipeline]:
    de_pipeline = de.create_pipeline()
    return {
        "de": de_pipeline,
        "__default__": Pipeline([])
    }

Solution

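  • I think you need to register the pipeline under __default__. In your create_pipelines, __default__ is an empty Pipeline([]), and that is the pipeline kedro run uses when no pipeline is specified, so there are no nodes for --node to match. For example: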

    def create_pipelines(**kwargs) -> Dict[str, Pipeline]:
        de_pipeline = de.create_pipeline()
        return {
            "de": de_pipeline,
            "__default__": de_pipeline
        }
    

    Then kedro run --node=preprocessing_data works for me.
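
To illustrate why the empty __default__ entry leads to both errors from the question, here is a minimal sketch that does not use Kedro at all. The select_nodes function and the dict-of-lists registry are hypothetical stand-ins, not Kedro's API or implementation. The sketch only assumes that a run without an explicit pipeline falls back to __default__, and it raises a plain ValueError in both cases rather than Kedro's own exception types.

```python
from typing import Dict, List, Optional


def select_nodes(
    registry: Dict[str, List[str]],
    node_names: Optional[List[str]] = None,
    pipeline_name: str = "__default__",
) -> List[str]:
    """Hypothetical model of how a run picks its nodes (not Kedro's code)."""
    nodes = registry[pipeline_name]
    if node_names:
        # Requested nodes are looked up only in the selected pipeline.
        missing = [name for name in node_names if name not in nodes]
        if missing:
            raise ValueError(f"Pipeline does not contain nodes named {missing}.")
        return [name for name in nodes if name in node_names]
    if not nodes:
        raise ValueError("Pipeline contains no nodes")
    return nodes


de_nodes = ["preprocessing_data"]

# Registry as in the question: "__default__" is empty, so the lookup fails
# even though the node exists under "de".
broken = {"de": de_nodes, "__default__": []}
try:
    select_nodes(broken, ["preprocessing_data"])
    broken_error = None
except ValueError as exc:
    broken_error = str(exc)

# Registry as in the answer: "__default__" holds the same nodes as "de".
fixed = {"de": de_nodes, "__default__": de_nodes}
selected = select_nodes(fixed, ["preprocessing_data"])
```

In this model the node is defined under "de" the whole time. The only thing that changes between the failing and the working case is what __default__ contains.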