I was following the pipelines tutorial, created all the needed files, and started Kedro with kedro run --node=preprocessing_data,
but got stuck with this error message:
ValueError: Pipeline does not contain nodes named ['preprocessing_data'].
If I run kedro run without the node
parameter, I get:
kedro.context.context.KedroContextError: Pipeline contains no nodes
Contents of the files:
src/project/pipelines/data_engineering/nodes.py
def preprocess_data(data: SparkDataSet) -> None:
    print(data)
    return
src/project/pipelines/data_engineering/pipeline.py
def create_pipeline(**kwargs):
    return Pipeline(
        [
            node(
                func=preprocess_data,
                inputs="data",
                outputs="preprocessed_data",
                name="preprocessing_data",
            ),
        ]
    )
src/project/pipeline.py
def create_pipelines(**kwargs) -> Dict[str, Pipeline]:
    de_pipeline = de.create_pipeline()
    return {
        "de": de_pipeline,
        "__default__": Pipeline([]),
    }
I think you need to put the pipeline into __default__, e.g.:
def create_pipelines(**kwargs) -> Dict[str, Pipeline]:
    de_pipeline = de.create_pipeline()
    return {
        "de": de_pipeline,
        "__default__": de_pipeline,
    }
Then kedro run --node=preprocessing_data
works for me.
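For context, the failure mode can be sketched without Kedro at all: kedro run --node=... filters the nodes of the __default__ pipeline, and Pipeline([]) has nothing to filter, so the requested name can never be found. A minimal plain-Python sketch of that lookup (the helper filter_nodes is illustrative, not Kedro's actual internals):

```python
def filter_nodes(pipeline_nodes, requested):
    # Mimics the node-name filtering step: every requested name must
    # exist in the pipeline, otherwise raise the same style of error.
    missing = [n for n in requested if n not in pipeline_nodes]
    if missing:
        raise ValueError(f"Pipeline does not contain nodes named {missing}.")
    return [n for n in pipeline_nodes if n in requested]

# "__default__": Pipeline([]) corresponds to an empty node list here.
default_pipeline = []
try:
    filter_nodes(default_pipeline, ["preprocessing_data"])
except ValueError as e:
    print(e)  # Pipeline does not contain nodes named ['preprocessing_data'].
```

This is why registering de_pipeline under "__default__" (or selecting the pipeline explicitly) fixes both errors at once: the node filter then has actual nodes to match against.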