
How to use PipelineParameter in DatabricksStep (Python)


I've created an AML pipeline with a single DatabricksStep. I need to pass a parameter to the Databricks notebook when I run the published pipeline.

When I run the published pipeline, the Databricks step always takes the default value of the PipelineParameter, no matter what value I set when I submit the pipeline.

Here is the code:

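from azureml.pipeline.core import PipelineParameter
from azureml.pipeline.steps import DatabricksStep

# Pipeline parameter that should be overridable at submission time
start_date_param = PipelineParameter(name="StartDate", default_value='2022-01-19')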

# Define data ingestion step
data_loading_step = DatabricksStep(
        name="Data Loading",
        existing_cluster_id=db_cluster_id,
        notebook_path=data_loading_path,
        run_name="Loading raw data",
        notebook_params={
            'StartDate': start_date_param,
        },
        compute_target=dbricks_compute,
        instance_pool_id=instance_pool_id,
        num_workers=num_workers,
        allow_reuse=False
    )

And this is the Databricks notebook:

dbutils.widgets.text("StartDate", "", "StartDate(YYYY-MM-DD)")

The default value of StartDate is 2022-01-19. Even though I set the StartDate parameter to '2021-01-19' when I submit the pipeline, the Databricks notebook still receives 2022-01-19 as StartDate.

What am I doing wrong?

Thanks for any help,

G


Solution

  • I found the solution to the problem, and I hope it helps someone else. It works if you write both the parameter name and the widget name entirely in lowercase.

    start_date_param = PipelineParameter(name="start_date", default_value='2022-01-19')
    
    # Define data ingestion step
    data_loading_step = DatabricksStep(
            name="Data Loading",
            existing_cluster_id=db_cluster_id,
            notebook_path=data_loading_path,
            run_name="Loading raw data",
            notebook_params={
                'start_date': start_date_param,
            },
            compute_target=dbricks_compute,
            instance_pool_id=instance_pool_id,
            num_workers=num_workers,
            allow_reuse=False
        )
    
    # In the Databricks notebook: the widget name matches the lowercase parameter name
    dbutils.widgets.text("start_date", "", "start_date(YYYY-MM-DD)")
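
Since the fix above depends only on the spelling of the names, a small guard can catch mixed-case names before the step is built. This is a minimal sketch in plain Python, not part of the Azure ML SDK: the helper name find_non_lowercase_names is hypothetical, and the string values stand in for the PipelineParameter objects used in the real notebook_params dictionary.

```python
def find_non_lowercase_names(notebook_params):
    """Return the parameter names that contain uppercase characters."""
    return [name for name in notebook_params if name != name.lower()]


# Hypothetical dictionaries; the string values stand in for PipelineParameter objects.
original_params = {"StartDate": "2022-01-19"}
fixed_params = {"start_date": "2022-01-19"}

# Names reported here are the ones to rename before creating the DatabricksStep.
print(find_non_lowercase_names(original_params))
print(find_non_lowercase_names(fixed_params))
```

Running the check on the dictionary passed as notebook_params, and using the same lowercase string in dbutils.widgets.text, keeps the pipeline side and the notebook side consistent.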