Tags: azure, airflow, databricks, databricks-workflows

Pass runtime parameters to two dependent tasks in Azure Databricks


I am triggering a Databricks Workflows job from Airflow and passing runtime parameters to it. The first task is a Python wheel task, and the second, dependent task is a notebook task.


Below is the job's JSON definition in Databricks.

 "tasks": [
      {
        "task_key": "Job1",
        "run_if": "ALL_SUCCESS",
        "python_wheel_task": {
          "package_name": "test",
          "entry_point": "test",
          "named_parameters": {
            "parm1": "1",
            "parm2": "1",
            "parm3": "5",
            "parm4": "devl",
            "parm5": "yyyymm"
          }
        },
        "job_cluster_key": "Job_cluster",
        "libraries": [
          {
            "whl": "dbfs path"
          }
        ]
      },
      {
        "task_key": "Job2",
        "depends_on": [
          {
            "task_key": "Job1"
          }
        ],
        "run_if": "ALL_SUCCESS",
        "notebook_task": {
          "notebook_path": "notebook path",
          "base_parameters": {
            "parm4": "devl",
            "parm5": "yyyymm"
          },
          "source": "WORKSPACE"
        }
      }
    ]

I want parm4 and parm5 to be passed to Job2 as run parameters. Instead, the values show up as run parameters in Job1, but they are not reflected when Job2 is triggered. In my Airflow DAG I use the structure below to pass the values:

"job_id": job_id,
        "python_named_params": {
            "parm1": 1,
            "parm2": 2,
            "parm3": 3,
            "parm4": "devl",
            "parm5": "yyyymm",
        }

Because Job1 is a Python wheel task, I am using python_named_params.

I hope this explains the problem. I can provide more details if anyone needs further information. Thank you in advance.

I have tried running the job from Airflow, but it does not pass the run parameters to the dependent task.


Solution

  • You need to pass the parameters according to the type of each task.

    As per the documentation, use python_named_params for a Python wheel task and notebook_params for a notebook task.

    Below is my Notebook task.

    [Screenshot: notebook task configuration]

    I am running the job with the following JSON body.

    {
      "job_id": 872837015160716,
      "python_named_params": {
        "notebook_param1": "param1go"
      },
      "notebook_params": {
        "notebook_param2": "param2go"
      }
    }
    

    Output:

    [Screenshots: output of the two task runs]

    As you can see, only the notebook_params values are picked up, and they are displayed in task2. Since the job has no Python wheel task, the value sent in python_named_params is not picked up by task1, which prints its default value.
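
Applied to the job in the question, the same idea can be sketched as below. This is a minimal illustration rather than a definitive implementation: build_run_now_payload is a hypothetical helper, the job id 123 is a placeholder, and converting every value to a string rests on the assumption that the run-now API expects string values in both parameter maps.

```python
from typing import Any, Dict, Mapping


def build_run_now_payload(
    job_id: int,
    wheel_params: Mapping[str, Any],
    notebook_params: Mapping[str, Any],
) -> Dict[str, Any]:
    """Build a run-now style request body (hypothetical helper).

    Each task type reads its own parameter map, so values that both
    tasks need (parm4, parm5) have to appear in both maps.
    """
    return {
        "job_id": job_id,
        # Read by the Python wheel task (Job1 in the question).
        "python_named_params": {k: str(v) for k, v in wheel_params.items()},
        # Read by the notebook task (Job2 in the question).
        "notebook_params": {k: str(v) for k, v in notebook_params.items()},
    }


payload = build_run_now_payload(
    job_id=123,  # placeholder, not a real job id
    wheel_params={
        "parm1": 1,
        "parm2": 2,
        "parm3": 3,
        "parm4": "devl",
        "parm5": "yyyymm",
    },
    notebook_params={"parm4": "devl", "parm5": "yyyymm"},
)
```

In Airflow, a dict like this could be passed as the json argument of DatabricksRunNowOperator, or the two maps could be supplied through the operator's python_named_params and notebook_params arguments, depending on the provider version installed.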