python azure databricks azure-databricks fb-hydra

How can I pass non-Hydra runtime command line arguments to Hydra-based code?


I am using Hydra with Databricks workflows. My code needs to accept runtime command line parameters (not Hydra config parameters) from the workflow.

I want to use argparse to retrieve these runtime parameters. They do not need to be handled together with the Hydra decorator, but it is vital that they are available in my script.

My script (set_task_values.py) looks something like this:

import hydra
from omegaconf import DictConfig, OmegaConf
from hydra.core.hydra_config import HydraConfig
import argparse

@hydra.main(version_base=None, config_path="conf_tasks", config_name="config_dummy") #Hydra Decorator
def my_app(cfg : DictConfig) -> None:
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--task_values", type=str, required=True)
    args, unknown = parser.parse_known_args()
    print("Task Values", "->", args.task_values)
    dbutils.jobs.taskValues.set(key="task_values", value=args.task_values)


if __name__ == "__main__":
    my_app()

My config (config_dummy.yaml) looks something like this, although it's not really relevant, since there is no issue with reading and retrieving the Hydra config.

operand:
  add
batch_size:
  64
learning_rate: 
  0.01
creation_did:
  47
custom:
  email: [email protected]
  exp_title: Exp_Tasks

hydra:
    run:
      dir: /dbfs/some_path/${custom.email}/${now:%Y-%m-%d}/${custom.exp_title}

This is what my Databricks workflow looks like:

Workflow details

Quick reference: a Databricks workflow task has a "parameters" option, where you can enter command line parameters to be passed to the task/script.

My command line parameter is this:
["--task_values","{'T1':0,'T2':1}"]

What I am trying to achieve is to pass a set of external command line parameters, as shown in the "parameters" section above. It includes a command line variable called task_values, which I need in my script.

But when I run this job/workflow, which internally runs my script (set_task_values.py) and expects to retrieve the runtime command line parameter (task_values), I get the following error:

Error Details

    usage: set_task_values.py [--help] [--hydra-help] [--version]
                              [--cfg {job,hydra,all}] [--resolve]
                              [--package PACKAGE] [--run] [--multirun]
                              [--shell-completion] [--config-path CONFIG_PATH]
                              [--config-name CONFIG_NAME]
                              [--config-dir CONFIG_DIR]
                              [--experimental-rerun EXPERIMENTAL_RERUN]
                              [--info [{all,config,defaults,defaults-tree,plugins,searchpath}]]
                              [overrides [overrides ...]]
    set_task_values.py: error: unrecognized arguments: --task_values
    /databricks/python/lib/python3.8/site-packages/IPython/core/interactiveshell.py:3445: UserWarning: To exit: use 'exit', 'quit', or Ctrl-D.
      warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)

Can someone help with how I can use runtime command line parameters with hydra?


Solution

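  • What you are trying to do, passing arbitrary command line flags alongside Hydra, is not supported. Hydra parses the command line itself and rejects flags it does not recognize, which is why --task_values fails.

    See this Stack Overflow answer for more information.

    However, Hydra lets you override config values from the command line.

    Add task_values to the config_dummy.yaml file as shown below.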

    operand:
      add
    batch_size:
      64
    learning_rate: 
      0.01
    creation_did:
      47
    custom:
      email: [email protected]
      exp_title: Exp_Tasks
    
    task_values:
      random_txt
    
    hydra:
        run:
          dir: /dbfs/some_path/${custom.email}/${now:%Y-%m-%d}/${custom.exp_title}
    

    Then pass the parameter in the task as a Hydra override, like below.

    ["task_values={T1: 0, T2: 1}"]
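If the values come from Python, the parameter list for the task can be generated rather than typed by hand. The following is only a sketch: hydra_dict_override is a hypothetical helper name, and it assumes simple keys and values (plain identifiers and numbers) that need no quoting or escaping in a Hydra override.

```python
import json


def hydra_dict_override(key, values):
    """Format a flat dict as a Hydra override string, e.g. key={a: 1, b: 2}.

    Hypothetical helper: assumes keys and values are simple (identifiers,
    numbers) and need no quoting or escaping.
    """
    items = ", ".join(f"{k}: {v}" for k, v in values.items())
    return f"{key}={{{items}}}"


def databricks_parameters(key, values):
    """Return the JSON array to paste into the task's "parameters" field."""
    return json.dumps([hydra_dict_override(key, values)])
```

Calling databricks_parameters("task_values", {"T1": 0, "T2": 1}) produces a one-element JSON array with the override string in the same form as the parameter shown above.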

    Output:

    operand: add
    batch_size: 64
    learning_rate: 0.01
    creation_did: 47
    custom:
      email: [email protected]
      exp_title: Exp_Tasks
    task_values:
      T1: 0
      T2: 1
    

    Refer to the Hydra documentation for more about overrides.
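If the flag-style parameter has to stay as it is in the workflow, one possible workaround is to parse the extra flags with argparse before the Hydra-decorated function runs, then remove them from sys.argv so that Hydra only sees what is left. This is a minimal sketch using only the standard library, not an officially supported Hydra feature. split_cli_args and strip_known_args are hypothetical helper names, and it assumes Hydra reads its arguments from sys.argv at the time my_app() is called.

```python
import argparse
import sys


def split_cli_args(argv):
    """Split argv into (parsed extra flags, remaining args to leave for Hydra)."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--task_values", type=str, required=True)
    # parse_known_args returns the parsed namespace plus every argument
    # it did not recognize, in their original order.
    args, remaining = parser.parse_known_args(argv)
    return args, remaining


def strip_known_args():
    """Parse the extra flags and leave only the other arguments in sys.argv."""
    args, remaining = split_cli_args(sys.argv[1:])
    # Assumption: Hydra parses sys.argv when the decorated function is called,
    # so it will no longer see --task_values after this assignment.
    sys.argv = [sys.argv[0]] + remaining
    return args
```

In the script from the question, strip_known_args() would be called in the if __name__ == "__main__": block just before my_app(), and the returned args.task_values kept in a variable that my_app can read.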