python, command-line-interface, kedro

Override nested parameters using kedro run CLI command


I am using nested parameters in my parameters.yml and would like to override these using runtime parameters for the kedro run CLI command:

train:
    batch_size: 32
    train_ratio: 0.9
    epochs: 5

The following doesn't seem to work:

kedro run --params train.batch_size:64,train.epochs:50

The values for epochs and batch_size are still those from parameters.yml. How can I override these parameters from the CLI?


Solution

  • The additional parameters are passed into the KedroContext object via load_context(Path.cwd(), env=env, extra_params=params) in kedro_cli.py. There you can see a (protected) callback function called _split_params, which splits the key-value pairs on ":".

    _split_params first splits the string on commas (to get the individual params) and then splits each item on a colon. It does not interpret the dots in the keys, so adding a print/logging statement for what gets passed into extra_params will show you something like:

    {'train.batch_size': 64, 'train.epochs': 50}
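
To make the failure concrete, here is a minimal sketch of that parsing step. The split_params helper is a simplified stand-in written for illustration, not kedro's actual _split_params, and the shallow dict.update merge is an assumption about how the extra params get combined with parameters.yml.

```python
from typing import Any, Dict


def split_params(raw: str) -> Dict[str, Any]:
    """Hypothetical stand-in: split on commas, then on the first colon."""
    result: Dict[str, Any] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(":")
        value = value.strip()
        # Turn numeric-looking values into int or float, else keep the string.
        try:
            parsed: Any = int(value)
        except ValueError:
            try:
                parsed = float(value)
            except ValueError:
                parsed = value
        result[key.strip()] = parsed
    return result


extra_params = split_params("train.batch_size:64,train.epochs:50")

# Contents of parameters.yml from the question.
params = {"train": {"batch_size": 32, "train_ratio": 0.9, "epochs": 5}}

# Assumption: extra params are merged with a shallow update.
merged = dict(params)
merged.update(extra_params)

print(extra_params)                   # {'train.batch_size': 64, 'train.epochs': 50}
print(merged["train"]["batch_size"])  # 32
```

The dotted key ends up as a new top-level entry, while the nested train block keeps its original values.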
    

    I think you have a couple of options:

    1. Un-nest the params. Top-level keys are overridden correctly.
    2. Add custom logic to _split_params in kedro_cli.py that builds a nested dictionary by splitting the keys on "." before the result is passed into load_context. I think you can reuse a lot of the existing logic.
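
A rough sketch of what option 2 could look like. Both helper names (nest_dotted_keys, deep_update) are hypothetical and not part of kedro. The recursive merge is included because, if the extra params are applied with a shallow update (an assumption), a nested override would replace the whole train block and drop train_ratio.

```python
from copy import deepcopy
from typing import Any, Dict


def nest_dotted_keys(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Turn {'a.b': 1} into {'a': {'b': 1}} (hypothetical helper)."""
    nested: Dict[str, Any] = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create intermediate dictionaries on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


def deep_update(base: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge overrides into a copy of base (hypothetical helper)."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_update(merged[key], value)
        else:
            merged[key] = value
    return merged


params = {"train": {"batch_size": 32, "train_ratio": 0.9, "epochs": 5}}
overrides = nest_dotted_keys({"train.batch_size": 64, "train.epochs": 50})

print(overrides)
# {'train': {'batch_size': 64, 'epochs': 50}}
print(deep_update(params, overrides))
# {'train': {'batch_size': 64, 'train_ratio': 0.9, 'epochs': 50}}
```

The sketch does not handle conflicting input such as a:1,a.b:2, where the same name is used both as a value and as a parent key.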

    NB: This was tested on kedro==0.16.2.

    NB2: kedro exposes nested params through the _get_feed_dict and _add_param_to_feed_dict functions in context.py. Specifically, _add_param_to_feed_dict is a recursive function that unpacks a dictionary and builds each nested name by joining the parent name and the child key with a dot ("{}.{}".format(...)). IMO you can reuse the logic from there.
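
For reference, the recursive unpacking described in NB2 can be sketched roughly as follows. This is an illustration modelled loosely on that description, not a copy of kedro's code; build_feed_dict and add_param are hypothetical names, and the "params:" prefix mirrors the way nested parameters are referenced in node inputs (e.g. params:train.batch_size).

```python
from typing import Any, Dict


def build_feed_dict(params: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten nested params into dotted 'params:...' entries (illustrative)."""
    feed_dict: Dict[str, Any] = {"parameters": params}

    def add_param(name: str, value: Any) -> None:
        feed_dict["params:{}".format(name)] = value
        if isinstance(value, dict):
            # Recurse so every nested key gets its own dotted entry.
            for key, val in value.items():
                add_param("{}.{}".format(name, key), val)

    for name, value in params.items():
        add_param(name, value)
    return feed_dict


feed = build_feed_dict({"train": {"batch_size": 32, "train_ratio": 0.9, "epochs": 5}})
print(sorted(feed))
# ['parameters', 'params:train', 'params:train.batch_size',
#  'params:train.epochs', 'params:train.train_ratio']
```

Option 2 above is the inverse of this operation: it goes from dotted names back to a nested dictionary.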