logging, data-science, mlflow

How to update a previous run in MLflow?


I would like to update previous runs logged with MLflow, i.e. change or update a parameter value to accommodate a change in the implementation. Typical use cases:

  • Log runs using a parameter A and, much later, log parameters A and B. It would be useful to set parameter B on the earlier runs to its default value.
  • "Specialize" a parameter: implement a model with a boolean flag as a parameter, then change the implementation to take a string instead. The parameter values of the previous runs now need updating so that they stay consistent with the new behavior.
  • Correct a wrong parameter value logged in previous runs.

Discarding the whole experiment is not always an option, since I need to keep the previous runs for statistical purposes. I would also prefer not to create a new experiment just for a single new parameter, so that all runs stay in a single database.

What is the best way to do this?


Solution

  • To add or correct a parameter, metric, or artifact of an existing run, pass run_id (instead of experiment_id) to the mlflow.start_run function:

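    import mlflow

    # Resume the existing run by its ID; no new run is created
    with mlflow.start_run(run_id="your_run_id") as run:
        mlflow.log_param("p1", "your_corrected_value")  # params are write-once: this adds p1 if it is missing
        mlflow.log_metric("m1", 42.0)  # your corrected metric
        mlflow.log_artifact("data_sample.html")  # your corrected artifact file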
    

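    You can add to, correct, or delete an MLflow run at any time after it completes. Get the run_id either from the UI or with mlflow.search_runs. One caveat: MLflow treats parameters as write-once, so log_param can add a key that is missing, but it is expected to raise an MlflowException if the key was already logged with a different value. Metrics and artifacts can be logged again.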

    Source: https://towardsdatascience.com/5-tips-for-mlflow-experiment-tracking-c70ae117b03f
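
As a rough sketch of the first use case in the question (backfilling a default value for parameter B), the code below lists the runs of an experiment with mlflow.search_runs and logs the missing parameter on each one. It is not taken from the article cited above. The helper names missing_param_run_ids and backfill_param, the experiment ID, and the default value are all hypothetical. mlflow is imported inside the function only so that the pure helper can be used without MLflow installed.

```python
from __future__ import annotations


def missing_param_run_ids(params_by_run: dict[str, dict[str, str]], key: str) -> list[str]:
    """Return the IDs of runs whose logged params do not contain `key`."""
    return [run_id for run_id, params in params_by_run.items() if key not in params]


def backfill_param(experiment_id: str, key: str, default: str) -> list[str]:
    """Log `key=default` on every active run of the experiment that lacks `key`.

    Hypothetical helper: assumes the tracking URI is already configured
    and that no other run is currently active in this process.
    """
    import mlflow  # lazy import, see the note above

    # output_format="list" returns Run objects instead of a pandas DataFrame
    runs = mlflow.search_runs(experiment_ids=[experiment_id], output_format="list")
    params_by_run = {run.info.run_id: run.data.params for run in runs}

    targets = missing_param_run_ids(params_by_run, key)
    for run_id in targets:
        # Resume the existing run and add the parameter it is missing
        with mlflow.start_run(run_id=run_id):
            mlflow.log_param(key, default)
    return targets
```

For example, backfill_param("1", "B", "default") would return the IDs of the runs it touched. Runs that already have B are skipped rather than overwritten, since MLflow is expected to reject a changed value for an existing parameter key.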