pytorch, azure-machine-learning-service, mlflow

How to save or log pytorch model using MLflow?


I am in main.py at the root directory, calling the model script to train the model. The directory looks like this:

[image: directory layout]

After training the model, I am planning to save and log the PyTorch model using MLflow. Here's the code:

    # Registering the model to the workspace
    mlflow.pytorch.log_model(
        pytorch_model=model,
        registered_model_name="use-case1-model",
        artifact_path="use-case1-model",
        input_example=df[['Title', 'Attributes']],
        conda_env=os.path.join("./dependencies", "conda.yaml"),
        code_paths="./models"
    )

    # Saving the model to a file
    mlflow.pytorch.save_model(
        pytorch_model=model,
        conda_env=os.path.join("./dependencies", "conda.yaml"),
        input_example=df[['Title', 'Attributes']],
        path=os.path.join(args.model, "use-case1-model"),
        code_paths="./models"
    )

But I am getting an error while saving the code paths, saying the directory is not found.

Question 1: Do I need to pass the code_paths and extra_files parameters in my case?

Question 2: What's the right way to specify the code_paths directory?

https://mlflow.org/docs/latest/python_api/mlflow.pytorch.html


Solution

  • As per the function definition, the code_paths parameter takes a list of local filesystem paths to Python file dependencies (or directories containing file dependencies).

    If your model has such dependencies, you need to provide their paths as a list to code_paths.

    The "directory not found" error can be resolved by using absolute paths, as below.

    import os

    # Build absolute paths from the current working directory
    code_pth = os.path.join(os.path.abspath(""), "media", "model")
    conda_env = os.path.join(os.path.abspath(""), "dependencies")
    print(conda_env)
    print(code_pth)
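
One caveat: os.path.abspath("") resolves against the current working directory, so it only helps when the script is launched from the project root. Below is a minimal sketch that anchors the paths to a known base directory and fails early with a clear message. It assumes the dependency folders live under that base directory, and resolve_code_paths is a hypothetical helper, not part of MLflow.

```python
from pathlib import Path


def resolve_code_paths(base_dir, relative_paths):
    """Return absolute path strings under base_dir, raising if any is missing.

    Hypothetical helper for illustration; not an MLflow function.
    """
    base = Path(base_dir).resolve()
    resolved = []
    for rel in relative_paths:
        candidate = (base / rel).resolve()
        # Fail here, with the full path in the message, rather than inside MLflow
        if not candidate.exists():
            raise FileNotFoundError(f"code path not found: {candidate}")
        resolved.append(str(candidate))
    return resolved
```

From main.py this could be called as resolve_code_paths(Path(__file__).parent, ["models"]), and the returned list passed as code_paths.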
    


    I used an sklearn model to demonstrate logging and saving.

    mlflow.sklearn.log_model(
        sk_model=clf,
        registered_model_name=registered_model_name,
        artifact_path=registered_model_name,
        code_paths=[code_pth],
        conda_env=os.path.join(conda_env, "conda.yaml")
    )
    


    mlflow.sklearn.save_model(
        sk_model=clf,
        path=os.path.join(registered_model_name, "trained_model"),
        code_paths=[code_pth],
        conda_env=os.path.join(conda_env, "conda.yaml")
    )
    

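
Applied to the PyTorch calls from the question, the same approach might look like the sketch below. It assumes the question's layout (a models directory and dependencies/conda.yaml under the project root). build_model_kwargs and log_pytorch_model are hypothetical names, and the keyword arguments are the ones already used in the question, so check them against your installed MLflow version.

```python
import os


def build_model_kwargs(project_root, model):
    """Assemble arguments shared by mlflow.pytorch.log_model and save_model.

    Hypothetical helper; directory names follow the layout in the question.
    """
    root = os.path.abspath(project_root)
    return {
        "pytorch_model": model,
        "conda_env": os.path.join(root, "dependencies", "conda.yaml"),
        # code_paths is a list of paths, not a bare string
        "code_paths": [os.path.join(root, "models")],
    }


def log_pytorch_model(project_root, model, name):
    """Log and register the model under `name` using absolute paths."""
    import mlflow.pytorch  # imported lazily; requires mlflow and torch

    kwargs = build_model_kwargs(project_root, model)
    mlflow.pytorch.log_model(
        artifact_path=name,
        registered_model_name=name,
        **kwargs,
    )
```

From main.py, log_pytorch_model(os.path.dirname(os.path.abspath(__file__)), model, "use-case1-model") would resolve both paths relative to the script rather than the working directory. Whether code_paths is needed at all depends on whether loading the model requires your own modules.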