Tags: python, databricks, azure-databricks, mlflow, mlops

Log pickle files as part of an MLflow run


I am running an MLflow experiment, and as part of it I would like to log a few artifacts as Python pickle files.

Example: I am trying out different categorical encoders and want to log the encoder objects as pickle files.

Is there a way to achieve this?


Solution

  • There are two functions for this:

    1. log_artifact - logs a local file or directory as an artifact
    2. log_artifacts - logs the contents of a local directory

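    So, once the encoder has been pickled to a local file, logging it is as simple as:

    import mlflow

    with mlflow.start_run():
        # encoder.pickle must already exist on the local disk
        mlflow.log_artifact("encoder.pickle")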
    

    To use that pickled file at prediction time, you will need a custom MLflow model, something like this:

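    import mlflow.pyfunc

    class my_model(mlflow.pyfunc.PythonModel):
        def __init__(self, encoders):
            # The encoders live on the instance, so they are saved with the model.
            self.encoders = encoders

        def predict(self, context, model_input):
            _X = ...  # do encoding using self.encoders.
            # NOTE: self.ctx is not defined in this snippet; it stands for the
            # underlying fitted model, which must be set before predict is called.
            return str(self.ctx.predict([_X])[0])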