python xgboost mlflow

Problem loading an XGBoost model from the MLflow registry


I create an XGBoost classifier:

   import xgboost as xgb

   xg_reg = xgb.XGBClassifier(objective='reg:squarederror', learning_rate=0.1,
                              max_depth=20, alpha=10, n_estimators=50, use_label_encoder=False)

After training the model, I log it to the MLflow registry:

   mlflow.xgboost.log_model(
       xgb_model=xg_reg,
       artifact_path="xgboost-models",
       registered_model_name="xgb-regression-model",
   )

In the remote UI, I can see the logged model:

artifact_path: xgboost-models
flavors:
  python_function:
    data: model.xgb
    env: conda.yaml
    loader_module: mlflow.xgboost
    python_version: 3.7.9
  xgboost:
    code: null
    data: model.xgb
    model_class: xgboost.sklearn.XGBClassifier
    xgb_version: 1.5.2
mlflow_version: 1.25.1
model_uuid: 5fd42554cf184d8d96afae34dbb96de2
run_id: acdccd9f610b4c278b624fca718f76b4
utc_time_created: '2022-05-17 17:54:53.039242'

On the server side, I load the logged model:

   model = mlflow.xgboost.load_model(model_uri=model_path)

This loads without errors, but the model type is

<xgboost.core.Booster object at 0x00000234DBE61D00>

and the predictions are numpy.float32 (e.g. 0.5) instead of the int64 class labels (e.g. 0, 1) that the original model returns.

Any idea what could be wrong? Many thanks!


Solution

  • It turns out this was caused by a version mismatch in mlflow. The model was logged to the registry with the newest version of mlflow but loaded on the server with an older one. After updating mlflow on the server, the model loads correctly. :)
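
For anyone who hits the same thing, here is a rough sketch of how the mismatch could be spotted before loading. It reads the mlflow_version field from the MLmodel metadata shown in the question and compares it with the version installed on the loading side. The helper names (logged_mlflow_version, version_tuple, loader_is_older) are hypothetical and not part of mlflow, and the "1.20.0" server version is made up for illustration. The comparison only looks at the numeric major.minor.patch parts, so treat it as a sanity check rather than a full version parser.

```python
import re


def logged_mlflow_version(mlmodel_text):
    """Return the mlflow_version recorded in MLmodel text, or None if absent."""
    match = re.search(
        r"^mlflow_version:\s*['\"]?([\w.]+)['\"]?\s*$",
        mlmodel_text,
        re.MULTILINE,
    )
    return match.group(1) if match else None


def version_tuple(version):
    """Turn '1.25.1' into (1, 25, 1), keeping at most three numeric parts."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def loader_is_older(mlmodel_text, installed_version):
    """True if the loading side runs an older mlflow than the one that logged the model."""
    logged = logged_mlflow_version(mlmodel_text)
    if logged is None:
        raise ValueError("no mlflow_version field found in the MLmodel text")
    return version_tuple(installed_version) < version_tuple(logged)


# Excerpt of the MLmodel metadata from the question.
MLMODEL_EXCERPT = """\
artifact_path: xgboost-models
mlflow_version: 1.25.1
model_uuid: 5fd42554cf184d8d96afae34dbb96de2
"""

if __name__ == "__main__":
    # "1.20.0" is an assumed, older server-side version used only as an example.
    if loader_is_older(MLMODEL_EXCERPT, "1.20.0"):
        print("Server mlflow is older than the version that logged the model.")
```

On the real server you would pass mlflow.__version__ as installed_version and the contents of the model's MLmodel file as mlmodel_text. Getting back a Booster when the metadata says model_class: xgboost.sklearn.XGBClassifier is a hint that this check is worth running.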