Tags: scikit-learn, lightgbm

TypeError: booster must be Booster or LGBMModel


I have a LightGBMClassifier and want to plot its feature importances. For that, I used the code below:

import lightgbm as lgb

lgb.plot_importance(model, figsize=(8,6))

but I get this error:

TypeError: booster must be Booster or LGBMModel.

Can someone help please? I do not understand this error.

And here's how I built the model:

lgbmc_classifier = LGBMClassifier(
    boosting_type='goss',
    colsample_bytree=0.17,
    lambda_l1=69.12,
    lambda_l2=0.0001,
    learning_rate=0.09,
    max_bin=512,
    min_child_samples=3,
    n_estimators=7782,
    num_leaves=10,
    subsample=0.26,
    random_state=356,
    importance_type='gain',
)

model = Pipeline([
    ("preprocessor", preprocessor),
    ("standardizer", standardizer),
    ("classifier", lgbmc_classifier),
])

#model



model.fit(X_train, y_train)

##################################################

For data preprocessing in the model pipeline, I also use:

for col in ["Time_Taken_hours", "Date"]:
    timestamp_transformer = TimestampTransformer()
    ohe_transformer = ColumnTransformer(
        [("ohe", OneHotEncoder(sparse=False, handle_unknown="ignore"), [timestamp_transformer.HOUR_COLUMN_INDEX])],
        remainder="passthrough")
    timestamp_preprocessor = Pipeline([
        ("extractor", timestamp_transformer),
        ("onehot_encoder", ohe_transformer)
    ])
    transformers.append((f"timestamp_{col}", timestamp_preprocessor, [col]))


from sklearn.compose import ColumnTransformer

preprocessor = ColumnTransformer(transformers, remainder="passthrough", sparse_threshold=0)

###################################################

standardizer = StandardScaler()

Solution

  • According to your error, model is not of the type that lgb.plot_importance expects, namely Booster or LGBMModel.

    This is kind of clear, since you define model by

    model = Pipeline([
        ("preprocessor", preprocessor),
        ("standardizer", standardizer),
        ("classifier", lgbmc_classifier),
    ])
    

    The Pipeline constructor returns an object of class Pipeline, not a Booster or LGBMModel.
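
    You can confirm this quickly; a minimal check, assuming model is the Pipeline you fitted above:

    # model is assumed to be your fitted Pipeline
    print(type(model))  # <class 'sklearn.pipeline.Pipeline'> -- not a Booster or LGBMModel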

    But you need an LGBMModel or a Booster. So how do you get one? Well, you gave the classifier the step name classifier, so you can access it with model["classifier"]. This is what you can use:

    import lightgbm as lgb
    from sklearn.pipeline import Pipeline
    from sklearn.datasets import load_iris
    
    X_train, y_train = load_iris(return_X_y=True)
    
    lgbmc_classifier = lgb.LGBMClassifier(
        boosting_type='goss',
        learning_rate=0.01,
        min_child_samples=10,  
        n_estimators=100,
        num_leaves=10,
        random_state=1,
        importance_type='gain',
    )
    lgbmc_classifier.fit(X_train, y_train)
    
    model = Pipeline([
        ("classifier", lgbmc_classifier)
    ])
    
    lgb.plot_importance(model['classifier'], figsize=(4,2))
    

    returns

    [image: feature importance bar chart produced by lgb.plot_importance]

    Note that I have tweaked your regularization parameters a bit in order to get some results on the iris dataset. Also, I removed the two other Pipeline elements, as I don't know how you use them, but that doesn't matter for the result in this case.
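
    If you do keep the full pipeline, the same step lookup works after fitting. A minimal sketch, assuming preprocessor, standardizer and your training data are defined as in your question; note that the importances will then refer to the transformed feature columns:

    # Sketch: plot importances from the classifier step of the full, fitted pipeline
    model = Pipeline([
        ("preprocessor", preprocessor),
        ("standardizer", standardizer),
        ("classifier", lgbmc_classifier),
    ])
    model.fit(X_train, y_train)
    lgb.plot_importance(model.named_steps["classifier"], figsize=(8, 6))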

    You could also just get the importance values directly and match them with the feature columns, e.g. using Plotly:

    import pandas as pd
    import plotly.express as px
    # X_train from load_iris(return_X_y=True) is a NumPy array, so take the
    # feature names from the fitted classifier rather than X_train.columns
    importances = pd.DataFrame({"feature_names": lgbmc_classifier.feature_name_,
                                "values": lgbmc_classifier.feature_importances_})
    fig = px.bar(importances, x="feature_names", y="values", width=400, height=400)
    fig.show()
    

    [image: Plotly bar chart of feature importances]

    Or maybe you wanted a pie chart instead:

    fig = px.pie(importances, names="feature_names", values="values", width=400, height=400)
    fig.show()
    

    [image: Plotly pie chart of feature importances]