python scikit-learn xgboost

Feature Importance with XGBClassifier


Hopefully I'm reading this wrong, but the XGBoost library documentation mentions extracting feature importances through the feature_importances_ attribute, much like sklearn's random forest.

However, for some reason, I keep getting this error: AttributeError: 'XGBClassifier' object has no attribute 'feature_importances_'

My code snippet is below:

from sklearn import datasets
import xgboost as xg

iris = datasets.load_iris()
X = iris.data
Y = iris.target
mask = Y < 2  # arbitrarily removing class 2 so the labels are 0 and 1
X = X[mask]   # keep only the rows of X that match the remaining labels
Y = Y[mask]
xgb = xg.XGBClassifier()
fit = xgb.fit(X, Y)
fit.feature_importances_

It seems that you can compute feature importance from the Booster object by calling its get_fscore method. The only reason I'm using XGBClassifier over Booster is that it can be wrapped in a sklearn pipeline. Any thoughts on extracting feature importances? Is anyone else experiencing this?


Solution

  • As the comments indicate, I suspect your issue is a versioning one. However, if you do not want to (or cannot) update, the following function should work for you.

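    def get_xgb_imp(xgb, feat_names):
        # Raw split counts keyed 'f0', 'f1', ...; features never used in a split are absent.
        # Note: newer xgboost releases renamed booster() to get_booster().
        imp_vals = xgb.booster().get_fscore()
        imp_dict = {feat_names[i]: float(imp_vals.get('f' + str(i), 0.)) for i in range(len(feat_names))}
        # The built-in sum works on Python 2 and 3, unlike array(imp_dict.values()).sum().
        total = sum(imp_dict.values())
        # Normalize so that the importances sum to 1.
        return {k: v / total for k, v in imp_dict.items()}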
    
    
    >>> import numpy as np
    >>> from xgboost import XGBClassifier
    >>> 
    >>> feat_names = ['var1','var2','var3','var4','var5']
    >>> np.random.seed(1)
    >>> X = np.random.rand(100,5)
    >>> y = np.random.rand(100).round()
    >>> xgb = XGBClassifier(n_estimators=10)
    >>> xgb = xgb.fit(X,y)
    >>> 
    >>> get_xgb_imp(xgb,feat_names)
    {'var5': 0.0, 'var4': 0.20408163265306123, 'var1': 0.34693877551020408, 'var3': 0.22448979591836735, 'var2': 0.22448979591836735}
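
The normalization step can be checked on its own, without a trained model. Below is a minimal sketch, assuming get_fscore returns a dict of split counts keyed 'f0', 'f1', ... that omits features never used in a split (which is what the .get(..., 0.) fallback above relies on). The function name normalized_importance and the counts 17, 11, 11, 10 are hypothetical; the counts were chosen only because they reproduce the ratios in the output above (total 49). The sketch also guards against a zero total, which the function above does not.

```python
def normalized_importance(fscore, feat_names):
    """Turn raw split counts keyed 'f0', 'f1', ... into fractions that sum to 1."""
    # Features missing from fscore were never used in a split, so they count as 0.
    counts = [float(fscore.get("f%d" % i, 0.0)) for i in range(len(feat_names))]
    total = sum(counts)
    if total == 0:
        # No splits recorded at all: report zero importance for every feature.
        return {name: 0.0 for name in feat_names}
    return {name: c / total for name, c in zip(feat_names, counts)}


# Hypothetical split counts (not taken from a real model run).
fscore = {"f0": 17, "f1": 11, "f2": 11, "f3": 10}
feat_names = ["var1", "var2", "var3", "var4", "var5"]

imp = normalized_importance(fscore, feat_names)
for name in feat_names:
    print(name, imp[name])
```

With these counts, var1 comes out as 17/49 and var5 as 0.0, since 'f4' is absent from the dict.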