Tags: python, machine-learning, scikit-learn, feature-extraction, feature-selection

How do I know which features are selected with SelectKBest?


After running SelectKBest, the selected features are returned as a plain array, so I can't tell which of the original features they are; my training set has thousands of features. I want to locate the same features in my test set and remove the rest. Is there a convenient way to do this? Thanks!

My code looks like this:

from sklearn.feature_selection import SelectKBest, f_regression

# Keep the 2000 features with the highest univariate F-scores
X_opt = SelectKBest(f_regression, k=2000)
X_new = X_opt.fit_transform(df_train_X_mm, train_y)
X_new

And the result is:

array([[0.        , 0.        , 0.        , ..., 0.        , 0.        ,
    0.        ],
   [0.        , 0.        , 0.00688335, ..., 0.        , 0.        ,
    0.        ],
   [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
    0.        ],
   ...,
   [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
    0.        ],
   [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
    0.        ],
   [0.        , 0.        , 0.06257587, ..., 0.        , 0.        ,
    0.        ]])

Solution

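  • What you are looking for is the get_support method of SelectKBest (in sklearn.feature_selection). Called on the fitted selector, it returns a boolean array with one entry per input feature: True if the feature was selected, False if not. Passing indices=True returns the integer positions of the selected features instead. To keep the same features in the test set, call the fitted selector's transform method on it.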
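
As a rough illustration, here is a minimal sketch of how get_support might be used. The data is synthetic and the names X_train, X_test, y_train and feature_names are stand-ins invented for the example, not taken from the question; k=2 is chosen only to keep the output small.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

# Synthetic stand-in data (assumption): 100 samples, 6 features,
# where only features 1 and 4 actually drive the target.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 6))
X_test = rng.normal(size=(20, 6))
y_train = 3.0 * X_train[:, 1] - 3.0 * X_train[:, 4] + rng.normal(scale=0.1, size=100)
feature_names = np.array(["f0", "f1", "f2", "f3", "f4", "f5"])  # hypothetical names

selector = SelectKBest(f_regression, k=2)
X_train_new = selector.fit_transform(X_train, y_train)

# Boolean mask with one entry per input feature (True = kept)
mask = selector.get_support()
# Integer positions of the kept features
indices = selector.get_support(indices=True)
# Names of the kept features, in their original order
selected_names = feature_names[mask]

# Reuse the fitted selector on the test set; do not refit on test data
X_test_new = selector.transform(X_test)

print(selected_names)
print(X_test_new.shape)
```

If the training data is a pandas DataFrame such as df_train_X_mm, indexing its columns with the mask (df_train_X_mm.columns[mask]) should give the selected column names, assuming the selector was fitted on those columns in the same order. The test DataFrame can then be reduced either with selector.transform or by selecting those column names directly.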