After I run SelectKBest, the selected features come back as a plain array, so I cannot tell which features were kept; my training set has thousands of them. I want to pick out the same features in my test set and drop the rest. Is there a convenient way to do this? Thanks!
The code looks like this:
from sklearn.feature_selection import SelectKBest, f_regression
X_opt = SelectKBest(f_regression, k=2000)
X_new = X_opt.fit_transform(df_train_X_mm, train_y)
X_new
And the result is:
array([[0. , 0. , 0. , ..., 0. , 0. , 0. ],
       [0. , 0. , 0.00688335, ..., 0. , 0. , 0. ],
       [0. , 0. , 0. , ..., 0. , 0. , 0. ],
       ...,
       [0. , 0. , 0. , ..., 0. , 0. , 0. ],
       [0. , 0. , 0. , ..., 0. , 0. , 0. ],
       [0. , 0. , 0.06257587, ..., 0. , 0. , 0. ]])
What you are looking for is the get_support method of feature_selection.SelectKBest. It returns an array of booleans representing whether a given feature was selected (True) or not (False).
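As a rough sketch of how that mask could be used: the example below fits the selector on a small made-up training frame, maps the boolean mask back to column names, and then keeps the same columns in a test frame. The toy data, the feat_0 ... feat_9 column names, and df_test_X_mm are assumptions for illustration only; the question's df_train_X_mm and train_y are assumed to be a pandas DataFrame and a matching target.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_regression

# Hypothetical toy data standing in for the question's training and test sets.
rng = np.random.default_rng(0)
columns = [f"feat_{i}" for i in range(10)]
df_train_X_mm = pd.DataFrame(rng.random((200, 10)), columns=columns)
df_test_X_mm = pd.DataFrame(rng.random((40, 10)), columns=columns)
# The target depends only on feat_2 and feat_7, so those should score highest.
train_y = 3.0 * df_train_X_mm["feat_2"] - 2.0 * df_train_X_mm["feat_7"]

selector = SelectKBest(f_regression, k=2)
X_new = selector.fit_transform(df_train_X_mm, train_y)

# Boolean mask with one entry per input column: True means the feature was kept.
mask = selector.get_support()
selected_columns = df_train_X_mm.columns[mask]

# Option 1: index the test DataFrame by name, which keeps the column labels.
df_test_selected = df_test_X_mm[selected_columns]

# Option 2: reuse the fitted selector, which returns a NumPy array by default.
X_test_new = selector.transform(df_test_X_mm)
```

Both options keep the same columns; indexing by name is handy when you want to keep working with a labelled DataFrame. If you prefer integer positions over a mask, get_support(indices=True) returns the indices of the selected columns instead.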