I am reducing the dimensionality of a dataset (a pandas DataFrame):
from sklearn.feature_selection import VarianceThreshold

X = df.values  # as_matrix() was removed in pandas 1.0
sel = VarianceThreshold(threshold=0.1)
X_r = sel.fit_transform(X)
Then I want to get back the reduced DataFrame (i.e. keep only the selected columns).
So far I have only found this ugly and inefficient way to do it. Is there a cleaner approach?
cols_OK = sel.get_support()  # which columns are OK?
c = list()
for i, col in enumerate(cols_OK):
    if col:
        c.append(df.columns[i])
return df[c]
I think you need loc if get_support returns a boolean mask (the default):
cols_OK = sel.get_support()
df = df.loc[:, cols_OK]
and iloc if it returns integer indices:
cols_OK = sel.get_support(indices=True)
df = df.iloc[:, cols_OK]
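
To check that both variants select the same columns, here is a minimal self-contained sketch. The toy DataFrame and its column names "a", "b", "c" are made up for illustration and are not from the question; the only assumption is that column "b" is constant, so it should fall below the variance threshold.

```python
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

# Hypothetical toy data: "b" is constant, so its variance is 0.
df = pd.DataFrame({
    "a": [0.0, 1.0, 2.0, 3.0],
    "b": [1.0, 1.0, 1.0, 1.0],
    "c": [0.0, 2.0, 4.0, 6.0],
})

sel = VarianceThreshold(threshold=0.1)
sel.fit(df)

# Boolean mask (default) goes with label-based loc.
mask = sel.get_support()
by_mask = df.loc[:, mask]

# Integer positions go with position-based iloc.
idx = sel.get_support(indices=True)
by_idx = df.iloc[:, idx]

print(list(by_mask.columns))
print(by_mask.equals(by_idx))
```

Either way the result is still a DataFrame with its original column names and index, so no explicit loop over the columns is needed.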