I am using the OneHotEncoder from scikit-learn in my project, and I need to know the size of each one-hot vector when n_values is set to 'auto'. I thought n_values_ would show that, but it seems I have no way to find out other than transforming a training sample and measuring the result. I made this toy example to show the problem. Do you know of any other solution?
from sklearn.preprocessing import OneHotEncoder
data = [[1], [3], [5]]  # 3 samples of a single feature, with 3 distinct values
encoder = OneHotEncoder()
encoder.fit(data)
print(len(encoder.transform([data[0]]).toarray()[0]))  # 3: number of dimensions in each one-hot vector
print(encoder.n_values_)  # [6] == max value + 1 == len(range(6))
Is this what you are looking for?
>>> encoder.active_features_
array([1, 3, 5])
>>> len(encoder.active_features_)
3
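
To see why n_values_ reports 6 while the encoded vector has only 3 columns, here is a rough pure-Python sketch of the sizing logic for a single integer feature. It is only an illustration of the behaviour shown above, not scikit-learn's actual implementation; the helper names auto_one_hot_sizes and one_hot are made up for this example.

```python
def auto_one_hot_sizes(column):
    """Sketch of how the 'auto' setting sizes one integer feature (hypothetical helper)."""
    # The value range is inferred as 0..max, so it has max + 1 entries.
    n_values = max(column) + 1
    # Only values actually seen during fitting get an output column.
    active_features = sorted(set(column))
    return n_values, active_features


def one_hot(value, active_features):
    """Encode one value against the columns kept after fitting (hypothetical helper)."""
    return [1 if value == feature else 0 for feature in active_features]


column = [1, 3, 5]
n_values, active_features = auto_one_hot_sizes(column)

print(n_values)                      # 6, matches encoder.n_values_
print(active_features)               # [1, 3, 5], matches encoder.active_features_
print(len(active_features))          # 3, the width of each one-hot vector
print(one_hot(3, active_features))   # [0, 1, 0]
```

So the vector width is the number of active features, not n_values_. Note that n_values_ and active_features_ belong to the older OneHotEncoder API; as far as I know, newer scikit-learn releases expose categories_ instead, so check which attributes your installed version provides.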