I'm using grid search to fit machine learning model parameters.
I typed in the following code (modified from the sklearn documentation page: http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.GridSearchCV.html):
from sklearn import svm, grid_search, datasets, cross_validation
# getting data
iris = datasets.load_iris()
# grid of parameters
parameters = {'kernel':('linear', 'poly'), 'C':[1, 10]}
# predictive model (support vector machine)
svr = svm.SVC()
# cross validation procedure
mycv = cross_validation.StratifiedKFold(iris.target, n_folds = 2)
# grid search engine
# pass the folds by keyword: the third positional argument is scoring, not cv
clf = grid_search.GridSearchCV(svr, parameters, cv=mycv)
# fitting engine
clf.fit(iris.data, iris.target)
However, when I look at clf.estimator, I get the following:
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,
kernel='rbf', max_iter=-1, probability=False, random_state=None,
shrinking=True, tol=0.001, verbose=False)
How did I end up with a 'rbf' kernel? I didn't specify it as an option in my parameters.
What's going on?
Thanks!
P.S. I'm using version '0.15-git' of sklearn.
Addendum: I noticed that clf.best_estimator_ gives the right output. So what is clf.estimator doing?
clf.estimator is simply a copy of the estimator passed as the first argument to the GridSearchCV object. Any parameters not grid searched over are determined by this estimator. Since you did not explicitly set any parameters for the SVC object svr, it was given all default values. Therefore, because clf.estimator is just a copy of svr, printing the value of clf.estimator returns an SVC object with default parameters. Had you instead written, e.g.,
svr = svm.SVC(C=4.3)
then the value of clf.estimator would have been:
SVC(C=4.3, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,
kernel='rbf', max_iter=-1, probability=False, random_state=None,
shrinking=True, tol=0.001, verbose=False)
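To illustrate the point that parameters outside the grid come from the estimator you pass in, here is a minimal sketch. It assumes a recent scikit-learn release, where GridSearchCV lives in sklearn.model_selection (the sklearn.grid_search module used in the question was removed in 0.20). The names template and search are made up for this example.

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()

# 'degree' is set on the template but is NOT part of the grid below.
template = svm.SVC(degree=2)  # hypothetical name, for illustration only
search = GridSearchCV(template, {'kernel': ('linear', 'poly'), 'C': [1, 10]}, cv=2)
search.fit(iris.data, iris.target)

# The non-searched parameter is carried over to the refit model unchanged.
print(search.estimator.degree)        # 2
print(search.best_estimator_.degree)  # 2
# Only the searched parameters show up in best_params_.
print(sorted(search.best_params_))    # ['C', 'kernel']
```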
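There is little value to the user in accessing clf.estimator, since it only echoes what was passed to the constructor. By scikit-learn's naming convention, attributes learned during fitting end with a trailing underscore, which is why the fitted result is in clf.best_estimator_ and not in clf.estimator.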
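A quick way to see the difference between the two attributes is to print them side by side after fitting. This is only a sketch of the question's setup, rewritten on the assumption of a recent scikit-learn release (sklearn.model_selection in place of the removed sklearn.grid_search and sklearn.cross_validation, and StratifiedKFold taking n_splits instead of the labels):

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV, StratifiedKFold

iris = datasets.load_iris()
parameters = {'kernel': ('linear', 'poly'), 'C': [1, 10]}
svr = svm.SVC()
mycv = StratifiedKFold(n_splits=2)

clf = GridSearchCV(svr, parameters, cv=mycv)
clf.fit(iris.data, iris.target)

# The template given to the constructor: untouched, so still the default kernel.
print(clf.estimator.kernel)  # rbf

# The winning combination from the grid, and the model refit with it.
print(clf.best_params_)
print(clf.best_estimator_.kernel)
```

The last two lines always report a kernel from the grid ('linear' or 'poly'); which one wins depends on the cross-validation scores.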