Tags: python, scikit-learn, ensemble-learning

TypeError: Cannot clone object. You should provide an instance of scikit-learn estimator instead of a class


I am trying to build a stacking classifier with three base learners (random forest, gradient boosting, and SVM) and a logistic regression meta-learner.

However, I keep getting an error when I fit it.

from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


rf = RandomForestClassifier(n_estimators=100,random_state=100)
gb = GradientBoostingClassifier(n_estimators=100,random_state=100)
svm = make_pipeline(StandardScaler(), SVC(random_state=100))

estimators = [('RF', rf),
          ('GB', gb),
          ('SVM', svm)]

Model = StackingClassifier(estimators=estimators, final_estimator=LogisticRegression)

Model.fit(X_train,y_train).score(X_val,y_val)

Fitting the model produces this traceback:

TypeError                                 Traceback (most recent call last)
<ipython-input-47-40186fb4189e> in <module>
----> 1 Model.fit(X_train,y_train).score(X_val,y_val)

~\anaconda3\lib\site-packages\sklearn\ensemble\_stacking.py in fit(self, X, y, sample_weight)
    437         self._le = LabelEncoder().fit(y)
    438         self.classes_ = self._le.classes_
--> 439         return super().fit(X, self._le.transform(y), sample_weight)
    440 
    441     @if_delegate_has_method(delegate='final_estimator_')

~\anaconda3\lib\site-packages\sklearn\ensemble\_stacking.py in fit(self, X, y, sample_weight)
    138         # 'drop' string.
    139         names, all_estimators = self._validate_estimators()
--> 140         self._validate_final_estimator()
    141 
    142         stack_method = [self.stack_method] * len(all_estimators)

~\anaconda3\lib\site-packages\sklearn\ensemble\_stacking.py in _validate_final_estimator(self)
    406 
    407     def _validate_final_estimator(self):
--> 408         self._clone_final_estimator(default=LogisticRegression())
    409         if not is_classifier(self.final_estimator_):
    410             raise ValueError(

~\anaconda3\lib\site-packages\sklearn\ensemble\_stacking.py in _clone_final_estimator(self, default)
     55     def _clone_final_estimator(self, default):
     56         if self.final_estimator is not None:
 --> 57             self.final_estimator_ = clone(self.final_estimator)
     58         else:
     59             self.final_estimator_ = clone(default)

~\anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     61             extra_args = len(args) - len(all_args)
     62             if extra_args <= 0:
---> 63                 return f(*args, **kwargs)
     64 
     65             # extra_args > 0

~\anaconda3\lib\site-packages\sklearn\base.py in clone(estimator, safe)
     62             if isinstance(estimator, type):
     63                 raise TypeError("Cannot clone object. " +
 --> 64                                 "You should provide an instance of " +
     65                                 "scikit-learn estimator instead of a class.")
     66             else:

TypeError: Cannot clone object. You should provide an instance of scikit-learn estimator instead of a class.

I am applying this to the Titanic dataset to combine the strengths of several algorithms. This is my first time using stacked classification or regression.

Thanks and Regards


Solution

  • As @amiola pointed out, you're missing the parentheses after LogisticRegression. Without them you pass the class itself rather than an instance, and clone cannot copy a class. So this line:

    Model = StackingClassifier(estimators=estimators, final_estimator=LogisticRegression)
    

    should be

    Model = StackingClassifier(estimators=estimators, final_estimator=LogisticRegression())
                                                                                        ^^
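
To see the difference for yourself, here is a minimal sketch that first calls clone on the class (reproducing the message from the traceback) and then fits the corrected stack. It assumes a synthetic dataset from make_classification as a stand-in for the Titanic data, and the smaller n_estimators values and the names clone_error, meta_learner, and val_accuracy are only illustrative choices, not part of the original code.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Passing the class itself reproduces the error: clone() rejects classes.
try:
    clone(LogisticRegression)
    clone_error = None
except TypeError as exc:
    clone_error = str(exc)

# Passing an instance works: clone() returns a new, unfitted copy.
meta_learner = clone(LogisticRegression())

# Synthetic stand-in for the Titanic data (an assumption, not the asker's data).
X, y = make_classification(n_samples=200, n_features=8, random_state=100)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=100)

# Same three base learners as in the question, with fewer trees to run quickly.
estimators = [
    ("RF", RandomForestClassifier(n_estimators=20, random_state=100)),
    ("GB", GradientBoostingClassifier(n_estimators=20, random_state=100)),
    ("SVM", make_pipeline(StandardScaler(), SVC(random_state=100))),
]

# Note the parentheses: final_estimator receives an instance, not the class.
model = StackingClassifier(estimators=estimators, final_estimator=LogisticRegression())
val_accuracy = model.fit(X_train, y_train).score(X_val, y_val)

print(clone_error)
print(f"validation accuracy: {val_accuracy:.3f}")
```

The same rule applies to the base learners: every entry in estimators should be an instance (or a pipeline of instances), as rf, gb, and svm already are in the question.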