Tags: python, scikit-learn, ensemble-learning

'ExtraTreesClassifier' object has no attribute 'estimators_' Error


I am trying to fit the ExtraTreesClassifier() from sklearn.ensemble on a sample dataset, but it keeps throwing this error. I have implemented other sklearn models and they seem to run well. What am I missing here?

from sklearn.ensemble import ExtraTreesClassifier
model = ExtraTreesClassifier()
model.fit(X,y)

The error is thrown on the line where I call the ExtraTreesClassifier constructor.

This is the full traceback. Here, best is just a dictionary that contains the parameters, and df is a dataframe that I am using to store the outputs of the different models I have built.

   ---> 97       df.loc[ind,'model']=ExtraTreesClassifier(**best)
     98       df.loc[ind,'param']=str(best)
     99       Start=time.time()

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in __setitem__(self, key, value)
    669             key = com.apply_if_callable(key, self.obj)
    670         indexer = self._get_setitem_indexer(key)
--> 671         self._setitem_with_indexer(indexer, value)
    672 
    673     def _validate_key(self, key, axis: int):

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value)
    848                             indexer, self.obj.axes
    849                         )
--> 850                         self._setitem_with_indexer(new_indexer, value)
    851 
    852                         return self.obj

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value)
   1008                 # we have an equal len list/ndarray
   1009                 elif _can_do_equal_len(
-> 1010                     labels, value, plane_indexer, lplane_indexer, self.obj
   1011                 ):
   1012                     setter(labels[0], value)

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _can_do_equal_len(labels, value, plane_indexer, lplane_indexer, obj)
   2474         True if we have an equal len settable.
   2475     """
-> 2476     if not len(labels) == 1 or not np.iterable(value) or is_scalar(plane_indexer[0]):
   2477         return False
   2478 

/usr/local/lib/python3.6/dist-packages/numpy/lib/function_base.py in iterable(y)
    281     """
    282     try:
--> 283         iter(y)
    284     except TypeError:
    285         return False

/usr/local/lib/python3.6/dist-packages/sklearn/ensemble/_base.py in __iter__(self)
    171     def __iter__(self):
    172         """Return iterator over estimators in the ensemble."""
--> 173         return iter(self.estimators_)
    174 
    175 

AttributeError: 'ExtraTreesClassifier' object has no attribute 'estimators_'

Solution

  • When you set elements of a dataframe with .loc, pandas checks whether the value is iterable, on the assumption that you may want to set several entries of the dataframe at once, one per element of the iterable. You can see in the traceback that pandas (via numpy's iterable) tests whether your ExtraTreesClassifier is an iterable:

        282     try:
    --> 283         iter(y)
        284     except TypeError:
        285         return False
    

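    Unfortunately, ExtraTreesClassifier does define __iter__, which iterates over its fitted trees (estimators_). That attribute only exists after the model has been fitted, so on an unfitted model the iter() call raises AttributeError instead of the TypeError that the check catches, hence the error.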
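
To make the mechanism concrete, here is a minimal sketch that uses only the standard library. ToyEnsemble and looks_iterable are hypothetical stand-ins written for illustration, not sklearn or numpy code; they only copy the pattern visible in the last two frames of the traceback.

```python
class ToyEnsemble:
    """Hypothetical stand-in for an ensemble estimator (not sklearn code)."""

    def fit(self):
        # The trailing-underscore attribute is only created by fit().
        self.estimators_ = ["tree_0", "tree_1"]
        return self

    def __iter__(self):
        # Same pattern as the last frame of the traceback.
        return iter(self.estimators_)


def looks_iterable(value):
    """Mirror of the check shown in the traceback: only TypeError is caught."""
    try:
        iter(value)
    except TypeError:
        return False
    return True


print(looks_iterable(3))  # False: int has no __iter__, so iter() raises TypeError

unfitted = ToyEnsemble()
try:
    looks_iterable(unfitted)
except AttributeError as exc:
    # AttributeError is not a TypeError, so it escapes the check.
    print("escaped the check:", exc)

print(looks_iterable(ToyEnsemble().fit()))  # True once estimators_ exists
```

The unfitted object is neither reported as iterable nor as non-iterable: the check itself blows up, which is what happens inside the .loc assignment.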


    I'd suggest first that storing model objects in a dataframe is a little against the spirit of a dataframe, and that you keep the model objects somewhere else instead. Maybe storing the model name and the best params in the frame is enough?

    Anyway, if you do want to store the model object itself, the question becomes "how do I set a dataframe entry to an iterable," which has been asked before, e.g. in "Create and set an element of a Pandas DataFrame to a list". I personally like the answer "use at".
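
Both options can be sketched roughly as follows. The contents of best, the row label ind, the models dict, and the frame layout are assumptions made for illustration, since the question does not show them. The .at variant also assumes the target column already exists with object dtype; with a missing or numeric column, pandas may fall back to the .loc code path and hit the same error.

```python
import pandas as pd
from sklearn.ensemble import ExtraTreesClassifier

best = {"n_estimators": 50, "max_depth": 3}  # hypothetical parameters
ind = 0                                      # hypothetical row label

# Option 1: keep only descriptive values in the frame and hold the
# model objects in a plain dict keyed by the same row label.
df = pd.DataFrame({"model": [None], "param": [None]})
models = {ind: ExtraTreesClassifier(**best)}
df.loc[ind, "model"] = type(models[ind]).__name__  # a plain string
df.loc[ind, "param"] = str(best)

# Option 2: store the object itself in a single cell with .at.
# The "model" column is created from [None], so it has object dtype.
df_obj = pd.DataFrame({"model": [None], "param": [None]})
df_obj.at[ind, "model"] = ExtraTreesClassifier(**best)
df_obj.at[ind, "param"] = str(best)

print(df)
print(type(df_obj.at[ind, "model"]).__name__)
```

With option 1 the frame stays easy to print, sort, and save, and a model can be looked up again through models[ind].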