numpy scikit-learn cross-validation kernel-density gridsearchcv

AttributeError: 'numpy.ndarray' object has no attribute '_iter_test_masks'


I am trying to use scikit-learn's GridSearchCV with K-fold cross-validation to select a bandwidth for KernelDensity estimation.

When I call grid.fit(train, target[:,None]), I receive the following error:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Users\mubar\AppData\Local\Programs\Python\Python38-32\lib\site-packages\sklearn\utils\validation.py", line 73, in inner_f
    return f(**kwargs)
  File "C:\Users\mubar\AppData\Local\Programs\Python\Python38-32\lib\site-packages\sklearn\model_selection\_search.py", line 736, in fit
    self._run_search(evaluate_candidates)
  File "C:\Users\mubar\AppData\Local\Programs\Python\Python38-32\lib\site-packages\sklearn\model_selection\_search.py", line 1188, in _run_search
    evaluate_candidates(ParameterGrid(self.param_grid))
  File "C:\Users\mubar\AppData\Local\Programs\Python\Python38-32\lib\site-packages\sklearn\model_selection\_search.py", line 714, in evaluate_candidates
    in product(candidate_params,
  File "C:\Users\mubar\AppData\Local\Programs\Python\Python38-32\lib\site-packages\sklearn\model_selection\_split.py", line 80, in split
    for test_index in self._iter_test_masks(X, y, groups):
AttributeError: 'numpy.ndarray' object has no attribute '_iter_test_masks'

Here is my code:

import numpy as np
from sklearn.model_selection import GridSearchCV, LeaveOneOut
from sklearn.neighbors import KernelDensity

train = np.random.rand(12,2)
target = np.array([0,0,1,2,3,3,3,4,5,5,6,6])

bw = np.linspace(0.01,0.1,10)
grid = GridSearchCV(KernelDensity(kernel='gaussian'),
                    {'bandwidth': bw},
                    cv=LeaveOneOut)
grid.fit(train,target[:,None])

Solution

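  • The cv argument needs a cross-validator instance, but you passed the LeaveOneOut class itself. When GridSearchCV calls cv.split(X, y, groups) on the class, no instance is bound to self, so the training array is passed in as self and the lookup of _iter_test_masks fails on the numpy.ndarray. Instantiate the cross-validator with LeaveOneOut():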

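    import numpy as np
    from sklearn.model_selection import GridSearchCV, LeaveOneOut
    from sklearn.neighbors import KernelDensity

    train = np.random.rand(12, 2)
    target = np.array([0, 0, 1, 2, 3, 3, 3, 4, 5, 5, 6, 6])

    # Candidate bandwidths to search over
    bw = np.linspace(0.01, 0.1, 10)
    grid = GridSearchCV(KernelDensity(kernel='gaussian'),
                        {'bandwidth': bw},
                        cv=LeaveOneOut())  # <-- instantiate the cross-validator

    # KernelDensity is unsupervised, so the target is ignored during fitting
    grid.fit(train, target[:, None])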
    

    This will resolve the issue.
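
If it helps to see why the message mentions numpy.ndarray rather than LeaveOneOut, here is a minimal pure-Python sketch of the same mechanism. ToySplitter and its _iter_test_indices helper are hypothetical stand-ins written for illustration. They are not scikit-learn code and only imitate the shape of a cross-validator's split method.

```python
class ToySplitter:
    """Hypothetical stand-in for a cross-validator (not scikit-learn code)."""

    def _iter_test_indices(self, n_samples):
        # Leave-one-out: each sample is the test set exactly once.
        return range(n_samples)

    def split(self, X, y=None):
        n_samples = len(X)
        for i in self._iter_test_indices(n_samples):
            train = [j for j in range(n_samples) if j != i]
            yield train, [i]


X = [[0.1], [0.2], [0.3]]
y = [0, 1, 2]

# Called on an instance: `self` is the splitter, so the helper is found.
good_splits = list(ToySplitter().split(X, y))

# Called on the class: X is bound to `self` and y to `X`,
# so the helper lookup happens on the data list and fails.
try:
    list(ToySplitter.split(X, y))
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(good_splits)
print(error_message)
```

The second call fails only once the generator is iterated, which matches the traceback in the question: the error surfaces inside split, not where cv=LeaveOneOut was written.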