GridSearchCV in a loop: **estimator should be an estimator implementing 'fit' method, 0 was passed** error


Please excuse my limited coding experience. I am trying to run several regressions with GridSearchCV. I am looping over the models to speed up the process, but my code does not work, and I would welcome suggestions for making it more efficient. Here is my code:

classifiers=[Lasso(max_iter=700,random_state=42), Ridge(max_iter=700,random_state=42), ElasticNet(max_iter=700,random_state=42)]

for clf in range(len(classifiers)):
    grd=GridSearchCV(clf,parameters)

    name = clf.__class__.__name__

    print("="*25)
    print(name)

    if clf==0:
       parameters={'alpha':[0.0005,0.0006,0.06,0.5,0.0001,0.01,1,2,3,4,4.4,4]}

    elif clf==1:
         parameters = {'alpha':[1,2,3,5,10,11,2,13,14,15]}

    else:
       parameters ={'alpha':[0.06,0.5,0.0001,0.01,1,2,3,4,4.4,4,5]}

grd.fit(X_train,y_train)
pred=grid.predict(X_test)

Rs = r2_score(y_test, pred)
rmse=np.sqrt(mean_squared_error(y_test,pred))

print('The R-squared is {:.4}'.format(Rs))
print('The root mean squared is {:.4}'.format(rmse))

The exact error I am getting is:

estimator should be an estimator implementing 'fit' method, 0 was passed

An explanation would also be highly appreciated.


Solution

  • There are a few mistakes in your code:

    • You are passing clf to GridSearchCV, but clf is an integer index produced by range(len(classifiers)), not an estimator from the list you created. That is why the error reports that 0 was passed.
    • You need to define the variable parameters before passing it to GridSearchCV.
    • The search object is created as grd, but predict is called on grid; use one name throughout.
    • Finally, you need to move the fit, predict, r2_score and mean_squared_error code inside the body of the for loop; otherwise it only runs for the last classifier.
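The first point can be checked in isolation. The following is a minimal sketch, not part of the original answer; the list name estimators and the result variables are illustrative. It shows that looping over range(len(...)) yields integers, which have no fit method, while looping over the list yields the estimator objects:

```python
from sklearn.linear_model import Lasso, Ridge

# Illustrative list; the names below are not from the original post.
estimators = [Lasso(), Ridge()]

# range(len(...)) yields list positions (ints), which have no fit method.
by_index = [type(i).__name__ for i in range(len(estimators))]
index_has_fit = [hasattr(i, "fit") for i in range(len(estimators))]

# Iterating over the list directly yields the estimator objects.
by_item = [type(est).__name__ for est in estimators]
item_has_fit = [hasattr(est, "fit") for est in estimators]

print(by_index, index_has_fit)  # ['int', 'int'] [False, False]
print(by_item, item_has_fit)    # ['Lasso', 'Ridge'] [True, True]
```

Passing one of those integers as the estimator is what makes GridSearchCV complain that 0 was passed.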

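    Here is the corrected code (I am using the Boston dataset as an example; note that load_boston was removed in scikit-learn 1.2, so this runs as written only on older versions):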

    from sklearn.linear_model import Lasso, Ridge, ElasticNet
    from sklearn.model_selection import GridSearchCV
    from sklearn.datasets import load_boston
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import r2_score, mean_squared_error
    import numpy as np
    
    random_state = 42
    
    # Load boston dataset
    boston = load_boston()
    X, y = boston['data'], boston['target']
    X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                        random_state=random_state)
    
    classifiers=[Lasso(max_iter=700,random_state=random_state), 
                 Ridge(max_iter=700,random_state=random_state),
                 ElasticNet(max_iter=700,random_state=random_state)]
    
    for clf in range(len(classifiers)):
        # First declare the variable parameters
        if clf == 0:
            parameters = {'alpha':[0.0005,0.0006,0.06,0.5,0.0001,0.01,1,2,3,4,4.4,4]}
    
        elif clf == 1:
            parameters = {'alpha':[1,2,3,5,10,11,2,13,14,15]}
    
        else:
            parameters = {'alpha':[0.06,0.5,0.0001,0.01,1,2,3,4,4.4,4,5]}
    
        # Use clf as index to get the classifier
        current_clf = classifiers[clf]
        grid=GridSearchCV(current_clf, parameters)
    
        # This is the correct classifier name, previously it returned int
        name = current_clf.__class__.__name__
    
        print("="*25)
        print(name)
    
        # Moved the below code inside the for loop
        grid.fit(X_train,y_train)
        pred=grid.predict(X_test)
    
        Rs = r2_score(y_test, pred)
        rmse=np.sqrt(mean_squared_error(y_test,pred))
    
        print('The R-squared is {:.4}'.format(Rs))
        print('The root mean squared is {:.4}'.format(rmse))
    

    You can view the working code in the Google Colab notebook here.
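Since the question also asks for a tidier loop, here is one possible sketch rather than the only way to do it. It pairs each estimator with its own grid in a list of tuples, so no index or if/elif chain is needed. Assumptions that are not in the original: load_diabetes stands in for the Boston data (which newer scikit-learn releases no longer ship), the alpha grids are the original values with duplicates removed, and the names searches and results are illustrative.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

random_state = 42

# Assumption: load_diabetes replaces the Boston data used in the answer.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=random_state)

# Each estimator travels with its own parameter grid.
searches = [
    (Lasso(max_iter=700, random_state=random_state),
     {'alpha': [0.0001, 0.0005, 0.0006, 0.01, 0.06, 0.5, 1, 2, 3, 4, 4.4]}),
    (Ridge(max_iter=700, random_state=random_state),
     {'alpha': [1, 2, 3, 5, 10, 11, 13, 14, 15]}),
    (ElasticNet(max_iter=700, random_state=random_state),
     {'alpha': [0.0001, 0.01, 0.06, 0.5, 1, 2, 3, 4, 4.4, 5]}),
]

results = {}
for estimator, parameters in searches:
    name = type(estimator).__name__

    # Fit the grid search and score the refitted best model on the test set.
    grid = GridSearchCV(estimator, parameters)
    grid.fit(X_train, y_train)
    pred = grid.predict(X_test)

    results[name] = {
        'best_params': grid.best_params_,
        'r2': r2_score(y_test, pred),
        'rmse': np.sqrt(mean_squared_error(y_test, pred)),
    }

    print("=" * 25)
    print(name)
    print('Best parameters: {}'.format(results[name]['best_params']))
    print('The R-squared is {:.4}'.format(results[name]['r2']))
    print('The root mean squared error is {:.4}'.format(results[name]['rmse']))
```

Collecting the scores in a dictionary keeps them available for comparison after the loop finishes, instead of only printing them.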