python machine-learning scikit-learn linear-regression grid-search

What is the difference between criterion and scoring in GridSearchCV?


I have created a GradientBoostingRegressor model.

I use the scoring parameter in GridSearchCV to return the MSE score.

I wonder: if I use criterion in param_grids, does it change my model? Which is the correct way?

from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, KFold

GBR = GradientBoostingRegressor()
param_grids = {
    'learning_rate'    : [0.01, 0.05, 0.07, 0.1, 0.3, 0.5],
    'n_estimators'     : [50, 60, 70, 80, 90, 100],
    'max_depth'        : [1, 2, 3, 4],
    'min_samples_leaf' : [1, 2, 3, 5, 10, 15],
    'min_samples_split': [2, 3, 4, 5, 10],
    #'criterion' : ['mse']
}

kf = KFold(n_splits=3, random_state=42, shuffle=True)
gs = GridSearchCV(estimator=GBR, param_grid=param_grids, cv=kf, n_jobs=-1,
                  return_train_score=True, scoring='neg_mean_squared_error')

Solution

  • The criterion parameter controls how the quality of a candidate split is measured while each individual tree is built. The scoring parameter evaluates the quality of the fitted model as a whole, and is what GridSearchCV uses to compare hyperparameter combinations.

    If you want to find out whether it changes your model, why not simply test it? That is exactly what GridSearchCV is good at. The default is friedman_mse, so put both options in the grid:

    param_grids = {
        'learning_rate'    : [0.01, 0.05, 0.07, 0.1, 0.3, 0.5],
        'n_estimators'     : [50, 60, 70, 80, 90, 100],
        'max_depth'        : [1, 2, 3, 4],
        'min_samples_leaf' : [1, 2, 3, 5, 10, 15],
        'min_samples_split': [2, 3, 4, 5, 10],
        # note: 'mse' was renamed 'squared_error' in scikit-learn 1.0
        'criterion'        : ['friedman_mse', 'mse']
    }
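
    To see which criterion actually wins, fit the search and inspect the results. A minimal sketch, assuming X_train and y_train are your training data (those names are placeholders, not from the original post):

        from sklearn.ensemble import GradientBoostingRegressor
        from sklearn.model_selection import GridSearchCV, KFold

        kf = KFold(n_splits=3, random_state=42, shuffle=True)
        gs = GridSearchCV(estimator=GradientBoostingRegressor(),
                          param_grid=param_grids, cv=kf, n_jobs=-1,
                          return_train_score=True,
                          scoring='neg_mean_squared_error')
        gs.fit(X_train, y_train)  # X_train/y_train assumed to exist

        # Best hyperparameter combination, including the winning criterion
        print(gs.best_params_)
        # Best cross-validated MSE (the scoring value is negated, so flip the sign)
        print(-gs.best_score_)

    If both criteria produce practically identical scores, the choice makes little difference for your data: criterion only changes how splits are chosen inside each tree, while scoring decides which fitted model the search keeps.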