python scikit-learn random-forest grid-search

Is there an easy way to grid search without cross-validation in Python?


scikit-learn has a very helpful class, GridSearchCV, that does grid search and cross-validation together, but I don't want to do cross-validation. I want to do a grid search without cross-validation and use the whole data set for training. To be more specific, I need to evaluate my RandomForestClassifier model with the "oob score" during the grid search. Is there an easy way to do this, or should I write a class myself?

The key points are:

  • I'd like to do the grid search in an easy way.
  • I don't want to do cross-validation.
  • I need to use the whole data set for training (I don't want to split it into train and test data).
  • I need to use the oob score for evaluation during the grid search.

Solution

  • I would really advise against using OOB to evaluate a model, but it is useful to know how to run a grid search outside of GridSearchCV() (I frequently do this so I can save the CV predictions from the best grid for easy model stacking). I think the easiest way is to create your grid of parameters via ParameterGrid() and then loop through every set of parameters. For example, assuming you have a grid dict named "grid" and an RF model object named "rf", you can do something like this:

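    from sklearn.model_selection import ParameterGrid

    # rf must have been created with oob_score=True, otherwise oob_score_ is not set
    best_score, best_grid = float("-inf"), None

    for g in ParameterGrid(grid):
        rf.set_params(**g)
        rf.fit(X, y)
        # save if best
        if rf.oob_score_ > best_score:
            best_score = rf.oob_score_
            best_grid = g

    print("OOB: %0.5f" % best_score)
    print("Grid:", best_grid)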
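For anyone who wants something they can paste and run, here is a minimal end-to-end sketch of the same loop. The synthetic data from make_classification, the parameter values in the grid, and the n_estimators and random_state settings are all my own assumptions for illustration, not something from the question. It also keeps every (score, params) pair and refits on the whole data with the best parameters at the end.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ParameterGrid

# Synthetic data (assumption), used only so the sketch runs end to end.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hypothetical grid; substitute the parameters you actually want to tune.
grid = {"max_depth": [3, None], "max_features": ["sqrt", 0.5]}

# oob_score=True makes the forest compute oob_score_ during fit.
# It relies on bootstrap=True, which is the default.
rf = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)

results = []  # one (oob_score, params) pair per grid point
for params in ParameterGrid(grid):
    rf.set_params(**params)
    rf.fit(X, y)  # trained on the whole data, no train/test split
    results.append((rf.oob_score_, params))

# Pick the parameter set with the highest OOB accuracy.
best_score, best_grid = max(results, key=lambda r: r[0])

# Refit on the whole data with the winning parameters.
rf.set_params(**best_grid)
rf.fit(X, y)

print("OOB: %0.5f" % best_score)
print("Grid:", best_grid)
```

Keeping the full results list rather than only the running best makes it easy to see how close the other grid points were, which is worth checking before trusting a single OOB number.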