python, scipy, scikit-learn, random-forest, hyperparameters

How to tell RandomizedSearchCV to choose from a distribution or a None value?


Let's say we are trying to find the best max_depth parameter for a RandomForestClassifier. We are using RandomizedSearchCV:

from scipy.stats import randint as sp_randint
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

rf_params = {              # Is this somehow possible?
              'max_depth': [sp_randint(1, 100), None],
            }

n_iter = 10

random_search = RandomizedSearchCV(RandomForestClassifier(), 
                                   verbose=50, 
                                   param_distributions=rf_params,
                                   n_iter=n_iter, 
                                   n_jobs=-1, 
                                   scoring='f1_micro')

random_search.fit(X_train, y_train)

Is it possible to tell RandomizedSearchCV to either sample from the specified distribution sp_randint(1, 100) or set the parameter to None, which (as in the docs) means "...nodes are expanded until all leaves are pure or until all leaves contain less than min_samples_split samples..."?

When I run this code as written, it raises an error.


Solution

  • Also from the docs: "If a list is given, it is sampled uniformly." Use this:

    'max_depth': list(range(1, 100)) + [None]
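
Note that with the list above, None is only one of 100 equally likely values, so it is drawn rarely. The sketch below compares the list approach with a possible alternative that gives None a larger share: passing a list of dicts, which assumes scikit-learn 0.22 or later, where one dict is picked uniformly before its parameters are sampled. It uses ParameterSampler (the class RandomizedSearchCV relies on to draw candidates) so it runs without any training data; the names list_params and mixed_params are only illustrative.

```python
from scipy.stats import randint as sp_randint
from sklearn.model_selection import ParameterSampler

# Option 1 (the answer above): a plain list is sampled uniformly,
# so None is just one of 100 equally likely values.
list_params = {'max_depth': list(range(1, 100)) + [None]}

# Option 2 (assumes scikit-learn >= 0.22): a list of dicts. One dict is
# picked uniformly first, so None is drawn roughly half of the time.
mixed_params = [
    {'max_depth': sp_randint(1, 100)},  # integers 1..99
    {'max_depth': [None]},
]

# Draw 10 candidate settings from each search space.
list_depths = [p['max_depth']
               for p in ParameterSampler(list_params, n_iter=10, random_state=0)]
mixed_depths = [p['max_depth']
                for p in ParameterSampler(mixed_params, n_iter=10, random_state=0)]

print(list_depths)
print(mixed_depths)
```

Either list_params or mixed_params should be usable as param_distributions in the RandomizedSearchCV call from the question.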