I have a question about the order in which sklearn's GridSearchCV object handles its hyperparameter combinations. Specifically, I performed a gridsearch using sklearn with parameters:
param1 = [val1, val2, val3, val4, val5]
param2 = [num1, num2]
The mean_test_score attribute of cv_results_ is an array of length 10, as expected (len(param1)*len(param2)); however, I do not know which value corresponds to which combination. That is, are the values of param1 held fixed while param2 is cycled, or vice versa?
That is, do the 10 values in mean_test_score correspond to
[ [val1, num1], [val1, num2], [val2, num1], [val2, num2], ... ]
(where param2 is cycled before param1) or
[ [val1, num1], [val2, num1], [val3, num1], [val4, num1], [val5, num1], [val1, num2], ... ]
(where param1 is cycled before param2)? Does it just depend on the order in which they are specified in the grid search? Can I return the results along one specific hyperparameter value?
Thanks!
GridSearchCV uses a class named ParameterGrid internally, which you can check here (lines 47, 114).
This is more or less what ParameterGrid does inside your GridSearchCV:
from itertools import product

grid_values = [{"param1": [1, 2, 3, 4, 5], "param2": [1, 2]}]

def grid(grid_values):
    for p in grid_values:
        # Always sort the keys of a dictionary, for reproducibility
        print(p)
        items = sorted(p.items())
        if not items:
            yield {}
        else:
            keys, values = zip(*items)
            for v in product(*values):
                params = dict(zip(keys, v))
                yield params
First, it wraps your dict in a list (because it can handle different kinds of input, for example a list of dicts):
grid_values= [{"param1": [1, 2, 3, 4, 5], "param2": [1, 2]}]
After that, it sorts the items of your dict by key, for reproducibility. This sort, not the order in which you wrote the parameters, determines the order of your combinations:
items = sorted(p.items())
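To make the effect of that sort concrete, here is a minimal sketch using only the standard library. The grid below is a made-up example that deliberately lists "param2" first; it is not taken from the scikit-learn source.

```python
# Hypothetical grid, written with "param2" first on purpose.
grid = {"param2": [1, 2], "param1": [1, 2, 3, 4, 5]}

# Sorting the (name, values) pairs orders them by parameter name,
# so the order in which the dict was written no longer matters.
items = sorted(grid.items())
keys, values = zip(*items)

print(keys)    # ('param1', 'param2')
print(values)  # ([1, 2, 3, 4, 5], [1, 2])
```

Whichever way the dict is written, "param1" ends up in the first position, which is what makes the enumeration order reproducible.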
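Then it uses the product function from itertools, which does what you thought (details here): a nested for loop over your variables, but with the parameters ordered by name. The last parameter in sorted order (param2 here) is cycled fastest, so the combinations come out as [val1, num1], [val1, num2], [val2, num1], ..., regardless of the order in which you specified them: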
for v in product(*values):
    params = dict(zip(keys, v))
    yield params
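As for getting the results along one specific hyperparameter value: in a fitted GridSearchCV, cv_results_["params"] is a list of parameter dicts in the same order as cv_results_["mean_test_score"], so you can pair the two up rather than rely on position. Below is a rough sketch of that idea with the standard library only. The cv_results dict is a hand-built stand-in for the real cv_results_, and the scores in it are placeholders, not real results.

```python
from itertools import product

# Stand-in for the grid in the question; names and values are placeholders.
param_grid = {
    "param1": ["val1", "val2", "val3", "val4", "val5"],
    "param2": ["num1", "num2"],
}

# Same enumeration as above: sort by name, then take the product.
keys, values = zip(*sorted(param_grid.items()))
combos = [dict(zip(keys, v)) for v in product(*values)]

# product() varies the last sequence fastest, so param2 cycles first.
for c in combos[:3]:
    print(c)
# {'param1': 'val1', 'param2': 'num1'}
# {'param1': 'val1', 'param2': 'num2'}
# {'param1': 'val2', 'param2': 'num1'}

# Hypothetical stand-in for GridSearchCV.cv_results_ (placeholder scores).
cv_results = {
    "params": combos,
    "mean_test_score": [i / 10 for i in range(len(combos))],
}

# Keep only the entries where param2 == "num1", i.e. the scores along param1.
along_num1 = [
    (p["param1"], score)
    for p, score in zip(cv_results["params"], cv_results["mean_test_score"])
    if p["param2"] == "num1"
]
```

With a real search you would replace cv_results with the fitted object's cv_results_ attribute; the filtering step stays the same.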