I have fitted an ElasticNetCV model in Python with three time-series splits:
import numpy as np
import pandas as pd
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import TimeSeriesSplit
#Sample data:
num_samples = 100 # Number of samples
num_features = 1000 # Number of features
X = np.random.rand(num_samples, num_features)
Y = np.random.rand(num_samples)
#Model
l1_ratios = np.arange(0.1, 1.1, 0.1)
tscv = TimeSeriesSplit(max_train_size=None, n_splits=3)
regr = ElasticNetCV(cv=tscv.split(X), random_state=42, l1_ratio=l1_ratios)
regr.fit(X,Y)
Now I want to get the whole grid of hyperparameter combinations with the corresponding MSE as a DataFrame. I tried the following, but the combination with the lowest MSE in the resulting DataFrame is not the one that the ElasticNetCV object reports as best via regr.alpha_ and regr.l1_ratio_:
mse_path = regr.mse_path_
alpha_path = regr.alphas_
# Reshape mse_path to have l1_ratios, n_alphas, cross_validation_step as separate columns
mse_values = mse_path.flatten()
alpha_values = alpha_path.flatten()
l1_values=np.tile(l1_ratios ,int(alpha_values.shape[0]/l1_ratios.shape[0]))
repeated_l1_ratios = np.repeat(l1_ratios, 100)
# mse_path has shape (n_l1_ratios, n_alphas, n_splits), here (len(l1_ratios), 100, 3)
array_3d = mse_path
# Flatten the 3D array into a 2D array
# Each sub-array of shape (100, 3) becomes a row in the new 2D array
array_2d = array_3d.reshape(-1, 3)
# Create a DataFrame from the 2D array
df = pd.DataFrame(array_2d, columns=['MSE Split1', 'MSE Split2', 'MSE Split3'])
df['alpha_values'] = alpha_values
df['l1_values'] = repeated_l1_ratios
The following then picks a hyperparameter combination that differs from the one ElasticNetCV selected, so something seems to go wrong when I combine the MSEs with the hyperparameter values:
# Calculate the minimum MSE for each row across the three splits
df['Min MSE'] = df[['MSE Split1', 'MSE Split2', 'MSE Split3']].min(axis=1)
# Identify the row with the overall minimum MSE
min_mse_row_index = df['Min MSE'].idxmin()
# Retrieve the row with the minimum MSE
min_mse_row = df.loc[min_mse_row_index]
print("Row with the minimum MSE across all splits:")
print(min_mse_row)
It looks to me like you've got everything right up to defining the dataframe of hyperparameters and their fold-scores.
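But the "best" set of hyperparameters is the one that minimizes the MSE averaged over the folds, not the one with the smallest MSE in any single fold. So take the row-wise mean of the split columns rather than the row-wise minimum before looking for the best row.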
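As a minimal sketch of that fix: the helper below builds the same long-format table and adds a "Mean MSE" column averaged over folds. The function name cv_grid_frame and the small synthetic arrays are made up for illustration. It assumes the MSE array has shape (n_l1_ratio, n_alphas, n_folds) and the alpha array has shape (n_l1_ratio, n_alphas), which is how ElasticNetCV documents mse_path_ and alphas_ when l1_ratio is a list.

```python
import numpy as np
import pandas as pd


def cv_grid_frame(mse_path, alphas, l1_ratios):
    """Long-format table of (l1_ratio, alpha) pairs with per-fold and mean MSE.

    Hypothetical helper. Assumes mse_path has shape
    (n_l1_ratio, n_alphas, n_folds) and alphas has shape (n_l1_ratio, n_alphas).
    """
    mse_path = np.asarray(mse_path)
    alphas = np.asarray(alphas)
    n_l1, n_alphas, n_folds = mse_path.shape

    # One row per (l1_ratio, alpha) pair, one column per CV split
    fold_cols = [f"MSE Split{i + 1}" for i in range(n_folds)]
    df = pd.DataFrame(mse_path.reshape(-1, n_folds), columns=fold_cols)

    # Row-major flattening keeps alphas aligned with the reshaped MSE rows
    df["alpha_values"] = alphas.reshape(-1)
    df["l1_values"] = np.repeat(np.asarray(l1_ratios), n_alphas)

    # Selection criterion: MSE averaged over folds, not the per-fold minimum
    df["Mean MSE"] = df[fold_cols].mean(axis=1)
    return df


# Synthetic stand-in for a fitted model's attributes (2 l1_ratios, 3 alphas, 4 folds)
rng = np.random.default_rng(0)
demo_l1 = np.array([0.5, 1.0])
demo_alphas = np.array([[1.0, 0.1, 0.01], [0.5, 0.05, 0.005]])
demo_mse = rng.random((2, 3, 4))

demo_df = cv_grid_frame(demo_mse, demo_alphas, demo_l1)
best = demo_df.loc[demo_df["Mean MSE"].idxmin()]
print(best)
```

With the fitted model from the question, the call would be cv_grid_frame(regr.mse_path_, regr.alphas_, l1_ratios), passing l1_ratios in the same order they were given to ElasticNetCV. The row at df["Mean MSE"].idxmin() should then agree with regr.alpha_ and regr.l1_ratio_.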