python machine-learning scikit-learn neural-network optuna

How to set hidden_layer_sizes in sklearn MLPRegressor using optuna trial


I would like to use Optuna with scikit-learn's MLPRegressor model.

For almost all hyperparameters it is straightforward to define the Optuna search space. For example, for the learning rate: learning_rate_init = trial.suggest_float('learning_rate_init', 0.0001, 0.1001, step=0.005)

My problem is how to set it for hidden_layer_sizes since it is a tuple. So let's say I would like to have two hidden layers where the first will have 100 neurons and the second will have 50 neurons. Without OPTUNA I would do:

MLPRegressor(hidden_layer_sizes=(100, 50))

But what if I want Optuna to try a different number of neurons in each layer, e.g. from 100 to 500? How can I set that up, given that MLPRegressor expects a tuple?


Solution

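  • Suggest one integer per hidden layer with trial.suggest_int and assemble the tuple yourself when you construct the model. You could set up your objective function as follows: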

    import optuna
    import warnings
    from sklearn.datasets import make_regression
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor
    from sklearn.metrics import mean_squared_error
    warnings.filterwarnings('ignore')
    
    X, y = make_regression(random_state=1)
    
    X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=1)
    
    def objective(trial):
    
        params = {
            # The range (high - low) must be a multiple of step, hence 0.1001 rather than 0.1.
            'learning_rate_init': trial.suggest_float('learning_rate_init', 0.0001, 0.1001, step=0.005),
            'first_layer_neurons': trial.suggest_int('first_layer_neurons', 10, 100, step=10),
            'second_layer_neurons': trial.suggest_int('second_layer_neurons', 10, 100, step=10),
            'activation': trial.suggest_categorical('activation', ['identity', 'tanh', 'relu']),
        }
    
        model = MLPRegressor(
            hidden_layer_sizes=(params['first_layer_neurons'], params['second_layer_neurons']),
            learning_rate_init=params['learning_rate_init'],
            activation=params['activation'],
            random_state=1,
            max_iter=100
        )
    
        model.fit(X_train, y_train)
    
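        # RMSE on the validation set. The square root is taken explicitly because
        # recent scikit-learn releases no longer accept squared=False here.
        return mean_squared_error(y_valid, model.predict(X_valid)) ** 0.5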
    
    study = optuna.create_study(direction='minimize')
    study.optimize(objective, n_trials=3)
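    # Example log output (from a run in which the learning-rate parameter was registered as 'learning_rate_init ', with a trailing space):
    # [I 2021-11-11 18:04:02,216] A new study created in memory with name: no-name-14c92e38-b8cd-4b8d-8a95-77158d996f20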
    # [I 2021-11-11 18:04:02,283] Trial 0 finished with value: 161.8347337123744 and parameters: {'learning_rate_init ': 0.0651, 'first_layer_neurons': 20, 'second_layer_neurons': 40, 'activation': 'tanh'}. Best is trial 0 with value: 161.8347337123744.
    # [I 2021-11-11 18:04:02,368] Trial 1 finished with value: 159.55535852658082 and parameters: {'learning_rate_init ': 0.0551, 'first_layer_neurons': 90, 'second_layer_neurons': 70, 'activation': 'relu'}. Best is trial 1 with value: 159.55535852658082.
    # [I 2021-11-11 18:04:02,440] Trial 2 finished with value: 161.73980822730888 and parameters: {'learning_rate_init ': 0.0051, 'first_layer_neurons': 100, 'second_layer_neurons': 30, 'activation': 'identity'}. Best is trial 1 with value: 159.55535852658082.
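
If the number of hidden layers should be tuned as well, the same idea extends naturally: suggest the depth first, then one width per layer, and build the tuple from those values. The following is a minimal sketch rather than a definitive implementation. The helper names and the parameter names 'n_layers' and 'n_units_l{i}' are assumptions made for illustration; they are not part of Optuna or scikit-learn. The only Optuna call used is trial.suggest_int.

```python
def suggest_hidden_layer_sizes(trial, max_layers=3, low=10, high=100, step=10):
    """Return a hidden_layer_sizes tuple with a tunable depth and width.

    `trial` is expected to be an Optuna trial; only `suggest_int` is used.
    The parameter names 'n_layers' and 'n_units_l{i}' are illustrative.
    """
    # First pick how many hidden layers this trial uses.
    n_layers = trial.suggest_int('n_layers', 1, max_layers)
    # Then pick one width per layer and pack the values into a tuple.
    return tuple(
        trial.suggest_int(f'n_units_l{i}', low, high, step=step)
        for i in range(n_layers)
    )


def hidden_layer_sizes_from_params(params):
    """Rebuild the tuple from a dict shaped like study.best_params."""
    return tuple(params[f'n_units_l{i}'] for i in range(params['n_layers']))


# A dict shaped like study.best_params after a study that used the helper
# above. The values are made up for illustration, not real results.
example_params = {'n_layers': 2, 'n_units_l0': 100, 'n_units_l1': 50}
print(hidden_layer_sizes_from_params(example_params))  # (100, 50)
```

Inside the objective you would then pass hidden_layer_sizes=suggest_hidden_layer_sizes(trial) to MLPRegressor. For the fixed two-layer objective shown earlier, the equivalent step after the study finishes is to read 'first_layer_neurons' and 'second_layer_neurons' from study.best_params and combine them into a tuple, since Optuna stores each suggested value separately.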