I would like to cross-validate a neural network using the package neuralnet
and caret
.
The data df
can be copied from this post.
When running the neuralnet()
function, there is an argument called hidden
where you can set the hidden layers and neurons in each. Let's say I want 2 hidden layers with 3 and 2 neurons respectively. It would be written as hidden = c(3, 2)
.
However, as I want to cross-validate it, I decided to use the fantastic caret
package. But when using the function train()
, I do not know how to set the number of layers and neurons.
Does anyone know where can I add these numbers?
This is the code I ran:
nn <- caret::train(DC1 ~ ., data=df,
method = "neuralnet",
#tuneGrid = tune.grid.neuralnet,
metric = "RMSE",
trControl = trainControl (
method = "cv", number = 10,
verboseIter = TRUE
))
By the way, I am getting some warnings with the previous code:
predictions failed for Fold01: layer1=3, layer2=0, layer3=0 Error in cbind(1, pred) %*% weights[[num_hidden_layers + 1]] :
requires numeric/complex matrix/vector arguments
Ideas on how to solve it?
When using neural net model in caret in order to specify the number of hidden units in each of the three supported layers you can use the parameters layer1
, layer2
and layer3
. I found out by checking the source.
library(caret)
grid <- expand.grid(layer1 = c(32, 16),
layer2 = c(32, 16),
layer3 = 8)
Use case with BostonHousing data:
library(mlbench)
data(BostonHousing)
lets just select numerical columns for the example to make it simple:
BostonHousing[,sapply(BostonHousing, is.numeric)] -> df
nn <- train(medv ~ .,
data = df,
method = "neuralnet",
tuneGrid = grid,
metric = "RMSE",
preProc = c("center", "scale", "nzv"), #good idea to do this with neural nets - your error is due to non scaled data
trControl = trainControl(
method = "cv",
number = 5,
verboseIter = TRUE)
)
The part
preProc = c("center", "scale", "nzv")
is essential for the algorithm to converge, neural nets don't like unscaled features
Its super slow though.
nn
#output
Neural Network
506 samples
12 predictor
Pre-processing: centered (12), scaled (12)
Resampling: Cross-Validated (5 fold)
Summary of sample sizes: 405, 404, 404, 405, 406
Resampling results across tuning parameters:
layer1 layer2 RMSE Rsquared MAE
16 16 NaN NaN NaN
16 32 4.177368 0.8113711 2.978918
32 16 3.978955 0.8275479 2.822114
32 32 3.923646 0.8266605 2.783526
Tuning parameter 'layer3' was held constant at a value of 8
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were layer1 = 32, layer2 = 32 and layer3 = 8.