I'm developing a project that uses the backpropagation algorithm, so I'm learning how backpropagation works in scikit-learn.
mlp = MLPClassifier(hidden_layer_sizes=(hiddenLayerSize,), solver='lbfgs', learning_rate='constant', learning_rate_init=0.001, max_iter=100000, random_state=1)
There are different solver options such as lbfgs, adam, and sgd, as well as activation options. Are there any best practices about which options should be used for backpropagation?
solver is the argument that sets the optimization algorithm. In general, sgd (stochastic gradient descent) works well and often converges faster. When using sgd, in addition to setting the learning_rate you also need to set the momentum argument (the default value of 0.9 usually works), as sketched below.
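A minimal sketch of such a configuration (the hidden layer size, toy dataset, and iteration count here are arbitrary placeholders, not recommendations):

    from sklearn.datasets import make_classification
    from sklearn.neural_network import MLPClassifier

    # Toy dataset; stands in for your own training data
    X, y = make_classification(n_samples=200, random_state=1)

    # sgd solver with an explicit learning rate and momentum
    mlp = MLPClassifier(hidden_layer_sizes=(10,),
                        solver='sgd',
                        learning_rate='constant',
                        learning_rate_init=0.001,
                        momentum=0.9,  # 0.9 is also the default
                        max_iter=1000,
                        random_state=1)
    mlp.fit(X, y)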
The activation option introduces non-linearity into the model. If your model has multiple layers, you need a non-linear activation function such as relu (rectified linear unit); otherwise stacking multiple layers is useless, since a composition of linear layers is itself just a linear function. relu is the simplest and one of the most widely used activation functions.
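If you want to see the effect for yourself, one quick way (again sketched with an arbitrary toy dataset) is to fit the same two-layer network with each activation and compare training scores; 'identity' keeps the model linear, the others do not:

    from sklearn.datasets import make_classification
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=200, random_state=1)

    # 'identity' is linear; 'logistic', 'tanh' and 'relu' are non-linear
    for act in ['identity', 'logistic', 'tanh', 'relu']:
        mlp = MLPClassifier(hidden_layer_sizes=(10, 10),
                            activation=act,
                            solver='sgd',
                            momentum=0.9,
                            max_iter=2000,
                            random_state=1)
        mlp.fit(X, y)
        print(act, mlp.score(X, y))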