deep-learning, lstm, recurrent-neural-network

deep learning concept - hyperparameter tuning weights RNN/LSTM


When we build a model and train it, the initial weights are randomly initialized unless a seed is specified.
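
For example, with Keras/TensorFlow (just one possible setup, not something stated in the question), fixing the seed makes repeated runs start from the same initial weights:

    import tensorflow as tf

    # Fix the seeds that govern weight initialization (and other randomness)
    # so that repeated runs start from identical initial weights.
    tf.keras.utils.set_random_seed(42)  # seeds Python, NumPy and TensorFlow in one call

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(10, 1)),   # 10 timesteps, 1 feature (illustrative shape)
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    # Without set_random_seed, every run of this script would start from different weights.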

As we know, there are a variety of parameters we can adjust, like epochs, the optimizer, batch_size, etc., to find the "best" model.
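
A minimal sketch of what such a tuning loop might look like, assuming a Keras/TensorFlow LSTM and made-up toy data (the hyperparameter values and dataset here are purely illustrative, not from the question):

    import itertools
    import numpy as np
    import tensorflow as tf

    # Toy data: 200 sequences of 10 timesteps with 1 feature (made up for illustration).
    X = np.random.rand(200, 10, 1).astype("float32")
    y = np.random.rand(200, 1).astype("float32")

    def build_model(optimizer):
        model = tf.keras.Sequential([
            tf.keras.Input(shape=(10, 1)),
            tf.keras.layers.LSTM(16),
            tf.keras.layers.Dense(1),
        ])
        model.compile(optimizer=optimizer, loss="mse")
        return model

    results = {}
    for optimizer, batch_size, epochs in itertools.product(["adam", "rmsprop"], [16, 32], [5, 10]):
        tf.keras.utils.set_random_seed(0)   # same starting weights for every combination
        model = build_model(optimizer)
        history = model.fit(X, y, batch_size=batch_size, epochs=epochs,
                            validation_split=0.2, verbose=0)
        # Keep the final validation loss as the score for this combination.
        results[(optimizer, batch_size, epochs)] = history.history["val_loss"][-1]

    best = min(results, key=results.get)
    print("best (optimizer, batch_size, epochs):", best)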

The concept I have trouble with is this: even if we do find the best model after tuning, the weights will be different on the next run, yielding a different model and different results. So the model that was best here might not be the best if we compiled and ran it again with the "best parameters". If we fix the seed for reproducibility, we don't know whether those are the best weights. On the other hand, if we tune the weights, then the "best parameters" won't be the best parameters anymore. I am stuck in a loop. Is there a general guideline on which parameters to tune first as opposed to others?

Or is this whole logic flawed somewhere, and am I way overthinking it?


Solution

    1. We initialize weights randomly so that each node behaves differently from the others (breaking symmetry).
    2. Depending on the hyperparameters (epochs, batch size, number of iterations, etc.), the weights are updated until training finishes. The final, updated weights are what we call the model.
    3. The seed only controls the randomness of the initialization. If I'm not mistaken, a good learning algorithm (objective function plus optimizer) converges to similar results regardless of the seed value (see the sketch after this list).
    4. Again, a good model means tuning all the hyperparameters and making sure the model is not underfitting.
    5. On the other hand, the model shouldn't overfit either.
    6. There is no such thing as "the best" parameters (weights, biases); we need to keep tuning the model until the results are satisfactory, and data preprocessing is the main part of that work.
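
A rough illustration of points 3-5, again assuming a Keras/TensorFlow setup and toy data (both are my assumptions, not part of the answer): train the same architecture under a few different seeds with early stopping, and check that the validation losses land close together despite the different initial weights:

    import numpy as np
    import tensorflow as tf

    # Toy data standing in for a real dataset (made up for illustration).
    X = np.random.rand(300, 10, 1).astype("float32")
    y = np.random.rand(300, 1).astype("float32")

    val_losses = []
    for seed in [0, 1, 2]:
        tf.keras.utils.set_random_seed(seed)   # different random initial weights per run
        model = tf.keras.Sequential([
            tf.keras.Input(shape=(10, 1)),
            tf.keras.layers.LSTM(16),
            tf.keras.layers.Dense(1),
        ])
        model.compile(optimizer="adam", loss="mse")
        # Early stopping guards against overfitting: stop once validation loss stops improving.
        early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3,
                                                      restore_best_weights=True)
        history = model.fit(X, y, epochs=50, batch_size=32, validation_split=0.2,
                            callbacks=[early_stop], verbose=0)
        val_losses.append(min(history.history["val_loss"]))

    # If the hyperparameters are reasonable, these values should be close to each other,
    # i.e. the training converges to similar quality regardless of the seed.
    print("validation losses across seeds:", val_losses)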