machine-learning, neural-network, hyperparameters

In what order should hyperparameters be tuned?


I have been using a neural network for a classification problem and I am now at the point of tuning all the hyperparameters.

So far, I have seen many different hyperparameters that I have to tune:

  1. Learning rate
  2. Batch size
  3. Number of iterations (epochs)

For now, my tuning is quite "manual" and I am not sure I am doing everything in a proper way. Is there a particular order in which to tune the parameters, e.g. learning rate first, then batch size, then ...? I am not sure that all these parameters are independent. Which ones are clearly independent and which ones are clearly not? Should the dependent ones be tuned together? Is there any paper or article that discusses tuning all the parameters in a particular order?


Solution

  • There are even more than that, e.g. the number of layers, the number of neurons per layer, which optimizer to choose, etc.

    So much of the real work in training a neural network is actually finding the best-suited hyperparameters.

    I would say there is no clear guideline, because training a machine learning algorithm is, in general, always task-specific. There are many hyperparameters to tune, and you won't have time to try every combination of them. For many hyperparameters you will build some intuition about what a good choice is, but for now a great starting point is to use what has been proven by others to work. So if you find a paper on the same or a similar task, you could try using the same or similar parameters as they did.
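
    As an illustration of that last point about combinations, here is a minimal sketch of random search over the three hyperparameters from your question, using scikit-learn's MLPClassifier and RandomizedSearchCV as a stand-in setup; the synthetic data, the parameter ranges and the number of trials are assumptions for the demo, not recommendations:

```python
# Random search: evaluate a handful of random combinations instead of the full grid.
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic placeholder data just so the example runs end to end.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

param_distributions = {
    "learning_rate_init": [1e-4, 3e-4, 1e-3, 3e-3, 1e-2],  # learning rate
    "batch_size": [32, 64, 128, 256],                      # batch size
    "max_iter": [200, 500, 1000],                          # epochs (upper bound)
}

search = RandomizedSearchCV(
    MLPClassifier(hidden_layer_sizes=(64,), random_state=0),
    param_distributions,
    n_iter=10,        # try only 10 random combinations
    cv=3,             # 3-fold cross-validation per combination
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```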

    Just to share some small experiences of my own:

    • I rarely vary the learning rate. I mostly pick the Adam optimizer and stick with its default.
    • I try to choose the batch size as large as possible without running out of memory.
    • The number of iterations you can just set generously, e.g. to 1000; you can always watch the current loss and decide to stop once the net isn't learning anymore (see the sketch after this list).
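
    Here is a minimal Keras sketch putting those three points together: Adam with its default learning rate, a large batch size, and a generous epoch budget that is cut short by early stopping once the validation loss stops improving. The model, the placeholder data and the patience value are illustrative assumptions, not a prescription:

```python
import numpy as np
from tensorflow import keras

# Placeholder data just so the example runs; substitute your own dataset here.
X = np.random.rand(2000, 20).astype("float32")
y = np.random.randint(0, 2, size=(2000,))

# A small binary classifier; the architecture is an assumption for the demo.
model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam",               # Adam with its default learning rate
              loss="binary_crossentropy",
              metrics=["accuracy"])

# Stop once the validation loss hasn't improved for 10 epochs.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10, restore_best_weights=True
)

model.fit(
    X, y,
    validation_split=0.2,
    batch_size=256,      # as large as memory allows
    epochs=1000,         # upper bound; early stopping usually ends training sooner
    callbacks=[early_stop],
)
```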

    Keep in mind these are in no way rules or strict guidelines, just some ideas until you've built a better intuition yourself. The more papers you've read and the more nets you've trained, the better you will understand what to choose when. I hope this serves as a good starting point at least.