I am using the official TensorFlow implementation of ResNet (https://github.com/tensorflow/models/tree/master/official/resnet) to train a binary classifier on my own dataset. I modified the input_fn in imagenet_main.py slightly to do my own image loading and preprocessing. However, after many rounds of parameter tuning, I still can't get the model to train properly. The best I can find is a set of parameters where the training accuracy climbs to 100% while the validation accuracy stays around 50% indefinitely. The implementation uses a piecewise learning-rate schedule. I tried initial learning rates from 0.1 to 1e-5 and weight decay from 1e-2 to 1e-5, and none of them converged on the validation set.
One suspicious observation is that during training the L2 loss decreases slowly and steadily, while the cross-entropy is very reluctant to decrease and stays around 0.69.
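For reference, 0.69 is roughly ln 2, which is the cross-entropy of a two-class model that assigns probability 0.5 to the correct class on every example, i.e. chance level. A minimal sketch of that arithmetic (the function name is made up for illustration and is not part of the official code):

```python
import math

def binary_cross_entropy(p_true_class):
    """Cross-entropy for one example, given the probability assigned to its true class."""
    return -math.log(p_true_class)

# A model that always outputs 0.5 for both classes, i.e. pure guessing.
chance_loss = binary_cross_entropy(0.5)
print(round(chance_loss, 4))  # 0.6931
```

So a cross-entropy stuck near 0.69 matches what a model with uninformative predictions would produce.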
Any ideas about what I can try next?
Regarding my dataset and image preprocessing: the training set is around 100K images and the validation set is around 10K. I just resize each image to 224×224 while keeping the aspect ratio, subtract 127 from each channel, and divide by 255.
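To make the preprocessing arithmetic concrete, here is a rough sketch in plain Python. The helper names are hypothetical and this is not the actual input_fn; it also assumes the longer side is scaled to 224, and leaves open how the remaining area is filled (padding or cropping).

```python
def resize_keep_aspect(width, height, target=224):
    """Scaled (width, height) whose longer side equals `target` (assumed behaviour)."""
    longer = max(width, height)
    new_w = max(1, round(width * target / longer))
    new_h = max(1, round(height * target / longer))
    return new_w, new_h

def normalize(pixel):
    """Subtract 127 and divide by 255, applied per channel value in [0, 255]."""
    return (pixel - 127) / 255.0

new_w, new_h = resize_keep_aspect(640, 480)
print(new_w, new_h)                    # 224 168
print(normalize(0), normalize(255))    # roughly -0.498 and 0.502
```

With this normalization the inputs end up in roughly [-0.5, 0.5].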
@Hua ResNet has a very large number of trainable parameters, and it is trained on ImageNet, which has 1,000 classes, whereas your dataset has only two. The dense layers of ResNet have 4k neurons, which further increases the number of trainable parameters. The number of parameters is directly related to the risk of overfitting, which means the ResNet model as it stands is not well suited to your data. Try modifying it to reduce the number of parameters. That may help.