I am training a U-net segmentation network on the LIDC-IDRI dataset and am currently comparing two training strategies (Method 1 and Method 2).
With the Dice coefficient as the loss function, which is also what the V-net architecture (paper) uses, the model trained with Method 2 is consistently better than the one trained with Method 1: the former reaches a Dice score of 0.735, while the latter only reaches 0.71.
BTW, my U-net is implemented in TensorFlow, and the model is trained on an NVIDIA GTX 1080 Ti.
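For reference, here is roughly what the Dice loss looks like in my setup (a minimal sketch, not my exact code; the smoothing constant and the squared-denominator form from the V-net paper are assumptions):

```python
import tensorflow as tf

def soft_dice_loss(y_true, y_pred, smooth=1e-5):
    """Soft Dice loss, V-net style: 1 - 2*|X∩Y| / (|X|^2 + |Y|^2)."""
    y_true = tf.cast(y_true, tf.float32)
    # Flatten everything except the batch dimension.
    batch = tf.shape(y_true)[0]
    y_true_f = tf.reshape(y_true, [batch, -1])
    y_pred_f = tf.reshape(y_pred, [batch, -1])
    intersection = tf.reduce_sum(y_true_f * y_pred_f, axis=1)
    denominator = (tf.reduce_sum(tf.square(y_true_f), axis=1)
                   + tf.reduce_sum(tf.square(y_pred_f), axis=1))
    dice = (2.0 * intersection + smooth) / (denominator + smooth)
    # Average the per-sample Dice over the batch and turn it into a loss.
    return 1.0 - tf.reduce_mean(dice)
```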
Could anyone offer an explanation or some references? Thanks!
Well, I read your answer and decided to try it, since it was fairly easy and I've also been training V-nets on LIDC-IDRI. Usually I train on the whole dataset from the beginning. Option 2) gave a faster boost in Dice at first, but it soon fell to 2% on validation, and even after letting the network train on the whole dataset it did not recover; the training Dice, of course, kept increasing. It seems my 10% of the dataset was not very representative, and the model badly overfit it. A sketch of the two-stage run I tried is below.
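For anyone who wants to reproduce the comparison, this is roughly how I staged the experiment (a sketch only; `subset_ds`, `full_ds`, `val_ds`, the epoch counts, and the optimizer settings are placeholders, not my exact pipeline, and `soft_dice_loss` refers to the Dice loss sketched above):

```python
import tensorflow as tf

def staged_training(model, subset_ds, full_ds, val_ds,
                    subset_epochs=20, full_epochs=80):
    """Option 2: train on a ~10% subset first, then continue on the full set."""
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                  loss=soft_dice_loss)
    # Stage 1: train only on the small subset.
    model.fit(subset_ds, validation_data=val_ds, epochs=subset_epochs)
    # Stage 2: continue training on the whole dataset (same weights).
    model.fit(full_ds, validation_data=val_ds, epochs=full_epochs)
    return model
```

In my case the validation Dice collapsed during stage 1 and never recovered in stage 2, which is what made me suspect the subset was not representative.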