
CUDA_ERROR_OUT_OF_MEMORY on Tensorflow#object_detection/train.py


I'm running Tensorflow Object Detection API to train my own detector using the object_detection/train.py script, found here. The problem is that I'm getting CUDA_ERROR_OUT_OF_MEMORY constantly.

I found some suggestions to reduce the batch size so the trainer consumes less memory, but reducing it from 16 to 4 did not help; I still get the same error. The difference is that with batch_size=16 the error was thrown around step ~18, and now it is thrown around step ~70. EDIT: setting batch_size=1 didn't solve the problem either, as I still got the error at step ~2700.

What can I do to make it run smoothly until I stop the training process? I don't really need fast training.

EDIT: I'm currently using a GTX 750 Ti 2GB for this. The GPU is not used for anything other than training and driving the monitor. Currently, I'm using only 80 images for training and 20 images for evaluation.


Solution

  • Found the solution to my problem. The batch_size was not the problem, but a higher batch_size made memory consumption grow faster, because I was using the config.gpu_options.allow_growth = True configuration. This setting lets TensorFlow increase its memory allocation on demand, up to 100% of GPU memory.
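    As a minimal sketch (assuming a TensorFlow 1.x session, as used by the Object Detection API at the time), the allow_growth option is set on the session config like this:

    ```python
    import tensorflow as tf

    config = tf.ConfigProto()
    # Start with a small allocation and grow on demand,
    # potentially up to 100% of GPU memory.
    config.gpu_options.allow_growth = True

    with tf.Session(config=config) as sess:
        pass  # training ops would run here
    ```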

    The problem was that I was running the eval.py script at the same time (as recommended in their tutorial), and it was using part of the GPU memory. When the train.py script tried to use the full 100%, the error was thrown.

    I solved it by setting the maximum memory use to 70% for the training process. This also fixed the stuttering I saw during training. 70% may not be the optimal value for my GPU, but it is configurable, e.g. via config.gpu_options.per_process_gpu_memory_fraction = 0.7.
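    A hedged sketch of that cap (again assuming a TensorFlow 1.x session; the 0.7 fraction is the value from my setup, not a universal recommendation):

    ```python
    import tensorflow as tf

    config = tf.ConfigProto()
    # Cap this process at ~70% of total GPU memory,
    # leaving headroom for a concurrently running eval.py.
    config.gpu_options.per_process_gpu_memory_fraction = 0.7

    with tf.Session(config=config) as sess:
        pass  # training ops would run here
    ```

    Note that this cap applies per process, so the training and evaluation processes each need a fraction chosen such that their combined allocation fits within the GPU's total memory.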