python tensorflow object-detection google-colaboratory

Google Colab stops the cell that is running my train.py code, seemingly at random


I am using TensorFlow 1.15.2 and training an object detection model in Google Colab. When I run the code:

!python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_coco.config

it works as intended for a short time. Sometimes it gets through only 200 iterations, sometimes as many as 7000, and then it prints ^C at the end, indicating that the training was stopped.

I have heard about the session disconnecting, so I set up an auto clicker to keep the page active, but it still stops. Any help with keeping it from stopping would be greatly appreciated.

EDIT: Here is the link to the notebook: https://colab.research.google.com/drive/1KkLSaJCoiN4P0HKg-oEMTG9kfSJl6BqM


Solution

  • I found that I was running out of memory. You need to force Google Colab to give you 25 GB of RAM instead of the 12 GB that it normally starts out with.
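
If you want to check whether memory is the problem in your own case, one option is to look at how much RAM the runtime has before and during training. The snippet below is only a rough sketch: it assumes the Colab VM is a Linux machine that exposes /proc/meminfo, and the helper names (parse_meminfo, memory_summary_gb) are made up for this example, not part of Colab or TensorFlow.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most entries are reported in kB; the unit suffix is ignored here.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def memory_summary_gb(text):
    """Return total and available memory in GiB from meminfo text."""
    info = parse_meminfo(text)
    kb_per_gb = 1024 ** 2
    return {
        "total_gb": info["MemTotal"] / kb_per_gb,
        "available_gb": info["MemAvailable"] / kb_per_gb,
    }


if __name__ == "__main__":
    # Assumption: running on Linux, where /proc/meminfo exists.
    path = "/proc/meminfo"
    if os.path.exists(path):
        with open(path) as f:
            summary = memory_summary_gb(f.read())
        print("Total RAM:     {:.1f} GB".format(summary["total_gb"]))
        print("Available RAM: {:.1f} GB".format(summary["available_gb"]))
    else:
        print("No /proc/meminfo on this system; cannot read memory stats.")
```

Running this in a notebook cell before starting train.py, and again in a new session after a run has stopped, should show roughly how much headroom the runtime has. If the available figure is small compared with what the training job needs, running out of memory is a plausible reason for the ^C.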