Tags: tensorflow, google-colaboratory, object-detection-api

Colab kill '^C' - out of memory with small dataset and small batch_size?


I am trying to run a Mask R-CNN architecture on a small dataset (for now, the same one as in the tutorial), following this tutorial.
I think I am running out of memory. This is the output I get:

[...]
Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See `tf.nn.softmax_cross_entropy_with_logits_v2`.

tcmalloc: large alloc 644245094400 bytes == 0x1a936c000 @  0x7efe929e7b6b 0x7efe92a07379 
0x7efe76f65287 0x7efe6848dc0f 0x7efe6851644b 0x7efe68388cc6 0x7efe68389b8c 0x7efe68389e13 
0x7efe7182f2b7 0x7efe71853a35 0x7efe6860b53f 0x7efe686005c5 0x7efe686be551 0x7efe686bb263 
0x7efe686aa3a5 0x7efe923c96db 0x7efe92702a3f
^C
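
For scale, the allocation that tcmalloc reports works out to 600 GiB:

    # Size of the failed allocation from the traceback above
    alloc_bytes = 644245094400
    print(alloc_bytes / 2**30)  # 600.0 GiB -- far more than Colab's ~12.5 GB of RAM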

Is it right that my problem is the 12.5 GB of RAM, even though I have a relatively small dataset and batch_size = 1? Here is my Colab notebook.

What are my options for reducing the memory load, apart from batch_size and shuffle_buffer_size?


Solution

  • I think your error comes from your mask size being set to (1024, 1024). The network predicts a few dozen masks for each image and then selects the best of them. If your image is already of size (1024, 1024) and several masks of the same size come on top of it, there is no way your GPU has enough memory to keep them all (see the back-of-the-envelope arithmetic at the end of this answer).

    In the standard configuration the masks are of size:

    mask_height: 33
    mask_width: 33


    So I suggest you change that; the fragment below shows where these fields live in the pipeline config.
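
    For reference, in the TF Object Detection API sample Mask R-CNN pipeline configs these two fields sit inside the second-stage box predictor. A minimal fragment showing where to look (the nesting follows the published sample configs; surrounding fields are omitted):

        second_stage_box_predictor {
          mask_rcnn_box_predictor {
            predict_instance_masks: true
            # Default mask size in the sample configs. The predicted masks are
            # rescaled to the detected boxes afterwards, so they can stay small.
            mask_height: 33
            mask_width: 33
          }
        }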
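
    To see why (1024, 1024) masks blow up, here is a back-of-the-envelope sketch of the mask head's per-image output size in float32. The proposal and class counts are illustrative assumptions, not values from the asker's config:

        # Rough per-image footprint of the predicted masks (float32).
        # num_proposals and num_classes are assumed for illustration only.
        num_proposals = 100
        num_classes = 90
        bytes_per_float = 4

        def mask_head_bytes(mask_h, mask_w):
            return num_proposals * num_classes * mask_h * mask_w * bytes_per_float

        print(mask_head_bytes(33, 33) / 2**20)      # ~37 MiB with the default 33x33
        print(mask_head_bytes(1024, 1024) / 2**30)  # ~35 GiB with 1024x1024 masks

    Scaling the masks from 33x33 to 1024x1024 multiplies that footprint by roughly (1024/33)^2, about 1000x, which overwhelms the available memory before the batch size even matters.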