python tensorflow google-colaboratory object-detection-api

Colab tutorial transfer learning using object_detection, why TypeError from lookup_ops.py?

I'm following this tutorial https://hackernoon.com/object-detection-in-google-colab-with-custom-dataset-5a7bb2b0e97e from RomRoc, and I am running into a problem from lookup_ops.py (for Python 2.7 from TensorFlow 1.12). Does anyone know how this part of TensorFlow works and can suggest what might be wrong? Here is the call to model_main.py that is throwing the error,

!python ~/models/research/object_detection/model_main.py \
    --pipeline_config_path=/root/models/research/object_detection/samples/configs/faster_rcnn_inception_v2_pets.config \
    --model_dir=/root/datalab/trained \
    --alsologtostderr \
    --num_train_steps=3000 \
    --num_eval_steps=500

And the stack trace,

/root/datalab
/root/models/research/object_detection/utils/visualization_utils.py:26: UserWarning: 
This call to matplotlib.use() has no effect because the backend has already
been chosen; matplotlib.use() must be called *before* pylab, matplotlib.pyplot,
or matplotlib.backends is imported for the first time.

The backend was *originally* set to 'module://ipykernel.pylab.backend_inline' by the following code:
  File "/root/models/research/object_detection/model_main.py", line 26, in <module>
    from object_detection import model_lib
  File "/root/models/research/object_detection/model_lib.py", line 27, in <module>
    from object_detection import eval_util
  File "/root/models/research/object_detection/eval_util.py", line 27, in <module>
    from object_detection.metrics import coco_evaluation
  File "/root/models/research/object_detection/metrics/coco_evaluation.py", line 20, in <module>
    from object_detection.metrics import coco_tools
  File "/root/models/research/object_detection/metrics/coco_tools.py", line 47, in <module>
    from pycocotools import coco
  File "/usr/local/lib/python2.7/dist-packages/pycocotools/coco.py", line 49, in <module>
    import matplotlib.pyplot as plt
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/pyplot.py", line 72, in <module>
    from matplotlib.backends import pylab_setup
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/__init__.py", line 14, in <module>
    line for line in traceback.format_stack()


  import matplotlib; matplotlib.use('Agg')  # pylint: disable=multiple-statements
WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
WARNING:tensorflow:Estimator's model_fn (<function model_fn at 0x7f0dbdee1230>) includes params argument, but params are not passed to Estimator.
/root/models/research/object_detection/utils/label_map_util.py:138: RuntimeWarning: Unexpected end-group tag: Not all data was converted
  label_map.ParseFromString(label_map_string)
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /root/models/research/object_detection/builders/dataset_builder.py:80: parallel_interleave (from tensorflow.contrib.data.python.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.parallel_interleave(...)`.
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/sparse_ops.py:1165: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
Traceback (most recent call last):
  File "/root/models/research/object_detection/model_main.py", line 109, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/root/models/research/object_detection/model_main.py", line 105, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 471, in train_and_evaluate
    return executor.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 610, in run
    return self.run_local()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 711, in run_local
    saving_listeners=saving_listeners)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 354, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1207, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1234, in _train_model_default
    input_fn, model_fn_lib.ModeKeys.TRAIN))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1075, in _get_features_and_labels_from_input_fn
    self._call_input_fn(input_fn, mode))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1162, in _call_input_fn
    return input_fn(**kwargs)
  File "/root/models/research/object_detection/inputs.py", line 488, in _train_input_fn
    batch_size=params['batch_size'] if params else train_config.batch_size)
  File "/root/models/research/object_detection/builders/dataset_builder.py", line 145, in build
    num_parallel_calls=num_parallel_calls)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1040, in map
    return ParallelMapDataset(self, map_func, num_parallel_calls)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 2649, in __init__
    use_inter_op_parallelism)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 2611, in __init__
    map_func, "Dataset.map()", input_dataset)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1860, in __init__
    self._function.add_to_graph(ops.get_default_graph())
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/function.py", line 479, in add_to_graph
    self._create_definition_if_needed()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/function.py", line 335, in _create_definition_if_needed
    self._create_definition_if_needed_impl()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/function.py", line 344, in _create_definition_if_needed_impl
    self._capture_by_value, self._caller_device)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/function.py", line 864, in func_graph_from_py_func
    outputs = func(*func_graph.inputs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1794, in tf_data_structured_function_wrapper
    ret = func(*nested_args)
  File "/root/models/research/object_detection/builders/dataset_builder.py", line 127, in process_fn
    processed_tensors = decoder.decode(value)
  File "/root/models/research/object_detection/data_decoders/tf_example_decoder.py", line 363, in decode
    tensors = decoder.decode(serialized_example, items=keys)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/slim/python/slim/data/tfexample_decoder.py", line 525, in decode
    outputs.append(handler.tensors_to_item(keys_to_tensors))
  File "/root/models/research/object_detection/data_decoders/tf_example_decoder.py", line 117, in tensors_to_item
    item = self._handler.tensors_to_item(keys_to_tensors)
  File "/root/models/research/object_detection/data_decoders/tf_example_decoder.py", line 86, in tensors_to_item
    return tf.maximum(self._name_to_id_table.lookup(unmapped_tensor),
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/lookup_ops.py", line 224, in lookup
    (self._key_dtype, keys.dtype))
TypeError: Signature mismatch. Keys must be dtype <dtype: 'float32'>, got <dtype: 'string'>.

Here is the code from the tutorial: https://github.com/RomRoc/objdet_train_tensorflow_colab/blob/master/objdet_custom_tf_colab.ipynb

I followed the tutorial exactly but used my own training data. I had to make some changes to create_pet_tf_record.py to get that part to work. Can you help me understand how to debug this? Thank you!

Here is what the value of keys is which is passed to the function lookup() in lookup_ops.py that is causing the problem:

Tensor("SparseToDense:0", shape=(?,), dtype=string, device=/device:CPU:0)

EDIT: Could it be because it should be using GPU? GPU was detected...

from tensorflow.python.client import device_lib
device_lib.list_local_devices()

[name: "/device:CPU:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 6030853355760210263, name: "/device:XLA_CPU:0"
 device_type: "XLA_CPU"
 memory_limit: 17179869184
 locality {
 }
 incarnation: 8061591699314310064
 physical_device_desc: "device: XLA_CPU device", name: "/device:XLA_GPU:0"
 device_type: "XLA_GPU"
 memory_limit: 17179869184
 locality {
 }
 incarnation: 1730080795414704681
 physical_device_desc: "device: XLA_GPU device", name: "/device:GPU:0"
 device_type: "GPU"
 memory_limit: 11281553818
 locality {
   bus_id: 1
   links {
   }
 }
 incarnation: 5476104669001907939
 physical_device_desc: "device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7"]

Solution

Got it!

In that tutorial, the label_map.pbtxt is supposed to get created with this line,

echo "item {\n id: 1\n name: 'dog'\n}" > label_map.pbtxt

But that put the \n in as literals. When I make my own label_map.pbtxt in Vim it worked (training).