I'm following this tutorial from RomRoc, https://hackernoon.com/object-detection-in-google-colab-with-custom-dataset-5a7bb2b0e97e, and I am running into an error raised in lookup_ops.py (Python 2.7, TensorFlow 1.12). Does anyone know how this part of TensorFlow works and can suggest what might be wrong? Here is the call to model_main.py that throws the error:
!python ~/models/research/object_detection/model_main.py \
--pipeline_config_path=/root/models/research/object_detection/samples/configs/faster_rcnn_inception_v2_pets.config \
--model_dir=/root/datalab/trained \
--alsologtostderr \
--num_train_steps=3000 \
--num_eval_steps=500
And the stack trace,
/root/datalab
/root/models/research/object_detection/utils/visualization_utils.py:26: UserWarning:
This call to matplotlib.use() has no effect because the backend has already
been chosen; matplotlib.use() must be called *before* pylab, matplotlib.pyplot,
or matplotlib.backends is imported for the first time.
The backend was *originally* set to 'module://ipykernel.pylab.backend_inline' by the following code:
File "/root/models/research/object_detection/model_main.py", line 26, in <module>
from object_detection import model_lib
File "/root/models/research/object_detection/model_lib.py", line 27, in <module>
from object_detection import eval_util
File "/root/models/research/object_detection/eval_util.py", line 27, in <module>
from object_detection.metrics import coco_evaluation
File "/root/models/research/object_detection/metrics/coco_evaluation.py", line 20, in <module>
from object_detection.metrics import coco_tools
File "/root/models/research/object_detection/metrics/coco_tools.py", line 47, in <module>
from pycocotools import coco
File "/usr/local/lib/python2.7/dist-packages/pycocotools/coco.py", line 49, in <module>
import matplotlib.pyplot as plt
File "/usr/local/lib/python2.7/dist-packages/matplotlib/pyplot.py", line 72, in <module>
from matplotlib.backends import pylab_setup
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/__init__.py", line 14, in <module>
line for line in traceback.format_stack()
import matplotlib; matplotlib.use('Agg') # pylint: disable=multiple-statements
WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
WARNING:tensorflow:Estimator's model_fn (<function model_fn at 0x7f0dbdee1230>) includes params argument, but params are not passed to Estimator.
/root/models/research/object_detection/utils/label_map_util.py:138: RuntimeWarning: Unexpected end-group tag: Not all data was converted
label_map.ParseFromString(label_map_string)
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /root/models/research/object_detection/builders/dataset_builder.py:80: parallel_interleave (from tensorflow.contrib.data.python.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.parallel_interleave(...)`.
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/sparse_ops.py:1165: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
Traceback (most recent call last):
File "/root/models/research/object_detection/model_main.py", line 109, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/root/models/research/object_detection/model_main.py", line 105, in main
tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 471, in train_and_evaluate
return executor.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 610, in run
return self.run_local()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 711, in run_local
saving_listeners=saving_listeners)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 354, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1207, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1234, in _train_model_default
input_fn, model_fn_lib.ModeKeys.TRAIN))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1075, in _get_features_and_labels_from_input_fn
self._call_input_fn(input_fn, mode))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1162, in _call_input_fn
return input_fn(**kwargs)
File "/root/models/research/object_detection/inputs.py", line 488, in _train_input_fn
batch_size=params['batch_size'] if params else train_config.batch_size)
File "/root/models/research/object_detection/builders/dataset_builder.py", line 145, in build
num_parallel_calls=num_parallel_calls)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1040, in map
return ParallelMapDataset(self, map_func, num_parallel_calls)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 2649, in __init__
use_inter_op_parallelism)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 2611, in __init__
map_func, "Dataset.map()", input_dataset)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1860, in __init__
self._function.add_to_graph(ops.get_default_graph())
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/function.py", line 479, in add_to_graph
self._create_definition_if_needed()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/function.py", line 335, in _create_definition_if_needed
self._create_definition_if_needed_impl()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/function.py", line 344, in _create_definition_if_needed_impl
self._capture_by_value, self._caller_device)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/function.py", line 864, in func_graph_from_py_func
outputs = func(*func_graph.inputs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1794, in tf_data_structured_function_wrapper
ret = func(*nested_args)
File "/root/models/research/object_detection/builders/dataset_builder.py", line 127, in process_fn
processed_tensors = decoder.decode(value)
File "/root/models/research/object_detection/data_decoders/tf_example_decoder.py", line 363, in decode
tensors = decoder.decode(serialized_example, items=keys)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/slim/python/slim/data/tfexample_decoder.py", line 525, in decode
outputs.append(handler.tensors_to_item(keys_to_tensors))
File "/root/models/research/object_detection/data_decoders/tf_example_decoder.py", line 117, in tensors_to_item
item = self._handler.tensors_to_item(keys_to_tensors)
File "/root/models/research/object_detection/data_decoders/tf_example_decoder.py", line 86, in tensors_to_item
return tf.maximum(self._name_to_id_table.lookup(unmapped_tensor),
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/lookup_ops.py", line 224, in lookup
(self._key_dtype, keys.dtype))
TypeError: Signature mismatch. Keys must be dtype <dtype: 'float32'>, got <dtype: 'string'>.
Here is the code from the tutorial: https://github.com/RomRoc/objdet_train_tensorflow_colab/blob/master/objdet_custom_tf_colab.ipynb
I followed the tutorial exactly but used my own training data. I had to make some changes to create_pet_tf_record.py to get that part to work. Can you help me understand how to debug this? Thank you!
Here is the value of keys that is passed to lookup() in lookup_ops.py, where the error is raised:
Tensor("SparseToDense:0", shape=(?,), dtype=string, device=/device:CPU:0)
EDIT: Could it be because this op should be running on the GPU? A GPU is detected:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 6030853355760210263, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 8061591699314310064
physical_device_desc: "device: XLA_CPU device", name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 1730080795414704681
physical_device_desc: "device: XLA_GPU device", name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 11281553818
locality {
bus_id: 1
links {
}
}
incarnation: 5476104669001907939
physical_device_desc: "device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7"]
Got it!
In that tutorial, label_map.pbtxt is supposed to be created with this line:
echo "item {\n id: 1\n name: 'dog'\n}" > label_map.pbtxt
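But in my environment that wrote each \n into the file as two literal characters (a backslash and an n) instead of a line break, so the label map could not be parsed. That would explain the "Unexpected end-group tag: Not all data was converted" RuntimeWarning from label_map_util.py in the output above. With no labels loaded, the lookup table's keys were presumably built from an empty list, which TensorFlow types as float32 by default, hence the complaint about float32 versus string. When I created label_map.pbtxt by hand in Vim, with real line breaks, training ran.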
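If you would rather not hand-edit the file, one way to sidestep the shell's escape handling is to write the label map from a Python cell instead. This is only a sketch under my own assumptions: write_label_map and has_literal_backslash_n are hypothetical helper names, not part of the tutorial or the Object Detection API, and the 'dog' entry simply mirrors the tutorial's example.

```python
def write_label_map(path, items):
    """Write a pbtxt label map with real newlines.

    `items` is a list of (id, name) pairs. Hypothetical helper,
    not part of the Object Detection API.
    """
    blocks = []
    for item_id, name in items:
        # "\n" inside a normal Python string literal is a real newline.
        blocks.append("item {\n  id: %d\n  name: '%s'\n}\n" % (item_id, name))
    with open(path, "w") as f:
        f.write("\n".join(blocks))


def has_literal_backslash_n(path):
    """Return True if the file contains a backslash followed by 'n'."""
    with open(path) as f:
        # "\\n" is the two-character sequence: backslash, then n.
        return "\\n" in f.read()


write_label_map("label_map.pbtxt", [(1, "dog")])
print(has_literal_backslash_n("label_map.pbtxt"))  # False
```

Running has_literal_backslash_n on the file produced by the tutorial's echo line should return True if you hit the same problem I did, which makes it a quick check before starting a training run.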