Tags: python, machine-learning, tensorflow, object-detection

How to get the category of the object detected - tensorflow


I'm new to TensorFlow and I'm wondering how to get the 'category' of a detected object back along with its score.

Here's a score array I have for a sample image. As you can see, there are 4 'boxes' with over 50% confidence:

scores
Out[104]:
array([[ 0.86274904,  0.68427145,  0.53348649,  0.51449829,  0.4737072 ,
         0.20210901,  0.18676876,  0.15660423,  0.15557742,  0.15542269,
         0.15463693,  0.14486608,  0.13966955,  0.1298867 ,  0.12409254,
         0.12288965,  0.10780571,  0.10559474,  0.10537479,  0.10148374,
         0.10037792,  0.09754734,  0.09640686,  0.09590817,  0.09435588,
         0.09198856,  0.0918662 ,  0.09173656,  0.0879355 ,  0.08762371,
         0.08711197,  0.08627746,  0.08621041,  0.08481266,  0.08390592,
         0.08365501,  0.08259587,  0.08251529,  0.08152138,  0.08019584,
         0.07996205,  0.07962734,  0.07907323,  0.07718709,  0.07715672,
         0.07693762,  0.07692765,  0.07642546,  0.07611038,  0.07582222,
         0.0754476 ,  0.07542069,  0.07490928,  0.07476845,  0.0747645 ,
         0.07418264,  0.07383376,  0.07275221,  0.07237192,  0.0722541 ,
         0.07183521,  0.07175662,  0.07174246,  0.07155806,  0.071283  ,
         0.0710848 ,  0.07026858,  0.06924678,  0.06890308,  0.06833564,
         0.06827622,  0.06769758,  0.06753176,  0.06721075,  0.0663776 ,
         0.06553975,  0.06466822,  0.06375053,  0.06349288,  0.0633459 ,
         0.06320453,  0.06309631,  0.06307632,  0.06258182,  0.06233004,
         0.06231011,  0.06228941,  0.06161467,  0.06125913,  0.06117567,
         0.06101252,  0.06089024,  0.0608751 ,  0.06063354,  0.06047466,
         0.06046106,  0.0603817 ,  0.06035899,  0.06034132,  0.06016544]], dtype=float32)

What I'm wondering is how to find out which category each of these scores refers to. I can't seem to find it in any of the variables. Here's a list of the variables that are defined:

    print(locals().keys())
dict_keys(['__name__', '__builtin__', '__builtins__', '_ih', '_oh', '_dh', 'In', 'Out', 'get_ipython', 'exit', 'quit', '_i', '_ii', '_iii', '_i52', 'np', 'os', 'urllib', 'sys', 'tarfile', 'tf', 'zipfile', 'defaultdict', 'StringIO', 'plt', 'Image', 'total_count', 'label_map_util', 'vis_util', 'MODEL_NAME', 'MODEL_FILE', 'DOWNLOAD_BASE', 'PATH_TO_CKPT', 'PATH_TO_LABELS', 'NUM_CLASSES', 'opener', 'tar_file', 'file', 'file_name', 'detection_graph', 'od_graph_def', 'fid', 'serialized_graph', 'label_map', 'categories', 'category_index', 'load_image_into_numpy_array', 'PATH_TO_TEST_IMAGES_DIR', 'TEST_IMAGE_PATHS', 'IMAGE_SIZE', 'sess', 'image_tensor', 'detection_boxes', 'detection_scores', 'detection_classes', 'num_detections', 'image_path', 'image', 'image_np', 'image_np_expanded', 'boxes', 'scores', 'classes', 'num',

This is using the primary tutorial script from the object_detection tutorial on github found here: https://github.com/tensorflow/models/tree/master/research/object_detection

Any help would be appreciated. Cheers,


Solution

  • I assume you're talking about this notebook.

    I looked at the code: the scores array indicates the confidence level for each detected box. The class of each box is in the classes variable (the detection_classes:0 tensor).

    This line runs the input through the network:

    (boxes, scores, classes, num) = sess.run(
              [detection_boxes, detection_scores, detection_classes, num_detections],
              feed_dict={image_tensor: image_np_expanded})
    

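    To print each class ID along with its score:

    # classes and scores have shape (1, N); index 0 selects the single image in the batch.
    # `class` is a reserved word in Python, so the loop variable needs another name.
    for class_id, score in zip(classes[0], scores[0]):
        print(int(class_id), ':', score)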
    

    The descriptions of the categories used in the notebook come from the MSCOCO label map (the file PATH_TO_LABELS points to), which the notebook loads into category_index.
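To turn the numeric class IDs into readable names, something along these lines may help. It is only a sketch: the `scores`, `classes`, and `category_index` values below are small stand-ins made up for illustration (in the notebook they come from `sess.run` and `label_map_util`), and `named_detections` and `min_score` are hypothetical names, not part of the object detection API. It assumes `category_index` maps an integer ID to a dict with a `'name'` key, as the notebook's does.

```python
import numpy as np

# Stand-in data (assumption): in the notebook these come from sess.run(...).
# Both arrays carry a leading batch dimension, and class IDs arrive as floats.
scores = np.array([[0.86, 0.68, 0.53, 0.51, 0.47]], dtype=np.float32)
classes = np.array([[1., 18., 1., 3., 18.]], dtype=np.float32)

# Stand-in for the notebook's category_index: {id: {'id': id, 'name': name}}.
category_index = {
    1: {'id': 1, 'name': 'person'},
    3: {'id': 3, 'name': 'car'},
    18: {'id': 18, 'name': 'dog'},
}


def named_detections(classes, scores, category_index, min_score=0.5):
    """Return (name, score) pairs for detections scoring at least min_score."""
    results = []
    # Index 0 selects the single image in the batch.
    for class_id, score in zip(classes[0].astype(np.int32), scores[0]):
        if score < min_score:
            continue
        # Fall back to 'unknown' if the ID is missing from the label map.
        name = category_index.get(int(class_id), {}).get('name', 'unknown')
        results.append((name, float(score)))
    return results


for name, score in named_detections(classes, scores, category_index):
    print('{}: {:.2f}'.format(name, score))
```

With the real notebook variables, the same call would be `named_detections(classes, scores, category_index)` right after the `sess.run` line.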