python pytorch object-detection

Wrong "-1 background" annotations loaded from Custom COCO Dataset using Mmdetection


Introduction

I'm using MMDetection to train a Deformable DETR model on a custom COCO dataset, i.e. a custom dataset that uses the COCO annotation format. The dataset uses the same images as COCO, but with different "toy" annotations for a "playground" experiment. The annotation file was created using only the pycocotools and json packages.

I have made five variations of this playground dataset: two datasets with three classes (classes 1, 2, and 3), one dataset with six classes (classes 1 to 6), and two datasets with seven classes (classes 1 to 7).

The Problem

Now, after creating the dataset in mmdetection using mmdet.datasets.build_dataset, I used the following code to check if everything was OK:

from collections import Counter
from os import path as osp

from mmdet.datasets import build_dataset
from pycocotools.coco import COCO

cfg = start_config()  # user-defined helper (not shown) that builds the config object
ann_file = osp.join(cfg.data.train.data_root, cfg.data.train.ann_file)

# Count annotations per category_id directly from the JSON file.
coco = COCO(ann_file)
img_ids = coco.getImgIds()
ann_ids = coco.getAnnIds(imgIds=img_ids)
anns = coco.loadAnns(ids=ann_ids)
cats_counter = dict(Counter(ann['category_id'] for ann in anns))
print(cats_counter)

# Print id, name, and annotation count for each category found in the file.
cats = {cat['id']: cat for cat in coco.loadCats(coco.getCatIds())}
for cat_id in sorted(cats_counter):
    print("{} ({}) \t|\t{}".format(cat_id, cats[cat_id]['name'], cats_counter[cat_id]))

# Build the MMDetection dataset; printing it shows its per-class instance counts.
ds = build_dataset(cfg.data.train)
print(ds)

For three of the datasets, the counts from the JSON file and from the constructed MMDetection dataset are almost exactly equal. However, for one of the 3-class datasets and for the 6-class dataset, the counts differ substantially. For those two, the code prints the following:

{3: 1843, 1: 659, 4: 1594, 2: 582, 0: 1421, 5: 498}
0 (1)   |   1421
1 (2)   |   659
2 (3)   |   582
3 (4)   |   1843
4 (5)   |   1594
5 (6)   |   498
loading annotations into memory...
Done (t=0.06s)
creating index...
index created!

CocoDataset Train dataset with number of images 1001, and instance counts: 
+---------------+-------+---------------+-------+---------------+-------+---------------+-------+---------------+-------+
| category      | count | category      | count | category      | count | category      | count | category      | count |
+---------------+-------+---------------+-------+---------------+-------+---------------+-------+---------------+-------+
| 0 [1]         | 1421  | 1 [2]         | 659   | 2 [3]         | 581   | 3 [4]         | 1843  | 4 [5]         | 1594  |
|               |       |               |       |               |       |               |       |               |       |
| 5 [6]         | 0     | -1 background | 45    |               |       |               |       |               |       |
+---------------+-------+---------------+-------+---------------+-------+---------------+-------+---------------+-------+

and

{1: 1420, 0: 4131, 2: 1046}
0 (1)   |   4131
1 (2)   |   1420
2 (3)   |   1046
loading annotations into memory...
Done (t=0.06s)
creating index...
index created!

CocoDataset Train dataset with number of images 1001, and instance counts: 
+----------+-------+------------+-------+----------+-------+---------------+-------+----------+-------+
| category | count | category   | count | category | count | category      | count | category | count |
+----------+-------+------------+-------+----------+-------+---------------+-------+----------+-------+
|          |       |            |       |          |       |               |       |          |       |
| 0 [1]    | 1419  | 1 [2]      | 0     | 2 [3]    | 0     | -1 background | 443   |          |       |
+----------+-------+------------+-------+----------+-------+---------------+-------+----------+-------+

As you can see, there is no "-1" id in the annotation JSON, yet the MMDetection dataset reports a "-1 background" entry. In addition, some classes show 0 instances in the MMDetection dataset (class 5 in the 6-class dataset, classes 1 and 2 in the 3-class dataset), while the JSON clearly contains annotations for them. Has anyone encountered something similar using MMDetection? What could be causing this problem?


Solution

  • The class names in the annotation file did not match the class names in the MMDetection config object. Correcting them so that both match solved the problem.
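
A quick way to catch this kind of mismatch is to compare the two sets of names directly before training. The sketch below is only an illustration: the helper name compare_class_names and the toy data are made up, and it assumes (as I understand MMDetection 2.x) that CocoDataset looks categories up by the names listed in the config, so a name that does not match exactly ends up with no instances. The comparison is deliberately strict, because differences such as a trailing space or an int instead of a str are easy to miss when printed.

```python
def compare_class_names(categories, config_classes):
    """Compare COCO category names against the class names in a config.

    categories: the "categories" list of a COCO JSON (dicts with a 'name' key).
    config_classes: class names as written in the config.
    Returns (missing_in_annotations, unused_in_config), order preserved.
    Hypothetical helper, not part of MMDetection or pycocotools.
    """
    config_classes = list(config_classes)
    ann_names = [cat["name"] for cat in categories]
    # Names the config expects but the annotation file does not define.
    missing_in_annotations = [n for n in config_classes if n not in ann_names]
    # Names the annotation file defines but the config never lists.
    unused_in_config = [n for n in ann_names if n not in config_classes]
    return missing_in_annotations, unused_in_config


# Toy data, invented for illustration only.
categories = [
    {"id": 0, "name": "1"},
    {"id": 1, "name": "2 "},  # trailing space
    {"id": 2, "name": 3},     # int rather than str
]
config_classes = ("1", "2", "3")

missing, unused = compare_class_names(categories, config_classes)
# repr() makes whitespace and type differences visible.
print("in config but not in annotations:", [repr(n) for n in missing])
print("in annotations but not in config:", [repr(n) for n in unused])
```

With a real dataset, categories could be read with json.load(...)["categories"] from the annotation file, and config_classes taken from wherever the config defines its class names (for example cfg.data.train.classes, assuming the config sets it there). Both returned lists should be empty before training.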