Tags: deep-learning, caffe, pycaffe

How does the Faster R-CNN library load the training dataset?


I am using the Faster R-CNN library for deep learning. There is a discussion of how to train on your own dataset, but that is one step ahead of me.

First, I would like to understand how the training dataset is set up and how it is loaded for training.

Looking at the code, I found this line in train_faster_rcnn_alt_opt.py:

imdb = get_imdb(imdb_name)

get_imdb is defined in factory.py inside the datasets folder, which contains:

for year in ['2007', '2012']:
    for split in ['train', 'val', 'trainval', 'test']:
        name = 'voc_{}_{}'.format(year, split)
        __sets[name] = (lambda split=split, year=year: pascal_voc(split, year))

# Set up coco_2014_<split>
for year in ['2014']:
    for split in ['train', 'val', 'minival', 'valminusminival']:
        name = 'coco_{}_{}'.format(year, split)
        __sets[name] = (lambda split=split, year=year: coco(split, year))

# Set up coco_2015_<split>
for year in ['2015']:
    for split in ['test', 'test-dev']:
        name = 'coco_{}_{}'.format(year, split)
        __sets[name] = (lambda split=split, year=year: coco(split, year))

def get_imdb(name):
    """Get an imdb (image database) by name."""
    if not __sets.has_key(name):
        raise KeyError('Unknown dataset: {}'.format(name))
    return __sets[name]()

def list_imdbs():
    """List all registered imdbs."""
    return __sets.keys()

How is the training data for the imdb name voc_2007_trainval loaded for training?

EDIT: When I print __sets[name]() inside get_imdb(name), I see the following:

p __sets[name]()
<datasets.pascal_voc.pascal_voc object at 0x7fc937383ed0>

What does that mean?


Solution

  • Now I understand. lib/datasets/factory.py has:

    def get_imdb(name):
        """Get an imdb (image database) by name."""
        if not __sets.has_key(name):
            raise KeyError('Unknown dataset: {}'.format(name))
        return __sets[name]()
    

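    __sets[name] is a lambda registered by the loop below. Calling it constructs a pascal_voc object for the given split and year, which is the object printed in the question: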

    for year in ['2007', '2012']:
        for split in ['train', 'val', 'trainval', 'test']:
            name = 'voc_{}_{}'.format(year, split)
            __sets[name] = (lambda split=split, year=year: pascal_voc(split, year))
    

    Since we set --imdb voc_2007_trainval in the training command, the program loads the images listed in data/VOCdevkit2007/VOC2007/ImageSets/Main/trainval.txt.

    If we set --imdb voc_2007_train, then train.txt is used instead. All images are in the JPEGImages folder and the annotations are in the Annotations folder.
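
The directory layout described above can be illustrated with a short sketch. This is not the library's implementation: load_image_index, image_path and annotation_path are hypothetical helper names, the '.jpg' and '.xml' extensions are assumptions, and a tiny fake devkit is created in a temporary directory so the code runs without the real dataset.

```python
import os
import tempfile


def load_image_index(devkit_root, year, image_set):
    """Read image identifiers, one per line, from ImageSets/Main/<image_set>.txt."""
    index_file = os.path.join(devkit_root, 'VOC' + year, 'ImageSets', 'Main',
                              image_set + '.txt')
    with open(index_file) as f:
        return [line.strip() for line in f if line.strip()]


def image_path(devkit_root, year, image_id):
    # Assumption: images are stored as <id>.jpg inside JPEGImages.
    return os.path.join(devkit_root, 'VOC' + year, 'JPEGImages', image_id + '.jpg')


def annotation_path(devkit_root, year, image_id):
    # Assumption: annotations are stored as <id>.xml inside Annotations.
    return os.path.join(devkit_root, 'VOC' + year, 'Annotations', image_id + '.xml')


# Build a tiny fake devkit so the sketch is self-contained.
root = tempfile.mkdtemp()
main_dir = os.path.join(root, 'VOC2007', 'ImageSets', 'Main')
os.makedirs(main_dir)
with open(os.path.join(main_dir, 'trainval.txt'), 'w') as f:
    f.write('000005\n000007\n')

index = load_image_index(root, '2007', 'trainval')
print(index)  # prints: ['000005', '000007']
print(image_path(root, '2007', index[0]))
print(annotation_path(root, '2007', index[0]))
```

Switching the image_set argument from 'trainval' to 'train' would read train.txt instead, which mirrors the effect of changing the --imdb name.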