Tags: python, pytorch, computer-vision, data-augmentation, detectron

How to use detectron2's augmentation with datasets loaded using register_coco_instances


I've trained a detectron2 model on custom data that I labeled and exported in COCO format. Now I want to apply augmentation and train on the augmented data. How can I do that when I load the data with the register_coco_instances function rather than a custom DataLoader?

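import os

import detectron2.data.transforms as T
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import DatasetCatalog, DatasetMapper, MetadataCatalog, build_detection_train_loader
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultPredictor, DefaultTrainer

# Sanity check: run inference with the pretrained model zoo weights.
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(im)  # im: a BGR image array loaded beforehand, e.g. with cv2.imread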

train_annotations_path = "./data/cvat-corn-train-coco-1.0/annotations/instances_default.json"
train_images_path = "./data/cvat-corn-train-coco-1.0/images"
validation_annotations_path = "./data/cvat-corn-validation-coco-1.0/annotations/instances_default.json"
validation_images_path = "./data/cvat-corn-validation-coco-1.0/images"

register_coco_instances(
    "train-corn",
    {},
    train_annotations_path,
    train_images_path
)
register_coco_instances(
    "validation-corn",
    {},
    validation_annotations_path,
    validation_images_path
)
metadata_train = MetadataCatalog.get("train-corn")
dataset_dicts = DatasetCatalog.get("train-corn")

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("train-corn",)
cfg.DATASETS.TEST = ("validation-corn",)
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")  # Let training initialize from model zoo
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 10000
cfg.SOLVER.STEPS = []
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128 
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 4
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg) 
trainer.resume_or_load(resume=False)
trainer.train()

I saw in the documentation you can load a dataset and apply augmentation like this:

dataloader = build_detection_train_loader(cfg,
   mapper=DatasetMapper(cfg, is_train=True, augmentations=[
      T.Resize((800, 800))
   ]))

But I'm not using a custom dataloader. What is the best approach in that case?


Solution

  • From my experience, how you register your datasets (i.e., tell Detectron2 how to obtain a dataset named "my_dataset") has no bearing on what dataloader to use during training (i.e., how to load information from a registered dataset and process it into a format needed by the model).

    So, you can register your dataset however you want - either by using the register_coco_instances function or by using the dataset APIs (DatasetCatalog, MetadataCatalog) directly; it doesn't matter. What matters is that you want to apply some transformation(s) during the data loading part.

    Basically, you want to customise the data loading step, which requires a custom dataloader (unless you perform offline augmentation, which is likely not what you want).

    Now, you don't need to define and use a custom dataloader directly in your top-level code. You can just create your own trainer deriving from DefaultTrainer, and override its build_train_loader method. This is as simple as the following.

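    import detectron2.data.transforms as T
    from detectron2.data import DatasetMapper, build_detection_train_loader
    from detectron2.engine import DefaultTrainer

    class MyTrainer(DefaultTrainer):

        @classmethod
        def build_train_loader(cls, cfg):
            # Build the training dataloader with a mapper that applies the given augmentations.
            mapper = DatasetMapper(cfg, is_train=True, augmentations=[T.Resize((800, 800))])
            return build_detection_train_loader(cfg, mapper=mapper)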

    In your top-level code then, the only change required would be to use MyTrainer instead of DefaultTrainer.

    trainer = MyTrainer(cfg) 
    trainer.resume_or_load(resume=False)
    trainer.train()