I have implemented a COCO dataset as follows:
from detr.datasets.coco import CocoDetection

class MyCoco(CocoDetection):
    def __init__(self, img_folder, ann_file, transform=None) -> None:
        super().__init__(img_folder, ann_file, transform, return_masks=True)

    def __getitem__(self, idx):
        img, target = super().__getitem__(idx)
        return img, target
Then I defined a batch sampler and dataloader as follows:
import torch
from torch.utils.data import DataLoader
from detr.util.misc import collate_fn  # assuming DETR's collate_fn from util/misc.py

my_coco = MyCoco(
    settings.datasets.img_folder,
    settings.datasets.ann_file
)
sampler_train = torch.utils.data.RandomSampler(my_coco)
batch_sampler_train = torch.utils.data.BatchSampler(sampler_train,
                                                    batch_size=32,
                                                    drop_last=True)
data_loader_train = DataLoader(my_coco,
                               sampler=batch_sampler_train,
                               collate_fn=collate_fn,
                               num_workers=1)
When I try to iterate over the loader, I get an error:

for a in data_loader_train:
    print(a)
    break

TypeError: list indices must be integers or slices, not list
Looking into the functions themselves, I can see that the indices are, for some reason, wrapped in another list. I don't understand why, and more importantly, how to fix it.
PyTorch's DataLoader has many arguments, and some of them are mutually incompatible. Since you have already built a batch sampler, you need to pass it as batch_sampler instead of sampler: the sampler argument expects a sampler that yields individual indices, not batches of indices. Because the BatchSampler is passed as sampler, the DataLoader treats each list of indices as a single index and calls my_coco[[i0, i1, ...]]; torchvision's CocoDetection.__getitem__ then looks up self.ids[index], and since a list is not a valid list index, you get the TypeError above.
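To make the difference concrete, here is a minimal, self-contained sketch (a toy stand-in, not your COCO dataset) of what each kind of sampler yields:

import torch

data = list(range(100))  # toy stand-in for a dataset; the samplers only need __len__
sampler = torch.utils.data.RandomSampler(data)
batch_sampler = torch.utils.data.BatchSampler(sampler, batch_size=4, drop_last=True)

print(next(iter(sampler)))        # a single index, e.g. 17
print(next(iter(batch_sampler)))  # a list of indices, e.g. [52, 3, 88, 17]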
Then, the code would be:
data_loader_train = DataLoader(my_coco,
                               batch_sampler=batch_sampler_train,
                               collate_fn=collate_fn,
                               num_workers=1)
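With batch_sampler set, the loader hands your dataset one integer index at a time and collates the results afterwards, so the original loop runs without the TypeError (note that DETR's collate_fn stacks image tensors, so you will still want a transform that converts the PIL images to tensors):

for a in data_loader_train:
    print(a)  # one collated batch of 32 samples
    break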
Another way to do it is to pass the sampler_train sampler and the batch size directly as arguments of the DataLoader:

data_loader_train = DataLoader(my_coco,
                               sampler=sampler_train,
                               batch_size=32,
                               drop_last=True,
                               collate_fn=collate_fn,
                               num_workers=1)
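Both variants are equivalent: when you pass a plain sampler together with batch_size, the DataLoader internally wraps it in a BatchSampler(sampler, batch_size, drop_last) for you.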