Tags: pytorch, label, dataloader

Problem extracting labels from my dataset


I have an image dataset with 35 classes. All the images are in one folder, and part of each image name is its label. An example image name is D34_Samsung_GalaxyS3Mini-images-flat-D01_I_flat_0001.jpg, and the label becomes D01 here.

In the Dataset class definition, the target variable should return the image label, right? If we consider the index, 34 should be returned for this example.
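As a minimal sketch of the label extraction on its own, assuming the class is the number in the leading D34 prefix (which is what the code in this question parses): the helper name label_from_path and the /data/ directory are hypothetical, used only for illustration.

```python
import os
import re

# Matches the device prefix at the start of the file name,
# e.g. "D34" in "D34_Samsung_GalaxyS3Mini-...".
PREFIX = re.compile(r'^D(\d+)_')


def label_from_path(image_path):
    """Return the number after the leading 'D' of the file name as an int."""
    # Use only the file name, so the directory part cannot affect the match.
    match = PREFIX.match(os.path.basename(image_path))
    if match is None:
        raise ValueError(f"no label found in {image_path!r}")
    return int(match.group(1))


print(label_from_path("/data/D34_Samsung_GalaxyS3Mini-images-flat-D01_I_flat_0001.jpg"))  # 34
```

Anchoring the pattern at the start of the file name avoids picking up the second D01 that appears later in the name.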

This is my code to define the dataset:

class MyDataset(Dataset):
    
    def __init__(self, imgs , transform = None):
        self.imgs = imgs
        self.transform = transform or transforms.ToTensor()
        self.class_to_idx = {}

    def __getitem__(self, index):
        
        image_path = self.imgs[index]
        target = image_path.split('_')[0]
        target = re.findall(r'D\d+.+' , target)
        
        image = Image.open(image_path)
        
        if self.transform is not None:
            image = self.transform(image)

        if target[0] in self.class_to_idx : 
            target = [self.class_to_idx[target[0]]]
        else : 
            self.class_to_idx[target[0]] = len(self.class_to_idx)
            target = [self.class_to_idx[target[0]]]

        return image , target
    
    def __len__(self):
        return len(self.imgs)

But when I tested it, I realized it does not extract the labels correctly. There are 35 classes, but the target is always a number between 0 and 15, that is, below the batch size (batch size = 16). Also, an image may get a different label each time the code is executed!
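One likely reason for this behaviour, assuming the images are visited in a shuffled order: class_to_idx is filled inside __getitem__, so each class receives the next free index when it is first seen. The sketch below reproduces that without PyTorch. The file names and the helper lazy_labels are hypothetical, and there is one made-up file per class.

```python
import random

# Hypothetical file names that follow the pattern from the question.
paths = [f"D{n:02d}_Device-images-flat-D{n:02d}_I_flat_0001.jpg" for n in range(1, 36)]


def lazy_labels(visit_order):
    """Assign indices in the order classes are first seen,
    the same way the lazily filled class_to_idx does."""
    class_to_idx = {}
    for path in visit_order:
        name = path.split('_')[0]
        class_to_idx.setdefault(name, len(class_to_idx))
    return class_to_idx


first_run = lazy_labels(random.Random(0).sample(paths, len(paths)))
second_run = lazy_labels(random.Random(1).sample(paths, len(paths)))

# The first 16 images visited can only receive the indices 0..15.
first_batch = random.Random(0).sample(paths, len(paths))[:16]
first_batch_labels = sorted(lazy_labels(first_batch).values())

# Building the mapping once from the sorted class names removes the
# dependence on visit order.
fixed = {name: idx for idx, name in enumerate(sorted({p.split('_')[0] for p in paths}))}
```

Here first_run and second_run disagree about which class gets which index, while fixed is the same on every run. In a real dataset the fixed mapping could be built once in __init__ from self.imgs.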

[Image: output of the above code]

So I changed the Dataset code. I removed a few lines of code and directly obtained the label from the name of the images instead of using class_to_idx:

class MyDataset(Dataset):
    
    def __init__(self, imgs , transform = None):
        self.imgs = imgs
        self.transform = transform or transforms.ToTensor()
        self.class_to_idx = {}

    def __getitem__(self, index):
        
        image_path = self.imgs[index]
        target = image_path.split('_')[0]
        target = target.split('D')[1]
        target = int(target)
        
        image = Image.open(image_path)
        
        if self.transform is not None:
            image = self.transform(image)

        return image , target
    
    def __len__(self):
        return len(self.imgs)

When I tested it, the numbers were no longer between 0 and 15, and they were the real labels of the images:

[Image: output of the changed code]

My problem is that when I train the model with the first code, my CNN model trains correctly and does not give an error. But with the second code (my edited version), even though the output was correct in the test, training fails with this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
in
      1 criterion , optimizer , scheduler = lossAndOptim(model = model)
      2
----> 3 losses_val , losses_trn , accs_val , accs_trn = train_model(model,
      4                                         train_dl, valid_dl,
      5                                         criterion, optimizer,

4 frames
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   3024     if size_average is not None or reduce is not None:
   3025         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 3026     return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
   3027
   3028

ValueError: Expected input batch_size (16) to match target batch_size (0).

Every answer I find when searching is about the model. But there is no problem with the model, since it does not give an error with the first code. Thank you for your advice.

I edited the source code and searched online, but the answers I found were not related to my problem.


Solution

  • I changed the code as below, and my problem was solved. "target" needed to be looked up in the class_to_idx dictionary and returned as a one-element list, as in the first version, rather than as a plain int.

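    import os

    from PIL import Image
    from torch.utils.data import Dataset
    from torchvision import transforms


    class MyDataset(Dataset):

        def __init__(self, imgs, transform=None):
            self.imgs = imgs
            self.transform = transform or transforms.ToTensor()
            # Cache: device name (e.g. "D34_Samsung_GalaxyS3Mini") -> class index.
            self.class_to_idx = {}

        def __getitem__(self, index):
            image_path = self.imgs[index]
            # Parse the file name only, so a 'D', '_' or '-' in the directory
            # part of the path cannot break the splitting below.
            name = os.path.basename(image_path).split('-')[0]
            # "D34_Samsung_GalaxyS3Mini" -> "D34" -> "34"
            label = name.split('_')[0].split('D')[1]

            image = Image.open(image_path)

            if self.transform is not None:
                image = self.transform(image)

            # The index comes from the number in the file name (D34 -> 33),
            # so it does not depend on the order in which images are visited.
            if name not in self.class_to_idx:
                self.class_to_idx[name] = int(label) - 1
            # Return the label wrapped in a one-element list.
            target = [self.class_to_idx[name]]

            return image, target

        def __len__(self):
            return len(self.imgs)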
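Why the return type matters can be seen from how the default DataLoader collation treats the two kinds of target. The sketch below uses a hypothetical ToyDataset with dummy images in place of MyDataset. The training loop is not shown in the question, so the last line is only a guess at the cause of the error: if the loop reads the labels with something like target[0], a plain int target leaves it with a scalar instead of a batch of 16 labels.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical stand-in for MyDataset: dummy images, fixed labels."""

    def __init__(self, wrap_in_list):
        self.wrap_in_list = wrap_in_list

    def __getitem__(self, index):
        image = torch.zeros(3, 8, 8)
        label = index % 35
        # Either a plain int or the same int wrapped in a one-element list.
        target = [label] if self.wrap_in_list else label
        return image, target

    def __len__(self):
        return 64


# Plain int targets: collation stacks them into a single tensor.
_, int_batch = next(iter(DataLoader(ToyDataset(wrap_in_list=False), batch_size=16)))
# One-element list targets: the batch is a list that holds one tensor.
_, list_batch = next(iter(DataLoader(ToyDataset(wrap_in_list=True), batch_size=16)))

int_shape = tuple(int_batch.shape)        # (16,)
list_len = len(list_batch)                # 1
list_shape = tuple(list_batch[0].shape)   # (16,)
# Assumption about the training loop: indexing the int batch with [0]
# drops the batch dimension and leaves a 0-dimensional tensor.
scalar_shape = tuple(int_batch[0].shape)  # ()
```

If that guess is right, an alternative to wrapping the label in a list would be to return the plain int and pass the collated target tensor to the loss without indexing it.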