
Implementing Batching for Image Segmentation


I wrote a Python 3.5 script for doing street segmentation. Since I'm new to image segmentation, I did not use the predefined dataloaders from PyTorch; instead I wrote them myself (for better understanding). Until now I have only used a batch size of 1. Now I want to generalize this for arbitrary batch sizes.

This is a snippet of my Dataloader:

def augment_data(batch_size):


    # [...] defining some paths and data transformation (including ToTensor() function)

    # The images are named by numbers (frame numbers); this allows me to find the correct label image for a given input image.
    all_input_image_paths = {int(elem.split('\\')[-1].split('.')[0]): elem for elem in glob.glob(input_dir + "*")}
    all_label_image_paths = {int(elem.split('\\')[-1].split('.')[0]): elem for elem in glob.glob(label_dir + "*")}


    dataloader = {"train":[], "val":[]}
    all_samples = []
    img_counter = 0

    for key, input_path in all_input_image_paths.items():
        input_img = Image.open(input_path)
        label_img = Image.open(all_label_image_paths[key])

        # Here I use my own augmentation function, which crops the input and label at the same position and does other things.
        # We get a list of new augmented data.
        augmented_images = generate_augmented_images(input_img, label_img)
        for elem in augmented_images:
            input_as_tensor = data_transforms['norm'](elem[0])
            label_as_tensor = data_transforms['val'](elem[1])

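            # add a leading batch dimension: [C, H, W] -> [1, C, H, W]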
            input_as_tensor.unsqueeze_(0)
            label_as_tensor.unsqueeze_(0)

            is_training_data = random.uniform(0.0, 1.0)
            if is_training_data <= 0.7:
                dataloader["train"].append([input_as_tensor, label_as_tensor])
            else:
                dataloader["val"].append([input_as_tensor, label_as_tensor])
            img_counter += 1

    shuffle(dataloader["train"])
    shuffle(dataloader["val"])
    dataloader_batched = {"train":[], "val":[]}


    # Here I group my data into batches of a given size
    for elem in dataloader["train"]:
        batch = []
        for i in range(batch_size):
            batch.append(elem)
        dataloader_batched["train"].append(batch)

    for elem in dataloader["val"]:
        batch = []
        for i in range(batch_size):
            batch.append(elem)
        dataloader_batched["val"].append(batch)

    return dataloader_batched
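For illustration, a hypothetical call (assuming the paths and transforms above are set up) returns a dict that maps "train" and "val" to lists of batches, where each batch is itself a list of [input, label] pairs:

    # Hypothetical usage: the nested structure returned by augment_data
    dataloaders = augment_data(batch_size=2)
    first_batch = dataloaders["train"][0]   # a list of batch_size [input, label] pairs
    inputs, labels = first_batch[0]         # one pair inside that batch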

This is a snippet of my training method with batch size 1:

    while epoch <= num_epochs:
        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                scheduler.step(3)
                model.train()  # Set model to training mode
            else:
                model.eval()  # Set model to evaluate mode

            running_loss = 0.0

            counter = 0
            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                counter += 1
                max_num = len(dataloaders[phase])

                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)


            epoch_loss = running_loss / dataset_sizes[phase]

If I execute this, I of course get the following error:

for inputs, labels in dataloaders[phase]:
ValueError: not enough values to unpack (expected 2, got 1)
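To make the mismatch concrete (stand-in values, not from the original script): with batch_size = 1, each entry of the batched loader is a list holding a single [input, label] pair, and it is that outer length-1 list that the loop tries to unpack:

    # Stand-in values showing the shape of one "batch" when batch_size is 1
    batch = [["input_tensor", "label_tensor"]]
    try:
        inputs, labels = batch      # the outer list has length 1, not 2
    except ValueError as e:
        print(e)                    # not enough values to unpack (expected 2, got 1)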

I understand why: now I have a list of [input, label] pairs and not just a single input and label image as before. So I guessed I need a second for loop which iterates over these batches. So I tried this:

            # Iterate over data.
            for elem in dataloaders[phase]:
                for inputs, labels in elem:
                    counter += 1
                    max_num = len(dataloaders[phase])

                    inputs = inputs.to(device)
                    labels = labels.to(device)

                    # zero the parameter gradients
                    optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    # _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

But to me it looks like the optimization step (back-prop) is only applied to the last image of the batch. Is that true? And if so, how can I fix this? I guess if I indent the with block, then I get a batch size 1 optimization again.

Thanks in advance


Solution

  • But to me it looks like the optimization step (back-prop) is only applied to the last image of the batch.

    It should not be applied based only on the last image; it should be applied based on the batch size. If you set bs=2, it should be applied to the batch of two images. Note, though, that in the second loop you posted the with block sits outside the inner for loop, so forward and backward really do run on only the last pair; the fix is to feed the model one stacked batch tensor per step, as sketched below.

    The optimization step actually updates the parameters of your network. Backprop is a fancy name for the PyTorch autograd system, which computes the first-order gradients.
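A quick way to convince yourself that an update covers the whole batch is a toy check. This is a hypothetical example (a stand-in nn.Linear model, not the poster's network):

    # Toy check (hypothetical stand-in model): gradients from a batched
    # forward pass reflect every sample in the batch, not just the last one.
    import torch
    import torch.nn as nn

    toy = nn.Linear(4, 2)
    x = torch.randn(2, 4)                  # a batch of two samples (bs=2)
    y = torch.randn(2, 2)
    nn.MSELoss()(toy(x), y).backward()
    print(toy.weight.grad)                 # gradient built from both samples

To get there with the dataloader from the question, the grouping loop has to produce one stacked tensor pair per batch instead of a list of pairs (the posted loop also appends the same elem batch_size times rather than grouping batch_size different samples). The following is only a sketch: it reuses the dataloader, dataloader_batched, and batch_size names from the question and assumes all augmented crops share the same spatial size:

    # Sketch: take batch_size consecutive samples and stack them, so that
    # `for inputs, labels in dataloaders[phase]` works again unchanged.
    for phase in ["train", "val"]:
        samples = dataloader[phase]
        for i in range(0, len(samples), batch_size):
            group = samples[i:i + batch_size]
            # each tensor was unsqueezed to [1, C, H, W], so concatenating
            # along dim 0 yields a batch tensor of shape [bs, C, H, W]
            inputs = torch.cat([pair[0] for pair in group], dim=0)
            labels = torch.cat([pair[1] for pair in group], dim=0)
            dataloader_batched[phase].append([inputs, labels])

With batches stored this way, the original training loop needs no second for loop, and loss.backward() sees the whole batch. Once this custom version is understood, the built-in torch.utils.data.Dataset and torch.utils.data.DataLoader classes are the idiomatic way to get the same stacking and shuffling for free.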