python pytorch dataloader pytorch-dataloader

PyTorch DataLoader for a custom dataset that loads image and mask data correctly with a window size

I am trying to write a custom data loader for a dataset where the directory structure is as follows:

All_data
|
|-> Numpy_dat
|   |
|   -> dat_0
|       -> dat_{0}_{0}.npy
|       .
|       .
|   -> dat_1
|       -> dat_{0}_{0}.npy
|       -> dat_{0}_{1}.npy
|       .
|       .
|-> mask_numpy
    |
    -> mask_0
        -> mask_{0}_{0}.npy
        -> mask_{0}_{1}.npy
        .
        .
    -> mask_1
        -> mask_{0}_{0}.npy
        -> mask_{0}_{1}.npy
        .
        .

I am opening this thread to understand how to load these patches with overlapping windows and keep track of the main folder, i.e. dat_{z}.

The tensors in each sub-folder are non-overlapping patches coming from the same image. I simply extracted them based on some start points, which are stored in a file called starts.npy. Essentially, each patch is square, but the size differs between rows. For example, suppose the original image is 254x254 and I have a list of start numbers 0, 50, 40, 30...: the first patch, at row 1 and column 1, is 50x50, the patch at row 1 and column 2 is 40x40, and so on.

Solution

I don't fully grasp the tiling strategy you used, so here is a simple example that may help you work out how to do it in your case.

Say that you tiled each image with patches of size (PW x PH) = 3x2 (width x height) and your image size is divisible by the patch size, say a 6x8 image. This means that an image is composed of NB_PW x NB_PH = 2x4 = 8 patches.
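The patch count is plain integer division of the image size by the patch size. A minimal sketch of that arithmetic, where the helper name `patch_grid` is hypothetical and the divisibility check mirrors the assumption made above:

```python
def patch_grid(img_w: int, img_h: int, pw: int, ph: int) -> tuple[int, int]:
    """Return (nb_pw, nb_ph): the number of patches along width and height."""
    # The example assumes the image size is an exact multiple of the patch size.
    if img_w % pw != 0 or img_h % ph != 0:
        raise ValueError("image size must be divisible by the patch size")
    return img_w // pw, img_h // ph


# 6x8 image (width x height) tiled with 3x2 patches.
nb_pw, nb_ph = patch_grid(img_w=6, img_h=8, pw=3, ph=2)
print(nb_pw, nb_ph, nb_pw * nb_ph)  # 2 4 8
```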

You would then store the 8 patches on disk under names from dat_0_0.npy to dat_1_3.npy, where the first number is the width index and the second one is the height index.

You would first read all the patches into memory in row-major order (this means that the width index varies first):

import numpy as np
import torch

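def read_patches(folder: str, nb_pw: int, nb_ph: int) -> list[torch.Tensor]:
    patches = []

    # Row-major order: the height index is the outer loop, so the width
    # index varies first and patch (pw_idx, ph_idx) ends up at position
    # pw_idx + nb_pw * ph_idx in the list.
    for ph_idx in range(nb_ph):
        for pw_idx in range(nb_pw):
            data = np.load(f"{folder}/dat_{pw_idx}_{ph_idx}.npy")
            patch = torch.from_numpy(data)
            patches.append(patch)

    return patches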
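To try the reading step without real data, it can help to write a small test image to disk using the same naming scheme. The following is only a sketch: the helper `save_patches` is hypothetical, and it assumes a 2-D array laid out as (height, width) whose size is divisible by the patch size.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_patches(image: np.ndarray, folder: str, pw: int, ph: int) -> int:
    """Split a 2-D (height, width) array into ph x pw tiles and save each tile.

    Files are named dat_{width index}_{height index}.npy.
    Returns the number of files written.
    """
    height, width = image.shape
    assert height % ph == 0 and width % pw == 0, "image must be divisible by the patch size"

    out = Path(folder)
    out.mkdir(parents=True, exist_ok=True)

    count = 0
    for ph_idx in range(height // ph):
        for pw_idx in range(width // pw):
            tile = image[ph_idx * ph:(ph_idx + 1) * ph, pw_idx * pw:(pw_idx + 1) * pw]
            np.save(out / f"dat_{pw_idx}_{ph_idx}.npy", tile)
            count += 1
    return count


# Write the 6x8 example image (6 wide, 8 high) into a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    image = np.arange(48, dtype=np.float32).reshape(8, 6)
    written = save_patches(image, tmp, pw=3, ph=2)
    names = sorted(p.name for p in Path(tmp).iterdir())
```

The folder then contains the 8 files dat_0_0.npy through dat_1_3.npy, each holding a tile of 2 rows by 3 columns.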

Then, you can reassemble the patches into one image by creating an empty tensor of the appropriate size and pasting the patches into it:

import torch

PW = 3
PH = 2
NB_PW = 2
NB_PH = 4


def reassemble_patches(
    patches: list[torch.Tensor], pw: int, ph: int, nb_pw: int, nb_ph: int
) -> torch.Tensor:
    assert len(patches) == nb_pw * nb_ph, (
        f"the number of provided patches ({len(patches)}) does not match the "
        f"expected number `nb_pw` x `nb_ph` of patches ({nb_pw * nb_ph})"
    )

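    # Use the dtype of the patches so that integer masks or float64 data
    # are not silently converted to the default float32.
    output = torch.empty(size=(nb_ph * ph, nb_pw * pw), dtype=patches[0].dtype)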

    for pw_idx in range(nb_pw):
        pw_i_start = pw_idx * pw
        pw_i_end = (pw_idx + 1) * pw

        for ph_idx in range(nb_ph):
            ph_i_start = ph_idx * ph
            ph_i_end = (ph_idx + 1) * ph

            patch = patches[pw_idx + nb_pw * ph_idx]

            output[ph_i_start:ph_i_end, pw_i_start:pw_i_end] = patch

    return output
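When all patches have the same size, the same reassembly can also be done without explicit loops by stacking the patches and rearranging axes. This is an alternative sketch, not part of the approach above; the name `reassemble_vectorized` is hypothetical and the patch list is assumed to be in the row-major order described earlier.

```python
import torch


def reassemble_vectorized(
    patches: list[torch.Tensor], pw: int, ph: int, nb_pw: int, nb_ph: int
) -> torch.Tensor:
    # Stack to (nb_ph * nb_pw, ph, pw), then expose the patch grid:
    # axes become (ph_idx, pw_idx, row in patch, column in patch).
    grid = torch.stack(patches).reshape(nb_ph, nb_pw, ph, pw)
    # Reorder to (ph_idx, row in patch, pw_idx, column in patch) and merge
    # the pairs of axes into full image rows and columns.
    return grid.permute(0, 2, 1, 3).reshape(nb_ph * ph, nb_pw * pw)


# Round trip on the 6x8 example: cut the image into 3x2 patches, rebuild it.
image = torch.arange(48).reshape(8, 6)
patches = [
    image[r * 2:(r + 1) * 2, c * 3:(c + 1) * 3]
    for r in range(4)  # height index (outer loop)
    for c in range(2)  # width index (varies first)
]
restored = reassemble_vectorized(patches, pw=3, ph=2, nb_pw=2, nb_ph=4)
```

This only works for a regular grid; with patch sizes that differ per row or column, pasting each patch at its own offset, as in the loop version, is the more direct route.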

Finally, the main program:

def main():
    patches = read_patches("patches/folder", NB_PW, NB_PH)
    full = reassemble_patches(patches, PW, PH, NB_PW, NB_PH)
    print(full)


if __name__ == "__main__":
    main()
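To connect this to a DataLoader and keep track of the folder index z, the reading and reassembly steps could be wrapped in a `torch.utils.data.Dataset`. The class below is a hypothetical sketch: the name `PatchFolderDataset` is made up, the folder and file names are taken from the directory tree in the question, and it assumes equally sized patches on a known `nb_pw` x `nb_ph` grid, which may not match your actual tiling.

```python
from pathlib import Path

import numpy as np
import torch
from torch.utils.data import Dataset


class PatchFolderDataset(Dataset):
    """One item = (reassembled image, reassembled mask, folder index z)."""

    def __init__(self, root: str, nb_pw: int, nb_ph: int):
        self.root = Path(root)
        self.nb_pw = nb_pw
        self.nb_ph = nb_ph
        # Collect the folder indices z from directory names such as "dat_3".
        self.indices = sorted(
            int(p.name.split("_")[1])
            for p in (self.root / "Numpy_dat").glob("dat_*")
            if p.is_dir()
        )

    def __len__(self) -> int:
        return len(self.indices)

    def _load(self, folder: Path, prefix: str) -> torch.Tensor:
        # Concatenate patches along the width to build each row of patches,
        # then concatenate the rows along the height.
        rows = []
        for ph_idx in range(self.nb_ph):
            row = [
                torch.from_numpy(np.load(folder / f"{prefix}_{pw_idx}_{ph_idx}.npy"))
                for pw_idx in range(self.nb_pw)
            ]
            rows.append(torch.cat(row, dim=1))
        return torch.cat(rows, dim=0)

    def __getitem__(self, i: int):
        z = self.indices[i]
        image = self._load(self.root / "Numpy_dat" / f"dat_{z}", "dat")
        mask = self._load(self.root / "mask_numpy" / f"mask_{z}", "mask")
        return image, mask, z
```

An instance can be passed to `torch.utils.data.DataLoader` as usual; since z is returned with every item, each batch also carries the folder indices. Overlapping windows could then be cut from the reassembled tensors, for instance with `Tensor.unfold`.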