python pytorch dataloader pytorch-dataloader

Pytorch DataLoader for custom dataset to load data image and mask correctly with a window size


I am trying to write a custom data loader for a dataset where the directory structure is as follows:

All_data
|
->Numpy_dat
| |
|  -> dat_0
|      -> dat_{0}_{0}.npy
|      .
|      .
| -> dat_1
|      -> dat_{0}_{0}.npy
|      -> dat_{0}_{1}.npy
|      .
|      .
|->mask_numpy
  |
  -> mask_0
     -> mask_{0}_{0}.npy
     -> mask_{0}_{1}.npy
     .
     .
  -> mask_1
     -> mask_{0}_{0}.npy
     -> mask_{0}_{1}.npy
     .
     .

I am opening this thread to understand how to load these patches with overlapping windows while keeping track of the main folder, i.e. dat_{z}.

The tensors in each sub-folder are non-overlapping patches coming from the same image. I simply extracted them based on some start points, which are stored in a file called starts.npy. Each patch is square, but the size differs between rows. For example, suppose the original image is 254x254 and the list of start numbers is 0, 50, 40, 30...: the patch at row 1, column 1 is 50x50, the patch at row 1, column 2 is 40x40, and so on.


Solution

  • I don't fully grasp the tiling strategy you used, so here is a simple example that may help you work out how to do it in your case.

    Say that you tiled each image with patches of size PW x PH = 3x2 (width x height) and that the image size is divisible by the patch size, say a 6x8 image (6 wide, 8 high). This means that an image is composed of NB_PW x NB_PH = 2x4 = 8 patches.

    Then, you would store the 8 patches in your storage with names from dat_0_0.npy to dat_1_3.npy where the first number is the width index and the second one the height index.

    You would first read all the patches into memory in row-major order (this means that the width index varies first):

    import numpy as np
    import torch
    
    def read_patches(folder: str, nb_pw: int, nb_ph: int) -> list[torch.Tensor]:
        patches = []
    
        # Row-major order: the height index is the outer loop, so the width index varies first.
        for ph_idx in range(nb_ph):
            for pw_idx in range(nb_pw):
                data = np.load(f"{folder}/dat_{pw_idx}_{ph_idx}.npy")
                patch = torch.from_numpy(data)
                patches.append(patch)
    
        return patches
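
    To make the row-major order described above concrete, here is a small sketch that only lists the file names in the order they would be visited for the 2x4 example. The helper name patch_file_names is hypothetical and exists purely for illustration:

```python
def patch_file_names(nb_pw: int, nb_ph: int) -> list[str]:
    """List patch file names in row-major order (the width index varies first)."""
    return [
        f"dat_{pw_idx}_{ph_idx}.npy"
        for ph_idx in range(nb_ph)  # outer loop: height index (rows)
        for pw_idx in range(nb_pw)  # inner loop: width index (columns)
    ]


names = patch_file_names(nb_pw=2, nb_ph=4)
print(names[:3])  # ['dat_0_0.npy', 'dat_1_0.npy', 'dat_0_1.npy']
print(len(names))  # 8
```

    With this order, the patch at width index pw_idx and height index ph_idx sits at position pw_idx + nb_pw * ph_idx in the list.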
    

    Then, you can reassemble the patches into one image by creating an empty tensor of the appropriate size and pasting the patches into it:

    import torch
    
    PW = 3
    PH = 2
    NB_PW = 2
    NB_PH = 4
    
    
    def reassemble_patches(
        patches: list[torch.Tensor], pw: int, ph: int, nb_pw: int, nb_ph: int
    ) -> torch.Tensor:
        assert len(patches) == nb_pw * nb_ph, (
            f"the number of provided patches ({len(patches)}) does not match "
            f"the expected number `nb_pw` x `nb_ph` of patches ({nb_pw * nb_ph})"
        )
    
        # The output is indexed as (height, width), so rows come first.
        output = torch.empty(size=(nb_ph * ph, nb_pw * pw))
    
        for pw_idx in range(nb_pw):
            pw_i_start = pw_idx * pw
            pw_i_end = (pw_idx + 1) * pw
    
            for ph_idx in range(nb_ph):
                ph_i_start = ph_idx * ph
                ph_i_end = (ph_idx + 1) * ph
    
                # Patches are stored in row-major order (width index varies first).
                patch = patches[pw_idx + nb_pw * ph_idx]
    
                output[ph_i_start:ph_i_end, pw_i_start:pw_i_end] = patch
    
        return output
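
    One way to convince yourself that the indexing is right is a round trip on synthetic data, without touching the disk. The sketch below cuts an 8x6 tensor into 3x2 patches in row-major order and pastes them back. Both split_into_patches and paste_patches are hypothetical helpers written for this check; paste_patches is a condensed version of the pasting logic above:

```python
import torch


def split_into_patches(image: torch.Tensor, pw: int, ph: int) -> list[torch.Tensor]:
    """Cut a 2D (height, width) tensor into patches, in row-major order."""
    height, width = image.shape
    return [
        image[top : top + ph, left : left + pw]
        for top in range(0, height, ph)  # outer loop: rows
        for left in range(0, width, pw)  # inner loop: the width index varies first
    ]


def paste_patches(
    patches: list[torch.Tensor], pw: int, ph: int, nb_pw: int, nb_ph: int
) -> torch.Tensor:
    """Paste row-major patches into an empty (nb_ph * ph, nb_pw * pw) tensor."""
    output = torch.empty(nb_ph * ph, nb_pw * pw, dtype=patches[0].dtype)
    for ph_idx in range(nb_ph):
        for pw_idx in range(nb_pw):
            patch = patches[pw_idx + nb_pw * ph_idx]
            rows = slice(ph_idx * ph, (ph_idx + 1) * ph)
            cols = slice(pw_idx * pw, (pw_idx + 1) * pw)
            output[rows, cols] = patch
    return output


image = torch.arange(8 * 6).reshape(8, 6)  # 8 rows (height) x 6 columns (width)
patches = split_into_patches(image, pw=3, ph=2)
restored = paste_patches(patches, pw=3, ph=2, nb_pw=2, nb_ph=4)
print(len(patches), torch.equal(image, restored))  # 8 True
```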
    

    Finally, the main program:

    def main():
        # No trailing slash: read_patches already inserts "/" before the file name.
        patches = read_patches("patches/folder", NB_PW, NB_PH)
        full = reassemble_patches(patches, PW, PH, NB_PW, NB_PH)
        print(full)
    
    
    if __name__ == "__main__":
        main()
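
    Coming back to the two points of the question (overlapping windows and keeping track of dat_{z}), one possible direction is sketched below. It is only a sketch under several assumptions: the folder names from the question (Numpy_dat/dat_{z} and mask_numpy/mask_{z}), a regular nb_pw x nb_ph grid of 2D patches in every folder, and masks named mask_{pw_idx}_{ph_idx}.npy. The names WindowDataset, load_grid and sliding_windows and all their parameters are hypothetical. The overlapping windows come from Tensor.unfold, and the folder index z is returned with every item:

```python
import os

import numpy as np
import torch
from torch.utils.data import Dataset


def load_grid(folder: str, prefix: str, nb_pw: int, nb_ph: int) -> torch.Tensor:
    """Load `{prefix}_{pw_idx}_{ph_idx}.npy` patches and stitch them into one 2D tensor."""
    rows = []
    for ph_idx in range(nb_ph):
        row = [
            torch.from_numpy(
                np.load(os.path.join(folder, f"{prefix}_{pw_idx}_{ph_idx}.npy"))
            )
            for pw_idx in range(nb_pw)
        ]
        rows.append(torch.cat(row, dim=1))  # concatenate one row along the width
    return torch.cat(rows, dim=0)  # stack the rows along the height


def sliding_windows(image: torch.Tensor, window: int, stride: int) -> torch.Tensor:
    """Return square windows of a 2D tensor, shape (n_windows, window, window).

    Windows overlap whenever stride < window.
    """
    tiles = image.unfold(0, window, stride).unfold(1, window, stride)
    return tiles.reshape(-1, window, window)


class WindowDataset(Dataset):
    """Hypothetical dataset: one item is (image window, mask window, folder index z)."""

    def __init__(self, root, nb_folders, nb_pw, nb_ph, window, stride):
        self.items = []
        for z in range(nb_folders):
            image = load_grid(
                os.path.join(root, "Numpy_dat", f"dat_{z}"), "dat", nb_pw, nb_ph
            )
            mask = load_grid(
                os.path.join(root, "mask_numpy", f"mask_{z}"), "mask", nb_pw, nb_ph
            )
            image_windows = sliding_windows(image, window, stride)
            mask_windows = sliding_windows(mask, window, stride)
            # Image and mask windows are cut identically, so they stay aligned.
            for image_window, mask_window in zip(image_windows, mask_windows):
                self.items.append((image_window, mask_window, z))

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        return self.items[index]
```

    With the files on disk, this could be wrapped as DataLoader(WindowDataset("All_data", nb_folders=2, nb_pw=2, nb_ph=4, window=4, stride=2), batch_size=8), where every batch would hold the image windows, the matching mask windows and the folder indices. Everything is preloaded in memory here for simplicity, which may not scale to a large dataset. Because load_grid concatenates instead of pasting at fixed offsets, it does not need to know the patch size, but it still assumes that the patches in a row share the same height and that all rows have the same total width.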