
Extracting patches from 3D image in Python


I'm trying to extract patches from a 3D image as training data for a neural network, but I'm having trouble reshaping the patches for larger images. I'm currently using view_as_windows, but I'm open to other methods if they prove more useful.

An example of what my code would look like:

import numpy as np
from skimage.util import view_as_windows

kernel_size = 17
V = np.random.rand(150,150,150)
X = view_as_windows(V,(kernel_size,kernel_size,kernel_size),step=1)

This creates a NumPy array of shape (134, 134, 134, 17, 17, 17). Ideally I would like to reshape it to (2406104, 4913), but the reshape fails with an allocation error:

X = X.reshape(134**3,17**3)
MemoryError: Unable to allocate 88.1 GiB for an array with shape (134, 134, 134, 17, 17, 17) and data type float64

Is there a way to reshape my patches or is there a better general way to go about this?


Solution

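  • The problem is that the reshaped array cannot be created without copying the data. view_as_windows returns a strided view in which overlapping patches share the memory of V, and a 2D array with one row per patch cannot be expressed as a view of that buffer, so NumPy has to materialize all 134**3 * 17**3 float64 values (the 88.1 GiB in the error message). The naive option is to chunk or batch your data. Roughly (ignoring edge and overlap effects):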

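    import itertools

    # Batch over window positions, i.e. the first three axes of X (134 each
    # here). Using V.shape (150) would overshoot and truncate the last batch.
    xsize, ysize, zsize = X.shape[:3]
    xbatch, ybatch, zbatch = (34, 34, 34)
    batch_size = xbatch * ybatch * zbatch
    for i, j, k in itertools.product(
        range(xsize // xbatch), range(ysize // ybatch), range(zsize // zbatch)
    ):
        Xbatch = X[i * xbatch : (i+1) * xbatch,
                   j * ybatch : (j+1) * ybatch,
                   k * zbatch : (k+1) * zbatch]
        # This reshape still copies, but only one batch (about 1.4 GiB here)
        Xbatch_linear = Xbatch.reshape((batch_size, -1))
        # ... do your deep learning on the batch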
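
If the edge positions matter, the batching idea can be wrapped in a generator that also yields the smaller leftover blocks. This is only a sketch under some assumptions: iter_patch_batches and its batch parameter are hypothetical names, and NumPy's sliding_window_view (NumPy 1.20+) stands in for view_as_windows with step=1 so that the example needs nothing beyond NumPy.

```python
import itertools

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def iter_patch_batches(volume, kernel_size, batch):
    """Yield 2D arrays of flattened patches, one block of positions at a time.

    Hypothetical helper: `batch` is the block edge length in window positions.
    """
    # A strided view over all patches; nothing is copied at this point.
    windows = sliding_window_view(volume, (kernel_size,) * 3)
    nx, ny, nz = windows.shape[:3]
    for i, j, k in itertools.product(
        range(0, nx, batch), range(0, ny, batch), range(0, nz, batch)
    ):
        # Slicing past the end is clipped, so edge blocks are simply smaller.
        block = windows[i:i + batch, j:j + batch, k:k + batch]
        # The reshape copies, but only the patches of this block.
        yield block.reshape(-1, kernel_size ** 3)


# Small demo volume so the example runs quickly.
V = np.random.rand(30, 30, 30)
total = sum(b.shape[0] for b in iter_patch_batches(V, kernel_size=17, batch=8))
```

Each yielded array has one row per patch and 17**3 columns, and the rows across all batches add up to every window position, including those at the edges.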
    

    The longer answer is that what you are doing (sliding over every 17x17x17 patch) is the operation known as a convolution, and a convolutional neural network performs it for you without creating expensive copies of the data. Using view_as_windows in this way is a neat little trick, and it is useful to understand its equivalence to convolution, but it is not the right tool for this job. Instead, use the 3D convolutional layers in your deep learning library of choice.
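
To make the equivalence concrete, here is a minimal NumPy-only sketch on a tiny volume. It assumes a single filter with hypothetical random weights w, and it uses sliding_window_view in place of view_as_windows. The "convolution" computed here is the unflipped sliding dot product (cross-correlation), which is what deep learning libraries usually mean by the term.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
k = 3                          # small kernel so the demo stays cheap
V = rng.random((8, 8, 8))
w = rng.random((k, k, k))      # hypothetical filter weights
n = V.shape[0] - k + 1         # number of window positions per axis

# Patch route: materialize every flattened patch (this reshape copies),
# then apply the filter as one matrix-vector product.
patches = sliding_window_view(V, (k, k, k)).reshape(-1, k ** 3)
out_patches = (patches @ w.ravel()).reshape(n, n, n)

# Convolution route: accumulate weighted, shifted slices of V.
# No patch matrix is ever built.
out_conv = np.zeros((n, n, n))
for a in range(k):
    for b in range(k):
        for c in range(k):
            out_conv += w[a, b, c] * V[a:a + n, b:b + n, c:c + n]

same = np.allclose(out_patches, out_conv)
```

Both routes give the same output volume, but only the first one needs memory proportional to the number of patches times the patch size.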