Tags: python, indexing, nan, numpy-ndarray, numpy-slicing

Removing NaNs from a 3D array without reshaping my data


I have a 3D array of shape (121, 512, 1024): a stack of 121 frames, each a 512x1024 image. The bottom several rows of each image contain NaNs, which break my processing. I want to remove those rows and end up with something like (121, 495, 1024).

I have been trying variations of

arr[:, ~np.isnan(arr).any(0)]

but I end up with an array of shape (121, 508252).

I also tried arr[np.argwhere(~np.isnan(arr))]

but I got a MemoryError.

This seems like a simple and common task, but all the examples I have been able to find are for 2D arrays. Any help would be appreciated.

Edit: Alternatively, if I do reshape, I need the new shape to be detected automatically. Depending on the transformations applied to the images, trimming the NaNs could give different shapes, such as (121, 495, 1000). Within one image stack (for example, the 121 frames) all images have identical shapes, so the trimmed result will still be a valid rectangular array.

The problem is that I cannot predict where the NaNs are. They may be at the top, bottom, or sides of the image (always along the edges), and their boundary is not necessarily a straight line. I basically have to cut further into the image until I get a clean row or column; I just need to figure out where the new edges should be.

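import numpy as np

# Example stack: the NaN region starts at a different row on either side of column 675
arr = np.ones([121, 512, 1024])
arr[:, 497:512, 0:675] = np.nan
arr[:, 496:512, 676:1024] = np.nan

# try 3
trim = np.argwhere(~np.isnan(arr))  # indices of every non-NaN element
rows = np.unique(trim[:, 1])
cols = np.unique(trim[:, 2])
result_arr = arr[:, ~np.isnan(arr).all(0)]
result_arr.shape = -1, len(rows), len(cols)  # raises: element count does not fit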

This does not work: the NaN region starts at a different row on either side of column 675, so the number of remaining cells does not fit the dimensions I compute.


Solution

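  • Since you want to filter along the second dimension, reduce over the first and last dimensions with any. This gives a 1D mask over the rows, whereas the 2D mask in your attempt selects individual elements and flattens the two masked axes: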

    out = arr[:, ~np.isnan(arr).any((0, 2))]
    
    out.shape # (121, 496, 1024)
    

    Alternatively:

    out = arr[:, ~np.isnan(arr).any(0).any(-1)]
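
The masks above only trim rows. For the case raised in the edit, where NaNs may sit along any edge, one possible approach is a greedy trim: repeatedly drop whichever border line (top, bottom, left, or right) has the largest share of NaNs until none remain. This is only a sketch of a heuristic, not part of the approach above, and it is not guaranteed to keep the largest possible NaN-free rectangle. The function name trim_nan_edges is made up for illustration, and the demo uses 3 frames instead of 121 purely to keep memory use low.

```python
import numpy as np


def trim_nan_edges(stack):
    """Greedily crop axes 1 and 2 of a 3D stack until no NaN remains.

    Hypothetical helper: a heuristic sketch, not an optimal crop.
    """
    # 2D map of pixels that are NaN in at least one frame
    bad = np.isnan(stack).any(axis=0)
    top, bottom = 0, bad.shape[0]
    left, right = 0, bad.shape[1]

    while top < bottom and left < right:
        window = bad[top:bottom, left:right]
        if not window.any():
            break  # the current window is NaN-free
        # Fraction of NaN pixels along each of the four border lines
        scores = [
            window[0].mean(),      # top row
            window[-1].mean(),     # bottom row
            window[:, 0].mean(),   # left column
            window[:, -1].mean(),  # right column
        ]
        worst = int(np.argmax(scores))
        if worst == 0:
            top += 1
        elif worst == 1:
            bottom -= 1
        elif worst == 2:
            left += 1
        else:
            right -= 1

    # Plain slicing keeps the array 3D and returns a view, not a copy
    return stack[:, top:bottom, left:right]


# Same NaN layout as in the question, with fewer frames (assumption)
demo = np.ones((3, 512, 1024))
demo[:, 497:512, 0:675] = np.nan
demo[:, 496:512, 676:1024] = np.nan

trimmed = trim_nan_edges(demo)
print(trimmed.shape)
```

On the question's example layout this removes the same bottom rows as the mask-based version. Because the result is a slice, the new shape falls out of the crop bounds and no reshape is needed.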