Context: I have a 3D numpy array arr of shape (3, 10, 10). I have a list, grouped_indices, gathering indices of the 0th dimension of arr. I want to calculate the sum of these grouped indices of arr, and store them in a host array, host_arr.
Problem: I am using np.nansum(), however a sum of two NaNs gives me a 0 and I would like it to return a NaN. I don't want to set all the zeros to NaNs once I have calculated the sum.
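A minimal illustration of this behavior, as a sketch (the variable names here are only for the example and are not part of the data below):

```python
import numpy as np

# np.nansum treats NaN as zero, so an all-NaN input sums to 0.0
all_nan_sum = np.nansum(np.array([np.nan, np.nan]))

# With at least one valid value, the NaN is simply skipped
mixed_sum = np.nansum(np.array([np.nan, 1.0]))

print(all_nan_sum)  # 0.0, where NaN is wanted
print(mixed_sum)    # 1.0, which is fine
```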
Question: How can I calculate the nansum of n 2D arrays (all of the same shape), but set to NaN any cell for which all the arrays have a NaN?
Example:
import numpy as np
import matplotlib.pyplot as plt
# Generate example data
np.random.seed(0)
arr_shape = (10, 10)
num_arrays = 3
# Create a 3D numpy array with random values
arr = np.random.rand(num_arrays, *arr_shape)
# Introduce NaNs
arr[0, :5, :5] = np.nan
arr[1, 2:7, 2:7] = np.nan
arr[2] = np.nan
arr[2, :2, :2] = 10
# Generate a list of arrays containing indices of the 0th dimension of arr
grouped_indices = [np.array([0,1]), np.array([0,1,2])]
# Create a host array that is the sum of grouped_indices slices
host_arr = np.array([np.nansum(arr[indices], axis=0) for indices in grouped_indices])
# Plot the nansums
plt.figure()
plt.imshow(host_arr[0]) # indices [2:5, 2:5] should be NaNs
plt.colorbar()
plt.figure()
plt.imshow(host_arr[1]) # indices [2:5, 2:5] should be NaNs too
plt.colorbar()
IIUC, you can use np.all + np.isnan to detect where you have all NaNs and explicitly set these values to NaN after np.nansum:
host_arr = np.array([np.nansum(arr[indices], axis=0) for indices in grouped_indices])
host_arr[0][np.all(np.isnan(arr[grouped_indices[0]]), axis=0)] = np.nan
host_arr[1][np.all(np.isnan(arr[grouped_indices[1]]), axis=0)] = np.nan
Then the result is:
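To check this numerically rather than visually, here is a minimal sketch that wraps the same masking in a helper and applies it to every group in a loop. The helper name nansum_keep_all_nan is hypothetical (it is not a NumPy function); the data is the same as in the question.

```python
import numpy as np

def nansum_keep_all_nan(stack, axis=0):
    """nansum along `axis`, but NaN wherever every input along `axis` is NaN.

    Hypothetical helper for illustration; not part of NumPy.
    """
    total = np.nansum(stack, axis=axis)
    all_nan = np.isnan(stack).all(axis=axis)
    # Keep the nansum everywhere except the all-NaN cells
    return np.where(all_nan, np.nan, total)

# Same example data as in the question
np.random.seed(0)
arr = np.random.rand(3, 10, 10)
arr[0, :5, :5] = np.nan
arr[1, 2:7, 2:7] = np.nan
arr[2] = np.nan
arr[2, :2, :2] = 10

grouped_indices = [np.array([0, 1]), np.array([0, 1, 2])]
host_arr = np.array([nansum_keep_all_nan(arr[idx]) for idx in grouped_indices])

# The overlap [2:5, 2:5] is NaN in every slice of both groups
print(np.isnan(host_arr[0, 2:5, 2:5]).all())  # True
print(np.isnan(host_arr[1, 2:5, 2:5]).all())  # True
# Number of NaN cells per group
print(np.isnan(host_arr).sum(axis=(1, 2)))    # [9 9]
```

Using np.where builds the masked result in one step instead of assigning into host_arr afterwards, and it scales to any number of groups without repeating a line per group.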