I would like to add thousands of 4D arrays element-wise while accounting for NaNs. A simple example using 1D arrays would be:
X = array([4,7,89,nan,89,65, nan])
Y = array([0,5,4, 9, 8, 100,nan])
z = X+Y
# desired result: z = array([4,12,93,9,97,165,nan])
I've written a simple for loop around this, but it takes forever, so it is not a smart solution. Another option would be to build one larger array and use bottleneck's nansum, but that would take too much memory on my laptop. I need a running sum over 11000 cases.
Does anyone have a smart and fast way to do this?
Here is one possibility:
>>> x = np.array([1, 2, np.nan, 3, np.nan, 4])
>>> y = np.array([1, np.nan, 2, 5, np.nan, 8])
>>> x = np.ma.masked_array(np.nan_to_num(x), mask=np.isnan(x) & np.isnan(y))
>>> y = np.ma.masked_array(np.nan_to_num(y), mask=x.mask)
>>> (x+y).filled(np.nan)
array([ 2., 2., 2., 8., nan, 12.])
The real difficulty is that you seem to want nan to be interpreted as zero unless all values at a particular position are nan. This means that you must look at both x and y to determine which nans to replace. If you are okay with having all nan values replaced, then you can simply do np.nan_to_num(x) + np.nan_to_num(y).
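For the running sum over many arrays mentioned in the question, the same idea can probably be applied one array at a time, so that only the accumulator and a boolean mask stay in memory. The following is a minimal sketch, not a tested solution for the 4D case: the function name running_nansum is hypothetical, and it assumes all arrays share the same shape and can be produced lazily (for example, loaded from disk inside a generator).

```python
import numpy as np

def running_nansum(arrays):
    """Sum arrays element-wise, treating nan as 0 unless every input
    is nan at that position (hypothetical helper, for illustration)."""
    total = None    # running sum with nans counted as 0
    all_nan = None  # True where every array seen so far was nan
    for a in arrays:
        a = np.asarray(a, dtype=float)
        nan_here = np.isnan(a)
        if total is None:
            total = np.where(nan_here, 0.0, a)
            all_nan = nan_here
        else:
            total += np.where(nan_here, 0.0, a)
            all_nan &= nan_here
    if total is None:
        raise ValueError("no arrays given")
    # Restore nan only where no array contributed a value.
    total[all_nan] = np.nan
    return total

X = np.array([4, 7, 89, np.nan, 89, 65, np.nan])
Y = np.array([0, 5, 4, 9, 8, 100, np.nan])
z = running_nansum([X, Y])  # values: 4, 12, 93, 9, 97, 165, nan
```

Because arrays can be any iterable, passing a generator that loads one array per step keeps peak memory at roughly one input array plus the accumulator and the mask.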