I have ocean and atmospheric datasets in a netCDF file. The ocean data contains nan (or some other fill value, such as -999) over land areas. For this example, say it is nan. Sample data looks like this:
import numpy as np
ocean = np.array([[2, 4, 5], [6, np.nan, 2], [9, 3, np.nan]])
atmos = np.array([[4, 2, 5], [6, 7, 3], [8, 3, 2]])
Now I want to apply multiple conditions to the ocean and atmos data to build a new array that contains only values from 1 to 8. For example, in the ocean data, values between 2 and 4 are assigned 1 and values between 4 and 6 are assigned 2. The same comparison applies to the atmos dataset as well.
To simplify the comparison and assignment, I made a list of bin edges for each dataset and used np.digitize to assign the categories.
bin1 = [2, 4, 6]
bin2 = [4, 6, 8]
ocean_cat = np.digitize(ocean, bin1)
atmos_cat = np.digitize(atmos, bin2)
which produces the following results (ocean_cat first, then atmos_cat):
[[1 2 2]
 [3 3 1]
 [3 1 3]]

[[1 0 1]
 [2 2 0]
 [3 0 0]]
Next I want the element-wise maximum of the two arrays above, so I used np.fmax:
final_cat = np.fmax(ocean_cat, atmos_cat)
print(final_cat)
which produces the result below:
[[1 2 2]
 [3 3 1]
 [3 1 3]]
The result above is almost what I need. The only problem is that the nan values are missing. What I want in the final result is:
[[1 2 2]
 [3 nan 1]
 [3 1 nan]]
Can someone help me replace the values with nan at the same indices as in the original ocean array?
A simple option is to mask the output with numpy.where:
bin1 = [2, 4, 6]
bin2 = [4, 6, 8]
ocean_cat = np.digitize(ocean, bin1)
atmos_cat = np.digitize(atmos, bin2)
final_cat = np.where(np.isnan(ocean), np.nan,
                     np.fmax(ocean_cat, atmos_cat))
If both arrays can have NaNs:
final_cat = np.where(np.isnan(ocean) | np.isnan(atmos),
                     np.nan,
                     np.fmax(ocean_cat, atmos_cat))
Or use np.isnan(ocean) & np.isnan(atmos) if you only want a NaN when both inputs are NaN.
Output:
array([[ 1., 2., 2.],
[ 3., nan, 1.],
[ 3., 1., nan]])
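As for why the NaNs vanish in the first place, here is a minimal sketch using the sample data from the question. np.digitize returns integer bin indices, which cannot hold NaN, and a NaN input is sorted past the last bin edge, so it gets index len(bins) (3 here) and looks just like a real value in the top category. Passing np.nan to np.where promotes the result to float, which is why the output above shows 1. instead of 1. The variable names is_int, nan_bin and masked are only for illustration.

```python
import numpy as np

ocean = np.array([[2, 4, 5], [6, np.nan, 2], [9, 3, np.nan]])
bin1 = [2, 4, 6]

ocean_cat = np.digitize(ocean, bin1)

# digitize returns integer indices, so the result has no way to store NaN
is_int = np.issubdtype(ocean_cat.dtype, np.integer)

# NaN inputs end up in the last bin, index len(bin1)
nan_bin = ocean_cat[np.isnan(ocean)]

# Masking with np.nan promotes the integer categories to float
masked = np.where(np.isnan(ocean), np.nan, ocean_cat)

print(is_int, nan_bin, masked.dtype)
```

This is also why the mask has to be built from the original ocean array and not from ocean_cat: after digitizing, the land cells can no longer be told apart from genuine values of 6 or more.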
Generic approach for any number of input arrays:
arrays = [ocean, atmos]
bins = [bin1, bin2]
out = np.where(np.logical_or.reduce([np.isnan(a) for a in arrays]),
               np.nan,
               np.fmax.reduce([np.digitize(a, b) for a, b in zip(arrays, bins)])
               )
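The question mentions that land cells may hold a fill value such as -999 instead of NaN. One possible way to cover that case, as a sketch only, is to convert the fill value to NaN first and then reuse the generic approach. The helper name categorize_max and its fill_value parameter are hypothetical, not part of NumPy, and the sample ocean array below is the question's data with NaN swapped for -999.

```python
import numpy as np

def categorize_max(arrays, bins, fill_value=None):
    """Hypothetical helper: digitize each array with its own bins and take
    the element-wise maximum, keeping NaN wherever any input is missing."""
    arrays = [np.asarray(a, dtype=float) for a in arrays]
    if fill_value is not None:
        # Treat the sentinel (e.g. -999) as missing by converting it to NaN
        arrays = [np.where(a == fill_value, np.nan, a) for a in arrays]
    # True wherever at least one input is missing
    missing = np.logical_or.reduce([np.isnan(a) for a in arrays])
    # Element-wise maximum of the per-array category indices
    cats = np.fmax.reduce([np.digitize(a, b) for a, b in zip(arrays, bins)])
    return np.where(missing, np.nan, cats)

ocean = np.array([[2, 4, 5], [6, -999, 2], [9, 3, -999]])
atmos = np.array([[4, 2, 5], [6, 7, 3], [8, 3, 2]])

result = categorize_max([ocean, atmos], [[2, 4, 6], [4, 6, 8]], fill_value=-999)
print(result)
```

If the data is read with a library that already returns masked arrays for the fill value, that mask could be used directly in place of the np.isnan test.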