
Handling idxmax in xarray when no value satisfies the condition


I have an xarray DataArray, defined as follows (see this link):

<xarray.DataArray (y: 3, x: 5)>
array([[ 2.,  1.,  2.,  0., -2.],
       [-4., nan,  2., nan, -2.],
       [nan, nan,  1., nan, nan]])
Coordinates:
  * y        (y) int64 -1 0 1
  * x        (x) int64 0 1 4 9 16

If I want to find the first value larger than 0 along the x dimension, I can use a command like (array>0).idxmax(dim='x'), which returns

<xarray.DataArray 'x' (y: 3)>
array([0, 4, 0])
Coordinates:
  * y        (y) int64 -1 0 1

When I change the threshold from 0 to 4, i.e., (array>4).idxmax(dim='x'), I expect the result to be [np.nan, np.nan, np.nan], because no value in the array is larger than 4. However, it returns [0, 0, 0].

Is there any way to get [np.nan, np.nan, np.nan] rather than [0, 0, 0] in this case?

Thanks in advance.


Solution

  • You can try using .where to replace the values that fail the condition with NaN before calling idxmax:

    import numpy as np
    import xarray as xr
    
    array = xr.DataArray(
        [
            [2.0, 1.0, 2.0, 0.0, -2.0],
            [-4.0, np.nan, 2.0, np.nan, -2.0],
            [np.nan, np.nan, 1.0, np.nan, np.nan],
        ],
        dims=["y", "x"],
        coords={"y": [-1, 0, 1], "x": np.arange(5.0) ** 2},
    )
    
    # Values that fail the condition become NaN; idxmax skips NaN and
    # returns NaN for a row in which every value is NaN.
    print(array.where(array > 0, np.nan).idxmax(dim="x"))
    print(array.where(array > 4, np.nan).idxmax(dim="x"))
    

    Prints:

    <xarray.DataArray 'x' (y: 3)>
    array([0., 4., 4.])
    Coordinates:
      * y        (y) int64 -1 0 1
    
    <xarray.DataArray 'x' (y: 3)>
    array([nan, nan, nan])
    Coordinates:
      * y        (y) int64 -1 0 1
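
One caveat worth noting: after masking, idxmax returns the x coordinate of the largest remaining value, which happens to coincide with the first value above the threshold for this particular data. If the first occurrence is what you need, a possible alternative is to keep the boolean idxmax from the question and blank out the rows where the condition is never met. The sketch below assumes the same array as above; first_x_above is a hypothetical helper name, not part of xarray.

```python
import numpy as np
import xarray as xr

array = xr.DataArray(
    [
        [2.0, 1.0, 2.0, 0.0, -2.0],
        [-4.0, np.nan, 2.0, np.nan, -2.0],
        [np.nan, np.nan, 1.0, np.nan, np.nan],
    ],
    dims=["y", "x"],
    coords={"y": [-1, 0, 1], "x": np.arange(5.0) ** 2},
)


def first_x_above(da, threshold, dim="x"):
    """Coordinate of the first value along `dim` above `threshold`, NaN if there is none.

    Hypothetical helper for illustration only.
    """
    cond = da > threshold
    # On a boolean array, idxmax gives the coordinate of the first True.
    # For an all-False row it falls back to the first coordinate, which is
    # where the unwanted zeros in the question come from.
    first = cond.idxmax(dim=dim)
    # Keep the result only for rows that contain at least one True.
    return first.where(cond.any(dim=dim))


print(first_x_above(array, 0).values)  # [0. 4. 4.]
print(first_x_above(array, 4).values)  # [nan nan nan]
```

The two approaches differ on a row such as [1, 5, 2] with threshold 0: masking with .where and then calling idxmax points at the 5, while the helper above points at the 1.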