Search code examples
pythonindexingpython-xarraygeotiff

Unexpected behavior when slicing xarray DataArray over the y dimension


I'm new to xarray and I'm trying to understand how sel() works.

I have a DataArray tif_xr that comes from opening a 10 band GeoTIff file with rioxarray.

tif_xr = rioxarray.open_rasterio('path/to/my/file.tif', chunks={'x':100, 'y':100})

I'm able to slice the array along the x dimension, which gives me the results that I expect.

tif_xr.sel(x=slice(500505,500520))

Gives me:

<xarray.DataArray (band: 10, y: 10900, x: 2)> Size: 436kB
dask.array<getitem, shape=(10, 10900, 2), dtype=uint16, chunksize=(10, 100, 2), chunktype=numpy.ndarray>
Coordinates:
  * band         (band) int64 80B 1 2 3 4 5 6 7 8 9 10
  * x            (x) float64 16B 5.005e+05 5.005e+05
  * y            (y) float64 87kB 4.5e+06 4.5e+06 ... 4.391e+06 4.391e+06
    spatial_ref  int64 8B 0
Attributes:
    AREA_OR_POINT:  Area
    _FillValue:     65535
    scale_factor:   1.0
    add_offset:     0.0

However, when doing the same slicing logic but this time for the y dimension, it doesn't give me any values.

tif_xr.sel(y=slice(4444550,4444560))

Which gives me:

<xarray.DataArray (band: 10, y: 0, x: 10924)> Size: 0B
dask.array<getitem, shape=(10, 0, 10924), dtype=uint16, chunksize=(10, 0, 100), chunktype=numpy.ndarray>
Coordinates:
  * band         (band) int64 80B 1 2 3 4 5 6 7 8 9 10
  * x            (x) float64 87kB 5.005e+05 5.005e+05 ... 6.097e+05 6.097e+05
  * y            (y) float64 0B 
    spatial_ref  int64 8B 0
Attributes:
    AREA_OR_POINT:  Area
    _FillValue:     65535
    scale_factor:   1.0
    add_offset:     0.0

It is weird because when I use sel() with equals instead of slice (tif_xr.sel(y=4444555)) it seems to work:

<xarray.DataArray (band: 10, x: 10924)> Size: 218kB
dask.array<getitem, shape=(10, 10924), dtype=uint16, chunksize=(10, 100), chunktype=numpy.ndarray>
Coordinates:
  * band         (band) int64 80B 1 2 3 4 5 6 7 8 9 10
  * x            (x) float64 87kB 5.005e+05 5.005e+05 ... 6.097e+05 6.097e+05
    y            float64 8B 4.445e+06
    spatial_ref  int64 8B 0
Attributes:
    AREA_OR_POINT:  Area
    _FillValue:     65535
    scale_factor:   1.0
    add_offset:     0.0

I tried different slice ranges, with no success.

My GeoTiff is in a projected crs, so the x and y values that I used in this example are actual coordinates in meters.

What could be happening here? Am I missing something?

Thanks.


Solution

  • Take a close look at the y-coordinate of your dataset: The values are actually decreasing. Consequently, when you slice it, you have to provide the upper bound before the lower bound:

    tif_xr.sel(y=slice(4444560, 4444550))
    

    should give the expected result. Alternatively, you can also specify a negative stride:

    tif_xr.sel(y=slice(4444550, 4444560, -1))
    

    should also work.

    My suggestion would be to invert the y-axis directly after loading the dataset to avoid this problem completely:

    # Invert the y-axis to have increasing coordinates
    tif_xr = tif_xr.isel(y=slice(None, None, -1))
    

    Then, all your sel()-commands should work as expected.