Suppose we open a NetCDF file and get a DataArray da like this:
<xarray.DataArray (x: 2, y: 3)>
array([[0.50793919, 0.49505336, 0.19573345],
[0.7830897 , 0.82954952, 0.19427877]])
Coordinates:
* x (x) int64 0 1
* y (y) int64 0 1 2
Our target DataArray, da_new, looks like this:
<xarray.DataArray (x: 4, y: 3)>
array([[0.50793919, 0.49505336, 0.19573345],
[0.7830897 , 0.82954952, 0.19427877],
[ nan, nan, nan],
[ nan, nan, nan]])
Coordinates:
* x (x) int64 0 1 2 3
* y (y) int64 0 1 2
To get there, we can construct a new DataArray filled with NaN and then copy the data from da into it, something like:
da_new = xr.DataArray(
    data=np.full([4, 3], fill_value=np.nan),
    dims=['x', 'y'],
    coords=dict(
        x=range(4),
        y=range(3)
    )
)
da_new.loc[0:1, :] = da
However, I find this approach tedious, especially when the DataArray has many dimensions.
Is there a simpler, more explicit way to do this? Many thanks.
The sequence of steps is sound (create a new placeholder array, then copy the data into it), but we can reuse the dimensions and coordinates of the original array, so that only the coordinates that actually change need to be spelled out.
import numpy as np
import xarray as xr
# Toy data.
ar = np.array([
    [0.50793919, 0.49505336, 0.19573345],
    [0.7830897 , 0.82954952, 0.19427877]])
da = xr.DataArray(
    data=ar,
    dims=['x', 'y'],
    coords=dict(
        x=range(2),
        y=range(3)
    ))
da
# array([[0.50793919, 0.49505336, 0.19573345],
# [0.7830897 , 0.82954952, 0.19427877]])
# Coordinates:
# x (x) int64 0 1
# y (y) int64 0 1 2
# Map some dimensions to new coordinates of any length.
new_coords = dict(x=range(4))
# Make empty placeholder array, replacing some coordinates with new ones.
da_bigger = xr.DataArray(
    dims=da.dims,
    coords=dict(da.coords, **new_coords))
da_bigger
# array([[nan, nan, nan],
# [nan, nan, nan],
# [nan, nan, nan],
# [nan, nan, nan]])
# Coordinates:
# x (x) int64 0 1 2 3
# y (y) int64 0 1 2
# Copy data from original into corresponding coordinates of bigger array.
da_bigger.loc[{k: da[k] for k in new_coords}] = da
da_bigger
# array([[0.50793919, 0.49505336, 0.19573345],
# [0.7830897 , 0.82954952, 0.19427877],
# [ nan, nan, nan],
# [ nan, nan, nan]])
# Coordinates:
# x (x) int64 0 1 2 3
# y (y) int64 0 1 2
Assumption: The new coordinates are a superset of the original.
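The placeholder-and-copy steps above can be wrapped in a small function, so they need not be repeated for arrays with many dimensions. The following is a minimal sketch, not part of xarray: the name expand_coords is hypothetical, and it checks the superset assumption explicitly before copying.

```python
import numpy as np
import xarray as xr


def expand_coords(da, **new_coords):
    """Return a NaN-padded copy of `da` with some coordinates replaced.

    Hypothetical helper, not part of xarray. Each keyword names an existing
    dimension of `da` and gives its new, larger set of coordinate labels.
    """
    # Enforce the stated assumption: new labels must include the old ones.
    for dim, values in new_coords.items():
        missing = set(da[dim].values.tolist()) - set(values)
        if missing:
            raise ValueError(f"new {dim!r} coordinates lack {sorted(missing)}")

    # Placeholder filled with NaN; untouched dimensions keep their coordinates.
    bigger = xr.DataArray(dims=da.dims, coords=dict(da.coords, **new_coords))

    # Copy the original values into the matching labels.
    bigger.loc[{dim: da[dim] for dim in new_coords}] = da
    return bigger


# Toy data: grow both dimensions at once.
da = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    dims=["x", "y"],
    coords=dict(x=range(2), y=range(3)),
)
da_bigger = expand_coords(da, x=range(4), y=range(5))
print(da_bigger.shape)  # (4, 5)
```

Dimensions that are not named in the call keep their original coordinates, so only the axes that grow have to be listed.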
xr.align
We can get xr.align to do the heavy lifting for the copy operation. In this example, the empty placeholder array is passed directly into align without being stored in a named variable.
da_big2, _ = xr.align(
    da,
    xr.DataArray(
        dims=da.dims,
        coords=dict(da.coords, **new_coords)),
    join="outer")
da_big2
# array([[0.50793919, 0.49505336, 0.19573345],
# [0.7830897 , 0.82954952, 0.19427877],
# [ nan, nan, nan],
# [ nan, nan, nan]])
# Coordinates:
# x (x) int64 0 1 2 3
# y (y) int64 0 1 2
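As a cross-check on the align result, the sketch below compares it with DataArray.reindex, which conforms an array to a given set of labels and fills labels missing from the original with NaN by default. reindex is not used in the answer above; it is shown here only as a comparison, and the variable names da_big3 and placeholder are illustrative.

```python
import numpy as np
import xarray as xr

# Same toy data as above.
da = xr.DataArray(
    np.array([[0.50793919, 0.49505336, 0.19573345],
              [0.7830897, 0.82954952, 0.19427877]]),
    dims=["x", "y"],
    coords=dict(x=range(2), y=range(3)),
)

# reindex: labels 2 and 3 do not exist in da, so their rows become NaN.
da_big3 = da.reindex(x=np.arange(4))

# The align approach from above, for comparison.
placeholder = xr.DataArray(dims=da.dims, coords=dict(da.coords, x=range(4)))
da_big2, _ = xr.align(da, placeholder, join="outer")

# Compare values (NaN positions included) and the x labels.
np.testing.assert_array_equal(da_big3.values, da_big2.values)
np.testing.assert_array_equal(da_big3["x"].values, da_big2["x"].values)
print(da_big3)
```

Both routes rely on label matching rather than positions, so the original rows land at x = 0 and x = 1 regardless of where those labels sit in the new coordinate.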