I have data with the following xarray Dataset representation:
df = pd.DataFrame({
'var': np.random.rand(16),
'country': 4*['UK'] + 4*['US'] + 4*['FR'] + 4*['DE'],
'time_delta': 4*list(pd.timedelta_range(
start='30D',
periods=4,
freq='30D'
)),
})
ds = df.set_index(['country','time_delta']).to_xarray()
I want to add a new value of the variable at a new coordinate along a given dimension, while keeping the existing dimensions.
Concretely: set var to 0 at the coordinate '0D' of the dimension time_delta while preserving the other existing dimensions (in this case country).
In pandas I can do this via:
# 1. Pivot long to wide
df_wide = df.pivot(
index='country',
columns='time_delta'
).droplevel(0,axis=1)
# 2. Set value
df_wide['0D'] = 0
# 3. Melt wide to long
df_new = df_wide.melt(ignore_index=False).rename(
columns={'value': 'var'}
).reset_index()
ds_new = df_new.set_index(['country','time_delta']).to_xarray()
Is there a general way to do such operations in xarray, going directly from ds to ds_new?
Edit: The following fails with KeyError: "not all values found in index 'time_delta'":
ds['var'].loc[{'time_delta': '0D'}] = 0
In xarray, you can get the same result directly with xr.concat. Here is the code for the operation you described:
import pandas as pd
import numpy as np
import xarray as xr
# Original dataset creation
df = pd.DataFrame({
'var': np.random.rand(16),
'country': 4*['UK'] + 4*['US'] + 4*['FR'] + 4*['DE'],
'time_delta': 4*list(pd.timedelta_range(
start='30D',
periods=4,
freq='30D'
)),
})
ds = df.set_index(['country', 'time_delta']).to_xarray()
# Build a zero-filled slice at the new 'time_delta' coordinate.
# isel with a list keeps the dimension, so every other dimension
# (here 'country') is preserved.
new_time_delta = pd.to_timedelta(['0D'])
new_slice = xr.zeros_like(ds.isel(time_delta=[0])).assign_coords(
    time_delta=new_time_delta
)
# Concatenate along 'time_delta'; this returns a new Dataset and
# leaves ds unchanged
ds_new = xr.concat([new_slice, ds], dim='time_delta')
# Display the new dataset
print(ds_new)
*This is an edited version of the original answer!
This code uses xr.concat to add a zero-valued entry at the new coordinate ('0D') along the 'time_delta' dimension. The .loc assignment from the question fails because label-based assignment can only write to labels that already exist in the index. The result is stored in ds_new, and the original ds dataset is left unchanged.
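An alternative that may be closer to the "general way" asked for is to extend the index with reindex and a fill value. The sketch below is a minimal example under the assumption that the new value is the same for every variable and every country; the name new_index is introduced here only for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Same dataset as in the question
df = pd.DataFrame({
    'var': np.random.rand(16),
    'country': 4*['UK'] + 4*['US'] + 4*['FR'] + 4*['DE'],
    'time_delta': 4*list(pd.timedelta_range(start='30D', periods=4, freq='30D')),
})
ds = df.set_index(['country', 'time_delta']).to_xarray()

# Union of the existing labels and the new one (hypothetical helper name)
new_index = ds.indexes['time_delta'].union(pd.to_timedelta(['0D']))

# reindex returns a new Dataset; labels that were not in ds get fill_value,
# existing values are kept as they are
ds_new = ds.reindex(time_delta=new_index, fill_value=0)

print(ds_new)
```

Because the union of the two indexes is sorted, '0D' ends up first along time_delta rather than appended at the end. Once the label exists in ds_new, label-based assignment such as the .loc call from the question has an index entry to write to.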