
How to compute one mean dataset of 4 datasets [Python, xarray]


I have four GFS temperature datasets. They all share the same spatial resolution; the only difference is the timestamp: 00 UTC, 06 UTC, 12 UTC, and 18 UTC within one day.

I need to compute the mean daily temperature dataset. Is there a built-in way to do this, rather than manually pulling the values from corresponding grid nodes, computing the mean, and inserting the result into a new dataset?

import xarray as xr

dst00 = xr.open_dataset('gfs.t00z.pgrb2.1p00.f000', engine='cfgrib')
dst06 = xr.open_dataset('gfs.t06z.pgrb2.1p00.f000', engine='cfgrib')
dst12 = xr.open_dataset('gfs.t12z.pgrb2.1p00.f000', engine='cfgrib')
dst18 = xr.open_dataset('gfs.t18z.pgrb2.1p00.f000', engine='cfgrib')

Direct links to download sample datasets:

00UTC 06UTC 12UTC 18UTC


Solution

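  • You can use xr.concat to stack the datasets along time, then average over that dimension. The snippet shows two of the four datasets; pass [dst00, dst06, dst12, dst18] to get the full daily mean: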

    dst = xr.concat([dst00, dst06], dim='time').mean('time')
    print(dst)
    
    # Output
    <xarray.Dataset>
    Dimensions:            (latitude: 31, longitude: 101)
    Coordinates:
        step               timedelta64[ns] 00:00:00
        heightAboveGround  float64 2.0
      * latitude           (latitude) float64 60.0 61.0 62.0 63.0 ... 88.0 89.0 90.0
      * longitude          (longitude) float64 0.0 1.0 2.0 3.0 ... 98.0 99.0 100.0
    Data variables:
        t2m                (latitude, longitude) float32 279.8 279.5 ... 243.4 243.4
    

    Check:

    import pandas as pd
    out = pd.concat([d.to_dataframe()['t2m'] for d in [dst00, dst06, dst]], axis=1, 
                     keys=['dst00', 'dst06', 'mean'])
    print(out)
    
    # Output
                             dst00       dst06        mean
    latitude longitude                                    
    60.0     0.0        279.837494  279.800812  279.819153
             1.0        279.557495  279.370789  279.464142
             2.0        279.437469  279.100800  279.269135
             3.0        279.227478  278.850800  279.039124
             4.0        278.417480  278.190796  278.304138
    ...                        ...         ...         ...
    90.0     96.0       243.507477  243.370804  243.439148
             97.0       243.507477  243.370804  243.439148
             98.0       243.507477  243.370804  243.439148
             99.0       243.507477  243.370804  243.439148
             100.0      243.507477  243.370804  243.439148
    
    [3131 rows x 3 columns]
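
    The same approach extends to all four runs. Here is a minimal sketch that does not need the GRIB files: it builds small synthetic datasets in place of dst00 ... dst18, so the make_snapshot helper, the grid size, and the temperature values are made up for illustration. The one property it copies from the real files is that time is a scalar coordinate, which is what lets xr.concat turn it into a new dimension.

```python
import numpy as np
import xarray as xr


def make_snapshot(hour, offset):
    """Hypothetical stand-in for one GFS file: constant 2 m temperature on a tiny grid."""
    lat = np.array([60.0, 61.0])
    lon = np.array([0.0, 1.0, 2.0])
    data = np.full((lat.size, lon.size), 270.0 + offset, dtype="float32")
    return xr.Dataset(
        {"t2m": (("latitude", "longitude"), data)},
        coords={
            "latitude": lat,
            "longitude": lon,
            # Scalar (non-dimension) time coordinate, like the one in the real datasets
            "time": np.datetime64(f"2022-01-01T{hour:02d}:00", "ns"),
        },
    )


# Four synthetic snapshots for 00, 06, 12 and 18 UTC (offsets are invented values)
snapshots = [
    make_snapshot(hour, offset)
    for hour, offset in [(0, 0.0), (6, 2.0), (12, 6.0), (18, 4.0)]
]

# Stack along the scalar 'time' coordinate, then average over it
stacked = xr.concat(snapshots, dim="time")
daily_mean = stacked.mean("time")

print(stacked.sizes["time"])
print(daily_mean["t2m"].values)
```

    With the real files, the list would simply be [dst00, dst06, dst12, dst18]. Note that mean drops variable attributes by default; passing keep_attrs=True to mean should preserve them if you need the units and long names in the result.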