Tags: python, pandas, numpy, python-xarray, netcdf4

How to process the Time variables of OCO-2/Tropomi NETCDF4 files using Xarray?


I am working with Tropomi .nc files. When I open the dataset using xarray, it does not process the time dimension. In Tropomi files, the time dimension is named 'sounding_dim'. Instead of decoded times, the returned output is just the sounding number.

I have tried OCO-2 .nc files as well. In OCO-2, the time dimension is 'sounding_id', and the time is returned as a floating-point number, not as a date. The code and its output are given below:

import numpy as np
import xarray as xr
from datetime import datetime as dt
import pandas as pd 

tropomi = xr.open_dataset('/Users/farhanmustafa/Documents/analysis/tropomi/ESACCI-GHG-L2-CH4-CO-TROPOMI-WFMD-20190102-fv1.nc', engine = 'netcdf4')
tropomi

The returned output is:

<xarray.Dataset>
Dimensions:                 (corners_dim: 4, layer_dim: 20, level_dim: 21, sounding_dim: 374749)
Dimensions without coordinates: corners_dim, layer_dim, level_dim, sounding_dim
Data variables:
    time                    (sounding_dim) datetime64[ns] ...
    latitude                (sounding_dim) float32 ...
    longitude               (sounding_dim) float32 ...
    solar_zenith_angle      (sounding_dim) float32 ...
    sensor_zenith_angle     (sounding_dim) float32 ...
    azimuth_difference      (sounding_dim) float32 ...
    xch4                    (sounding_dim) float32 ...
    xch4_uncertainty        (sounding_dim) float32 ...
    xco                     (sounding_dim) float32 ...
    xco_uncertainty         (sounding_dim) float32 ...
    quality_flag            (sounding_dim) int32 ...
    pressure_levels         (sounding_dim, level_dim) float32 ...
    pressure_weight         (sounding_dim, layer_dim) float32 ...
    ch4_profile_apriori     (sounding_dim, layer_dim) float32 ...
    xch4_averaging_kernel   (sounding_dim, layer_dim) float32 ...
    co_profile_apriori      (sounding_dim, layer_dim) float32 ...
    xco_averaging_kernel    (sounding_dim, layer_dim) float32 ...
    orbit_number            (sounding_dim) int32 ...
    scanline                (sounding_dim) int32 ...
    ground_pixel            (sounding_dim) int32 ...
    latitude_corners        (sounding_dim, corners_dim) float32 ...
    longitude_corners       (sounding_dim, corners_dim) float32 ...
    altitude                (sounding_dim) float32 ...
    apparent_albedo         (sounding_dim) float32 ...
    land_fraction           (sounding_dim) int32 ...
    cloud_parameter         (sounding_dim) float32 ...
    h2o_column              (sounding_dim) float32 ...
    h2o_column_uncertainty  (sounding_dim) float32 ...
Attributes:
    title:                     TROPOMI/WFMD XCH4 and XCO
    institution:               University of Bremen
    source:                    TROPOMI L1B version 01.00.00
    history:                   2019 - product generated with WFMD
    tracking_id:               41f8bb71-4f43-4927-843a-4f02ed013f3b
    Conventions:               CF-1.6
    product_version:           v1.2
    summary:                   Weighting Function Modified DOAS (WFMD) was ad...
    keywords:                  satellite, Sentinel-5 Precursor, TROPOMI, atmo...
    id:                        ESACCI-GHG-L2-CH4-CO-TROPOMI-WFMD-20190102-fv1.nc
    naming_authority:          iup.uni-bremen.de
    keywords_vocabulary:       NASA Global Change Master Directory (GCMD)
    cdm_data_type:             point
    comment:                   These data were produced at the University of ...
    date_created:              20200322T232210Z
    creator_name:              University of Bremen, IUP, Oliver Schneising
    creator_email:             [email protected]
    project:                   Climate Change Initiative - European Space Agency
    geospatial_lat_min:        -90
    geospatial_lat_max:        90
    geospatial_lat_units:      degree_north
    geospatial_lon_min:        -180
    geospatial_lon_max:        180
    geospatial_lon_units:      degree_east
    geospatial_vertical_min:   0
    geospatial_vertical_max:   100000
    time_coverage_start:       20190102T000000Z
    time_coverage_end:         20190102T235959Z
    time_coverage_duration:    P1D
    time_coverage_resolution:  P1D
    standard_name_vocabulary:  NetCDF Climate and Forecast (CF) Metadata Conv...
    license:                   ESA CCI Data Policy: free and open access
    platform:                  Sentinel-5 Precursor
    sensor:                    TROPOMI
    spatial_resolution:        7km x 7km at nadir (typically)

When I try to retrieve the time dimension:

tropomi.sounding_dim

<xarray.DataArray 'sounding_dim' (sounding_dim: 374749)>
array([     0,      1,      2, ..., 374746, 374747, 374748])
Dimensions without coordinates: sounding_dim

tropomi['sounding_dim'] = dt.strptime(tropomi["sounding_dim"], "%Y%m%d%H%M%S")

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-18-a749e221323c> in <module>
----> 1 tropomi['sounding_dim'] = dt.strptime(tropomi["sounding_dim"], "%Y%m%d%H%M%S")

TypeError: strptime() argument 1 must be str, not DataArray

I have tried every solution that I could find on the internet, and I would be thankful if anyone could help me sort this out. I should mention that I have already processed GEOS-CHEM .nc files successfully and did not face any error like this.


Solution

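  • It looks like the times are already decoded: your output lists a time variable of type np.datetime64 (time (sounding_dim) datetime64[ns]). 'sounding_dim' is only a dimension without coordinates, which is why tropomi.sounding_dim returns the sounding number rather than a date, and why there is nothing for strptime to parse. You can use ds.swap_dims({"sounding_dim": "time"}) to make time the coordinate variable. See https://xarray.pydata.org/en/stable/generated/xarray.Dataset.swap_dims.html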
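
A minimal sketch of the swap_dims suggestion, run on a small synthetic dataset rather than the real file. The dataset contents, the six made-up timestamps, and the names ds_by_time and morning are assumptions for illustration; only the variable names time, xch4 and sounding_dim mirror the output shown in the question.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the TROPOMI file (assumption: the real file has the
# same layout, i.e. a decoded `time` data variable on an index-only dimension).
times = np.datetime64("2019-01-02T00:00:00", "ns") + np.arange(6) * np.timedelta64(4, "h")
ds = xr.Dataset(
    {
        "time": ("sounding_dim", times),
        "xch4": ("sounding_dim", np.linspace(1800.0, 1850.0, 6)),
    }
)

# Replace the index-only dimension with `time`; `time` becomes a coordinate.
ds_by_time = ds.swap_dims({"sounding_dim": "time"})

# Label-based selection on time is now possible.
morning = ds_by_time.sel(time=slice("2019-01-02T00:00", "2019-01-02T11:59"))
print(ds_by_time)
print(morning["xch4"].values)
```

Slicing by label like this relies on the time values being sorted. If the soundings in a real file are not in time order, calling ds_by_time.sortby("time") first is one way to handle that.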