
Python Decoding NDFD Grib Binary File Similar to Writing then Reading the File Using Open(path, 'wb') and xarray.load_dataset(path, engine='cfgrib')


I have the following code which works:

import pandas as pd
import requests
import xarray as xr
import cfgrib

r = requests.get(
    'https://tgftp.nws.noaa.gov/SL.us008001/ST.opnl/DF.gr2/DC.ndfd/AR.neast/VP.001-003/ds.wspd.bin', 
    stream=True)

f = open('..\\001_003wspd.grb', 'wb')
f.write(r.content)
f.close()

xr_set = xr.load_dataset('..\\001_003wspd.grb', engine="cfgrib")
windspeed_df = xr_set.to_dataframe()

What I want is to avoid writing and then reading the file. Something like:

grib = r.content.decode()

or

xr_set = xr.load_dataset(r.content, engine="cfgrib")

which both give:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd7 in position 95: invalid continuation byte

I have also tried:

grib = r.content.decode(errors='ignore')
xr_set = xr.open_dataset(grib, engine="cfgrib")

which gives:

ValueError: embedded null character

As well as trying different binary encodings.


Solution

  • I checked the cfgrib code, in particular the open_file method, and it seems to accept only str or os.PathLike[str] objects. Whatever you pass in, whether r.content or a decoded version of it, is therefore interpreted as a file path, which is why the call fails. If you do not want to store the file permanently, you could write it to a temporary directory instead:

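    import requests
    import xarray as xr
    from fs.tempfs import TempFS


    def download_and_read_grib(url) -> xr.Dataset:
        # TempFS creates a temporary directory that is removed when the block exits.
        with TempFS() as tempfs:
            # getsyspath() returns the real OS path, so the private
            # _temp_dir attribute is not needed.
            path = tempfs.getsyspath("data.grb")
            with requests.get(url) as r:
                r.raise_for_status()  # fail early on HTTP errors
                with open(path, 'wb') as f:
                    f.write(r.content)
            # load_dataset reads the data into memory and closes the file,
            # so the dataset stays usable after the directory is deleted.
            ds = xr.load_dataset(path, engine='cfgrib')
        return ds


    ds = download_and_read_grib('https://tgftp.nws.noaa.gov/SL.us008001/ST.opnl/DF.gr2/DC.ndfd/AR.neast/VP.001-003/ds.wspd.bin')
    windspeed_df = ds.to_dataframe()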
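
If you would rather not add the fs dependency, the same idea can probably be done with the standard library's tempfile module. The following is only a minimal sketch: the helper name load_bytes_via_tempfile and its loader and name parameters are hypothetical, not part of cfgrib or xarray. It writes the downloaded bytes to a file in a temporary directory, passes the path to whatever loader you give it, and removes the directory afterwards.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def load_bytes_via_tempfile(
    data: bytes,
    loader: Callable[[str], T],
    name: str = "data.grb",
) -> T:
    """Write `data` to a short-lived file and hand its path to `loader`.

    Hypothetical helper for illustration. `loader` must read everything it
    needs and close the file before returning, because the temporary
    directory (and anything written next to the file) is deleted afterwards.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, name)
        with open(path, "wb") as f:
            f.write(data)
        return loader(path)
```

With the response from the question, this would be called as load_bytes_via_tempfile(r.content, lambda p: xr.load_dataset(p, engine="cfgrib")). load_dataset is assumed here rather than open_dataset so that the data is fully in memory and the file is closed before the directory is removed, which matters on Windows in particular.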