Open netCDF data in a gz file

I have netCDF data saved in a gz file and I am trying to import it as a GeoDataFrame in Python. I don't know the names of the variables in the netCDF data.

My code:


import gzip
import netCDF4 as nc

gzipped_file_path = 'Maize_1970_Yield_ver12b_BRA.nc.gz'

with gzip.open(gzipped_file_path, 'rb') as f:
    # Read the content of the gzipped file
    content = f.read()

This part works fine, but then I try to create a dataset with:

df=nc.Dataset(content)

And it runs forever (it's been running for over 3 hours as of now). What is wrong with this code?

Solution

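Okay, so the nc.Dataset function expects its first argument to be the path of a netCDF file on disk, not the decompressed bytes of the file. In your code, content holds the raw bytes, which is not something nc.Dataset can use as a file name.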

So first, let's decompress the archive into a temporary file:

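import gzip, os, shutil
import netCDF4 as nc

gzipped_file_path = 'Maize_1970_Yield_ver12b_BRA.nc.gz'
temp_nc_path = 'temp_netcdf_file.nc'

# Decompress the archive into a regular .nc file on disk
with gzip.open(gzipped_file_path, 'rb') as f_in, open(temp_nc_path, 'wb') as f_out:
    shutil.copyfileobj(f_in, f_out)  # copies in chunks instead of reading everything into memory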

Now temp_nc_path is a regular, uncompressed netCDF file on disk (both file handles are closed once the with block exits), so you can pass its path to the nc.Dataset function:

ds = nc.Dataset(temp_nc_path)
print(ds.variables.keys()) # check the keys

Finally, close the dataset and delete the temporary file, so that it does not linger on disk and you don't have to clean up leftover files later on:

ds.close()
os.remove(temp_nc_path)
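
If an error occurs between creating the temporary file and removing it, the file is left behind. One way to guard against that is to wrap the decompression in a context manager. This is only a sketch using the standard library; decompressed_copy is a hypothetical name chosen for illustration:

```python
import contextlib
import gzip
import os
import shutil
import tempfile


@contextlib.contextmanager
def decompressed_copy(gz_path, suffix=".nc"):
    """Yield the path of a temporary decompressed copy of gz_path.

    decompressed_copy is a hypothetical helper. The temporary file is
    removed when the with block exits, even if an exception was raised.
    """
    fd, tmp_path = tempfile.mkstemp(suffix=suffix)
    try:
        # Wrap the descriptor first so it is closed even if gzip.open fails
        with os.fdopen(fd, "wb") as f_out, gzip.open(gz_path, "rb") as f_in:
            shutil.copyfileobj(f_in, f_out)  # stream the data in chunks
        yield tmp_path
    finally:
        os.remove(tmp_path)
```

Inside a block such as "with decompressed_copy(gzipped_file_path) as path:", you would open nc.Dataset(path), read what you need, and close the dataset before the block ends, since the file is deleted on exit.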

That should do it.