I have NetCDF files saved in a .gz file and I am trying to import them as a GeoDataFrame in Python. I don't know the names of the variables in the NetCDF files.
My code:
import gzip
import netCDF4 as nc

gzipped_file_path = 'Maize_1970_Yield_ver12b_BRA.nc.gz'
with gzip.open(gzipped_file_path, 'rb') as f:
    # Read the content of the gzipped file
    content = f.read()
This part works fine, but then I try to create a dataset with:
df=nc.Dataset(content)
And it runs forever (it's been running for over 3 hours as of now). What is wrong with this code?
Okay, so the nc.Dataset function expects the path to a NetCDF file on disk, not the raw bytes of the file, which is what content holds.
So first, let's decompress the archive into a temporary file:
import gzip, os
import netCDF4 as nc
gzipped_file_path = 'Maize_1970_Yield_ver12b_BRA.nc.gz'
temp_nc_path = 'temp_netcdf_file.nc'
with gzip.open(gzipped_file_path, 'rb') as f_in, open(temp_nc_path, 'wb') as f_out:
    # Write the decompressed bytes to a regular .nc file on disk
    f_out.write(f_in.read())
Once the with block exits, both files are closed and temp_nc_path points to a file on disk holding the decompressed contents of the .gz file, so you can pass that path to nc.Dataset:
ds = nc.Dataset(temp_nc_path)
print(ds.variables.keys()) # check the keys
Finally, close the dataset and delete the temporary file, so you don't leave garbage on your system that has to be cleaned out of tmp folders later on:
ds.close()
os.remove(temp_nc_path)
That should do it.
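If the archive is large, the decompression step above can also be done in chunks rather than with a single f_in.read(). The following is only a sketch using the standard library; the helper name gunzip_to_tempfile and its suffix parameter are made up for illustration and are not part of netCDF4:

```python
import gzip
import os
import shutil
import tempfile


def gunzip_to_tempfile(gz_path, suffix=".nc"):
    """Decompress gz_path into a new temporary file and return its path.

    Hypothetical helper: the caller is responsible for deleting the file.
    """
    # mkstemp creates a uniquely named file and returns (file descriptor, path)
    fd, tmp_path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as f_out, gzip.open(gz_path, "rb") as f_in:
            # Copy in chunks so the whole file is never held in memory at once
            shutil.copyfileobj(f_in, f_out)
    except Exception:
        # Do not leave a half-written temporary file behind
        os.remove(tmp_path)
        raise
    return tmp_path
```

The returned path can be passed to nc.Dataset in the same way as temp_nc_path above, and removed with os.remove after ds.close().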