I have been trying to read the header (first 100 lines) of a netCDF file in Python, but have been facing some issues. I am familiar with the read_nc function available in the synoptReg package for R and with the ncread function that comes with MATLAB, as well as the read_csv function available in the pandas library. To my knowledge, however, there isn't anything similar for netCDF (.nc) files.
Noting this, and using answers from this question, I've tried the following (with no success):
with open(filepath,'r') as f:
    for i in range(100):
        line = next(f).strip()
        print(line)
However, I receive this error, even though I've ensured that tabs have not been mixed with spaces and that the for statement is within the with block (the explanations given by the top answers to this question):
'utf-8' codec can't decode byte 0xbb in position 411: invalid start byte
I've also tried the following:
with open(filepath,'r') as f:
    for i in range(100):
        line = [next(f) for i in range(100)]
        print(line)
and
from itertools import islice
with open('/Users/toshiro/Desktop/Projects/CCAR/Data/EDGAR/v6.0_CO2_excl_short-cycle_org_C_2010_TOTALS.0.1x0.1.nc','r') as f:
    for i in range(100):
        line = list(islice(f, 100))
        print(line)
But I receive the same error as above. Are there any workarounds for this?
You can't. netCDFs are binary files and can't be interpreted as text.
If the files are netCDF3 encoded, you can read them in with scipy.io.netcdf_file. But it's much more likely they are netCDF4, in which case you'll need the netCDF4 package.
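If you're not sure which flavour you have, a rough way to check is to open the file in binary mode and look at its first few bytes. This is only a sketch: the function name sniff_netcdf_format is made up for illustration, and it assumes the usual signatures (classic netCDF3 files start with the bytes "CDF" followed by a version byte, while netCDF4 files are HDF5 files and normally start with the HDF5 signature at offset 0).

```python
def sniff_netcdf_format(path):
    """Guess the netCDF flavour of a file from its leading bytes.

    Hypothetical helper for illustration; it only inspects the first
    8 bytes and does not validate the rest of the file.
    """
    # Open in binary mode ("rb") so Python does not try to decode the
    # contents as UTF-8 text, which is what raised the original error.
    with open(path, "rb") as f:
        magic = f.read(8)

    # HDF5 signature: netCDF4 files are stored as HDF5.
    if magic.startswith(b"\x89HDF\r\n\x1a\n"):
        return "netCDF4 (HDF5)"

    # Classic formats: "CDF" plus a version byte
    # (1 = classic, 2 = 64-bit offset, 5 = CDF-5).
    if magic[:3] == b"CDF" and magic[3:4] in (b"\x01", b"\x02", b"\x05"):
        return "netCDF3 (classic)"

    return "unknown"
```

Calling sniff_netcdf_format(filepath) on your file should tell you whether scipy.io.netcdf_file is likely to work or whether you need the netCDF4 package (or xarray with a netCDF4-capable backend).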
On top of this, I'd highly recommend the xarray package for reading and working with netCDF data. It supports a labeled N-dimensional array interface - think pandas indexes on each dimension of a numpy array.
Whether you go with netCDF4 or xarray, netCDF files are self-describing and support arbitrary reads, so you don't need to load the whole file to view the metadata. So, similar to viewing the head of a text file, you can simply do:
import xarray as xr
ds = xr.open_dataset("path/to/myfile.nc")
print(ds) # this will give you a preview of your data
Additionally, xarray has an xr.Dataset.head method, which returns the first 5 (or N if you provide an int) elements along each dimension:
ds.head() # display a 5x5x...x5 preview of your data
See the getting started guide and the User guide section on reading and writing netCDF files for more info.