I'm pretty new to programming. I'm trying to alter a script that was meant to pull .txt files containing data so that it now pulls NetCDF files from an HTTP server, downloads them, renames them, and saves them locally (well, to another server location). I've pasted the base code below, including the actual buoy data file names for the NetCDF files. I believe the issue is at the URL-request line: I've tried urllib.request.urlopen and urllib.request.urlretrieve, and both give errors.
import os
import urllib
import urllib.request
import shutil
import netCDF4
import requests
# Weblink for location of spectra and wave data
webSpectra = 'https://dods.ndbc.noaa.gov/thredds/fileServer/data/swden/41004/41004w9999.nc'
webWave = 'https://dods.ndbc.noaa.gov/thredds/fileServer/data/stdmet/41004/41004h9999.nc'
#set save location for each
saveloc = 'saveSpectra41004w9999.nc'
saveloc2 = 'saveWave41004h9999.nc'
# perform pull
try:
    urllib.request.urlopen(webSpectra, saveloc)
except urllib.error.HTTPError as exception:
    print('Station: 41004 spectra file not available')
    print(exception)
try:
    urllib.request.urlopen(webWave, saveloc2)
except urllib.error.HTTPError as exception:
    print('Station: 41004 wave file not available')
    print(exception)
print('Pulling data for 41004')
print('Percent complete ' + str(round(100*(count/len(stationIndex)))))
print('Done')
My errors
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-13-5e5ebd26fe46> in <module>
59 # perform pull
60 try:
---> 61 urllib.request.urlopen(webSpectra, saveloc)
62 except urllib.error.HTTPError as exception:
63 print('Station: 41004 spectra file not available')
/work/anaconda3/envs/aoes/lib/python3.6/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
221 else:
222 opener = _opener
--> 223 return opener.open(url, data, timeout)
224
225 def install_opener(opener):
/work/anaconda3/envs/aoes/lib/python3.6/urllib/request.py in open(self, fullurl, data, timeout)
522 for processor in self.process_request.get(protocol, []):
523 meth = getattr(processor, meth_name)
--> 524 req = meth(req)
525
526 response = self._open(req, data)
/work/anaconda3/envs/aoes/lib/python3.6/urllib/request.py in do_request_(self, request)
1277 msg = "POST data should be bytes, an iterable of bytes, " \
1278 "or a file object. It cannot be of type str."
-> 1279 raise TypeError(msg)
1280 if not request.has_header('Content-type'):
1281 request.add_unredirected_header(
TypeError: POST data should be bytes, an iterable of bytes, or a file object. It cannot be of type str.
The TypeError happens because the second positional argument to urllib.request.urlopen is the POST data to send, not a save path. By the looks of it, you just want to download the files. You can do this using nctoolkit (https://nctoolkit.readthedocs.io/en/latest/). It will download the files to a temporary location; you can then export to xarray or pandas etc., or just save the file.
The code below will work for one file:
import nctoolkit as nc
ds = nc.open_url('https://dods.ndbc.noaa.gov/thredds/fileServer/data/stdmet/41004/41004h9999.nc')
# convert to xarray dataset
ds_xr = ds.to_xarray()
# convert to pandas dataframe
df = ds.to_dataframe()
# save to location
ds.to_nc("outfile.nc")
If the above does not work due to dependency issues etc., you can just use urllib:
import urllib.request
url = 'https://dods.ndbc.noaa.gov/thredds/fileServer/data/stdmet/41004/41004h9999.nc'
urllib.request.urlretrieve(url, '/tmp/temp.nc')
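If you want to keep the shape of your original script, the same urlretrieve call drops straight into your try/except pattern. A minimal sketch (build_url, download_station_files, and the PRODUCTS list are hypothetical helpers I've added; the station number, THREDDS directories, and save names are taken from your question):

```python
import urllib.request
import urllib.error

# (kind, THREDDS directory, file-name suffix) for the two NDBC products
# in the question; the directory/suffix pairs come from the URLs you posted
PRODUCTS = [
    ('spectra', 'swden', 'w9999'),
    ('wave', 'stdmet', 'h9999'),
]

def build_url(directory, station, suffix):
    """Compose the NDBC THREDDS fileServer URL for one station file."""
    return ('https://dods.ndbc.noaa.gov/thredds/fileServer/data/'
            f'{directory}/{station}/{station}{suffix}.nc')

def download_station_files(station):
    """Fetch the spectra and wave files for one station, mirroring the
    save names in the question (e.g. saveSpectra41004w9999.nc)."""
    for kind, directory, suffix in PRODUCTS:
        url = build_url(directory, station, suffix)
        saveloc = f'save{kind.capitalize()}{station}{suffix}.nc'
        try:
            # urlretrieve takes (url, filename); urlopen's second positional
            # argument is the POST body, which is what raised your TypeError
            urllib.request.urlretrieve(url, saveloc)
        except urllib.error.HTTPError as exception:
            print(f'Station: {station} {kind} file not available')
            print(exception)
```

Calling download_station_files('41004') would then fetch both files into the working directory, skipping any that the server reports as unavailable.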