I have a binary data file called image_info_binary.data, and I'd like to download many FITS images based on the information in the lines of this file. If I load this file into Python with the pickle module and print a single element, I get this:
import pickle
with open('image_info_binary', 'rb') as f:
    img_info = pickle.load(f)
print(img_info[0])
Outputs this string:
Object #: 2000073.0
Counter #: 2
Scan ID: 0245
Frame #: 167
Band #: 3
Image Link: http://....fits... #long url
There are about 50,000 of these elements, each with different object #, counter #, fits image URL, etc. I would like to go through each of these elements and download each FITS image as: {int(object number)}_{three digit counter}_w{band}.fits.
For example, I would want the downloaded image of the above example to be 2000073_002_w3.fits.
What is the best way to do this? I know if I was just downloading one image I could simply execute curl -o 2000073_002_w3.fits "url", for example. I'm not sure if generating many of these curl statements is the best way to do this or not. If I could just run a command in the terminal, that'd be great, but I could also use Python (but I feel like a subprocess would probably be slow). Thank you!
You can generate the filenames and URLs by iterating over the elements and splitting each one into its key/value parts.
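for img in img_info:
    # Build a dict from the 'key: value' lines of this element.
    attr = dict()
    for line in img.split('\n'):
        if ': ' not in line:
            continue  # skip blank or malformed lines
        key, value = line.split(': ', 1)
        attr[key] = value.strip()
    # The values are strings, so convert them before formatting:
    # '2000073.0' -> 2000073 and '2' -> 002.
    filename = '{0}_{1:03d}_w{2}.fits'.format(
        int(float(attr['Object #'])), int(attr['Counter #']), attr['Band #'])
    url = attr['Image Link']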
Inside the loop, you can then print these, pass them to subprocess.run(['curl', '-o', filename, url], check=True), or download them natively in Python.
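For the native-Python route, here is a minimal sketch using only the standard library. It assumes each element is a newline-separated string of `key: value` pairs, as in the question. The names `parse_record` and `download_all`, and the `max_workers=8` default, are hypothetical choices for illustration and not anything from the original data or a particular library. Since the downloads are network-bound, a small thread pool may help with about 50,000 files, though how many parallel requests the server tolerates is something you would need to check.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def parse_record(record):
    """Return (filename, url) for one 'key: value' record string."""
    attr = {}
    for line in record.splitlines():
        if ': ' not in line:
            continue  # skip blank or malformed lines
        key, value = line.split(': ', 1)
        attr[key.strip()] = value.strip()
    filename = '{0}_{1:03d}_w{2}.fits'.format(
        int(float(attr['Object #'])),  # '2000073.0' -> 2000073
        int(attr['Counter #']),        # '2' -> zero-padded to 3 digits
        attr['Band #'])
    return filename, attr['Image Link']


def download_all(records, max_workers=8):
    """Download each record's image (max_workers=8 is an arbitrary guess)."""
    def fetch(record):
        filename, url = parse_record(record)
        urllib.request.urlretrieve(url, filename)  # saves url to filename
        return filename

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces all downloads to finish and re-raises any error
        return list(pool.map(fetch, records))
```

With the pickled list from the question loaded as `img_info`, calling `download_all(img_info)` would write the files to the current directory. There is no retry or error handling here, so a single failed URL raises an exception when the results are collected.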