Search code examples
pythonpython-3.xastropy

Why is my code finding more objects from the same list within the same images after each run?


I wrote a code that looks through a pre-existing folder of images, and uses a constant list of object names and their corresponding ra, dec positions in the sky to locate them in each original image and make a 10x10 arcsec cutout (if the object is in the image). It is working and I am getting some nice cutouts, but for some odd reason, each time I run it there are new cutout images saved! I really don't know why this is happening, since the list of objects and their ra, dec positions are always the exact same, and I always save the cutout images with the exact name of the image and object which should not change. All the original images are staying constant as well.

I have conducted many tests and am still stumped -- through my tests I've confirmed that the list of objects (objs) remains identical in each run, and I reached the same conclusion for the list of the original images (all_images) and ra, dec positions (the lists ras_hms and decs_deg).

The original number of images and objects are both quite long, so I ran my code on smaller subsets to test, and the issue of new cutout images appearing during each run is still occurring. I ran the code below on these images: 'calexp-HSC-I-18012-1,7.fits', 'calexp-HSC-I-18114-0,0.fits', 'calexp-HSC-I-18114-1,1.fits' which are saved in the directory /Users/myuser/Desktop/Original_Images/. I am running my code in a different directory where the cutouts are also saved in the end. The first time it ran, I generated these cutouts: 'cutout-IMG-HSC-I-18114-0,0-OBJ-NEP175719.1+652743.2.fits', 'cutout-IMG-HSC-I-18114-1,1-OBJ-NEP175509.2+653523.9.fits'. When I ran the exact same code a few minutes later without changing anything, I also generated these two new images: 'cutout-IMG-HSC-I-18114-0,0-OBJ-NEP175654.7+652930.2.fits', 'cutout-IMG-HSC-I-18114-1,1-OBJ-NEP175458.4+653419.1.fits', and so on similarly for future runs.

As you can see, it must've not found any objects in one of my images (which is fine), but each run it somehow finds a new object in each of the other images (indeed, each time I run this code on this small subset I see that two new cutout images were saved, with different object names). I am stumped, since like I said the objects and coordinates it searches for are the same in each image each run. Any insights or guesses would be greatly appreciated!

import astropy
from astropy.nddata.utils import Cutout2D, NoOverlapError
import numpy as np
import matplotlib.pyplot as plt
from astropy import units as u
from astropy.io import fits
from astropy.wcs import WCS
from astropy.coordinates import SkyCoord, Angle
import re
import glob

def make_cutouts(img_file, box_len):
    # Image data
    hdulist = fits.open(img_file)
    img_data = fits.getdata(img_file)
    img_name = re.search(r'calexp\-(.+)\.fits', img_file)[1]

    # Make cutouts (by locating coords in image with WCS)
    wcs = WCS(hdulist[1].header)
    for i in range(len(objs)):
        # Skip if cutout already exists
        if 'cutout-IMG-'+img_name+'-OBJ-'+objs[i]+'.fits' in glob.glob('cutout*.fits'):
            print('Cutout of object already exists for this image, skipping...')
            continue

        # Convert ra, dec to HMS for specific object
        ra_h = re.search(r'h=(\d+.?\d*)', str(ras_hms[i]))[1]
        ra_m = re.search(r'm=(\d+.?\d*)', str(ras_hms[i]))[1]
        ra_s = re.search(r's=(\d+.?\d*)', str(ras_hms[i]))[1]
        ra_angle = Angle((float(ra_h), float(ra_m), float(ra_s)), unit='hourangle')
        dec_angle = decs_deg[i]

        # Coordinate transformation to pixels
        center = SkyCoord(ra_angle, dec_angle, frame='fk5')
        xp, yp = astropy.wcs.utils.skycoord_to_pixel(center, wcs=wcs, origin=1)

        # Make cutout, skip when object is not in image
        size = u.Quantity((box_len,box_len),u.arcsec)
        try:
            co = Cutout2D(img_data,(xp, yp),size,wcs=wcs)
        except NoOverlapError:
            continue
        hdu = fits.PrimaryHDU(data=co.data,header=co.wcs.to_header())
        hdu.writeto('cutout-IMG-'+img_name+'-OBJ-'+objs[i]+'.fits', overwrite=True)
        return 


# Gather all original images

all_images = glob.glob('/Users/myuser/Desktop/Original_Images/calexp*.fits')
coords_file = 'good_dataset.fits'

# Coordinates

hdul = fits.open(coords_file)
coords_data = hdul[1].data
objs = coords_data['Name']
ras = np.array(coords_data['RA']) # in decimal degrees
decs = np.array(coords_data['DEC']) # in decimal degrees
# Convert coordinate systems using astropy
decs_deg = Angle(decs, unit=u.deg)
ras_deg = Angle(ras, unit=u.deg)
ras_hms = [ra.hms for ra in ras_deg]

count=0
for image in all_images:
    make_cutouts(image, 10.0)
    count+=1
    print('Image %d out of %d completed' % (count, len(all_images)))

Here is an example of the output from my print statements of a run just now, which again generated two new cutout images (different objects, same two images)... Here Image 2 is the one where no objects were ever found. Also, interestingly the number of "already exists, skipping" statements increases by two for each image each run.

Cutout of object already exists for this image, skipping...
Cutout of object already exists for this image, skipping...
Cutout of object already exists for this image, skipping...
Cutout of object already exists for this image, skipping...
Image 1 out of 3 completed
Image 2 out of 3 completed
Cutout of object already exists for this image, skipping...
Cutout of object already exists for this image, skipping...
Cutout of object already exists for this image, skipping...
Cutout of object already exists for this image, skipping...
Image 3 out of 3 completed

Solution

  • This is a simple error: you have a return statement at the end of your for loop, which means that each run of make_cutouts is limited to producing at most one cutout. Each time you run it, it is producing the first cutout, then the next time, it sees that one exists, skips it with a continue statement, then gets the next one, then quits. Delete the return statement, and the code will probably work fine.

    However, there are several things you're doing that you should try to avoid:

    (1) You're using global variables in your function. You'd be better off passing objs and ras_hms as arguments to the function, rather than relying on the global state to access them.

    (2) You're looping over indices when you could just loop over the objects themselves, i.e., for thisobj, thisra in zip(objs, ras_hms):

    (3) More minor, but if 'cutout-IMG-'+img_name+'-OBJ-'+objs[i]+'.fits' in glob.glob('cutout*.fits'): would be more efficient as if os.path.exists(if 'cutout-IMG-'+img_name+'-OBJ-'+objs[i]+'.fits' in glob.glob('cutout*.fits'):. You might also find it more readable if you use 'cutout-IMG-{img_name}-OBJ-{obj_id}.fits'.format(img_name=img_name, obj_id=objs[i]) as your string