Tags: python, arrays, numpy, glob, fits

Arrays returning as empty when using an 'if statement' after glob.glob on FITS Files


I am using glob.glob to make my script read data from only certain FITS files (astropy.io.fits is imported as pf and numpy as np). For this, x is the value that I change to select these files. (For reference, x = np.arange(0) and y1 = np.arange(0) simply create empty arrays that I fill with data later.)

def Graph(Pass):
    x = np.arange(0)
    y1 = np.arange(0)

    pathfile = '*_v0' + str(Pass) + '_stis_f25srf2_proj.fits'

        for name in glob.glob(pathfile):
             imn = 'FilePath' + str(name)

However, I wanted to add another filter on the files that I use. Each FITS file's header contains a quantity I will call a, which is a non-integer numerical value. I only want to read files whose a lies within a specific range. I then take the data I need from the FITS file and append it to an array (here the 'power' p1 is appended to y1 and the 'time' t is appended to x).

            imh = pf.getheader(imn)
            a = imh['a']

            if (192 <= a <= 206) is False:
                pass

            if (192 <= a <= 206) is True:
                im = pf.getdata(imn, origin='lower')
                subim1 = im[340:390, 75:120]
                p1 = np.mean(subim1)

                t = SubfucntionToGetTime

                y1 = np.append(y1, p1)
                x = np.append(x, t)

However, when I run this function it returns arrays with no values. I believe my code is not working properly when it encounters a file without an appropriate a value, but I don't know how to fix this.

For additional reference, I have tested this on a smaller subgroup of FITS files that I know have the correct a values, and it works fine. That is why I suspect the code is being thrown off by files with the wrong a values, since the first few files don't have the correct ones.


Solution

  • There's a lot going on here, and the code you posted isn't valid as written (it has indentation errors). I don't think there's a useful question here for Stack Overflow, because you're misusing a number of things without realizing it. That said, I want to be helpful, so I'm posting an answer rather than a comment because code is easier to format in an answer.

    First of all, I don't know what you want here:

    pathfile = '*_v0' + str(x) + '.fits'
    

    Because before this you have

    x = np.arange(0)
    

    So as you can check, str(x) is just a constant--the string '[]'. So you're saying you want a wildcard pattern that looks like '*_v0[].fits' which I doubt is what you want, but even if it is you should just write that explicitly without the str(x) indirection.
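
    A quick way to confirm this is to build the pattern and print it. The sketch below also shows the pattern from the question built from the pass number instead; the value 3 is only an illustrative assumption.

```python
import numpy as np

x = np.arange(0)                       # empty array, as in the question
pattern_from_x = '*_v0' + str(x) + '.fits'

print(repr(str(x)))                    # the string form of an empty array
print(pattern_from_x)                  # the wildcard pattern actually built

# Building the pattern from the pass number instead (3 is a made-up value)
n_pass = 3
pattern_from_pass = f'*_v0{n_pass}_stis_f25srf2_proj.fits'
print(pattern_from_pass)
```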

    Then in your loop over the glob.glob results you do:

    imn = 'FilePath' + str(name)
    

    name is already a string, so there is no need for str(name). I don't know why you're prepending 'FilePath', because glob.glob returns paths that match your wildcard pattern, including any directory part of the pattern. Why would you prepend something to the filename, then?
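
    As a small illustration, the sketch below creates a few empty files in a temporary directory (the file names are made up) and shows that glob.glob already returns paths that can be opened directly when the directory is part of the pattern.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as data_dir:
    # Create hypothetical, empty stand-in files
    for fname in ('obs_v01.fits', 'obs_v02.fits', 'notes.txt'):
        open(os.path.join(data_dir, fname), 'w').close()

    # Put the directory in the pattern; the results then include it
    pattern = os.path.join(data_dir, '*_v0*.fits')
    matches = sorted(glob.glob(pattern))

    # Every match is a complete path to an existing file
    all_exist = all(os.path.isfile(path) for path in matches)
    basenames = [os.path.basename(path) for path in matches]

print(basenames, all_exist)
```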

    Next you test (192 <= a <= 206) twice. You only need to check this once, and don't use is True and is False. The result of a comparison is already a boolean so you don't need to make this extra comparison.
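
    One concrete way the identity checks can misfire: if the value happens to be a NumPy scalar, the comparison yields a NumPy boolean, which is neither the object True nor the object False, so both branches are skipped. Whether the header value in the question is ever a NumPy scalar is an assumption here, not something the question establishes. A minimal sketch:

```python
import numpy as np

LOW, HIGH = 192, 206


def in_range(a):
    """Return a plain bool that is True when LOW <= a <= HIGH."""
    return bool(LOW <= a <= HIGH)


a = np.float64(199.5)                        # hypothetical NumPy scalar value

identity_check = (LOW <= a <= HIGH) is True  # compares object identity
truthiness_check = in_range(a)               # uses the truth value instead

values = [150.0, 192.0, 199.5, 206.0, 210.3]
kept = [v for v in values if in_range(v)]

print(identity_check, truthiness_check, kept)
```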

    Finally, there's not much advantage to using NumPy arrays here unless you're looping over thousands of FITS files. Using np.append to grow arrays is also very slow, since each call makes a new copy of the array. For most cases you could use Python lists and then--if desired--convert the list to a NumPy array. If you had to use a NumPy array to start with, you would pre-allocate an array of some size using np.zeros(). You might guess a starting size and then grow it only if needed. Since you're looping over a list of files you could use the number of files as that size, for example.
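
    Both approaches can be sketched as follows. The measurement values and the filter threshold are made-up stand-ins for the per-file results.

```python
import numpy as np

measurements = [0.5, 1.25, 2.0, 3.75]      # made-up stand-in values

# Option 1: collect in a Python list, convert once at the end
collected = []
for m in measurements:
    collected.append(m)
y_from_list = np.array(collected)

# Option 2: pre-allocate one slot per input, fill, then trim unused slots
y_prealloc = np.zeros(len(measurements))
count = 0
for m in measurements:
    if m > 1.0:                            # arbitrary filter for illustration
        y_prealloc[count] = m
        count += 1
y_prealloc = y_prealloc[:count]

print(y_from_list, y_prealloc)
```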

    Here's a rewrite of what I think you're trying to do in more idiomatic Python:

    
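    import glob

    import numpy as np
    from astropy.io import fits as pf


    def graph(n_pass):
        # n_pass is unused here because the intended filename pattern is unclear
        x = []
        y1 = []

        for filename in glob.glob('*_v0.fits'):
            header = pf.getheader(filename)
            a = header['a']
            if not (192 <= a <= 206):
                # We don't do any further processing for this file
                # for 'a' outside this range
                continue

            im = pf.getdata(filename, origin='lower')
            subim1 = im[340:390, 75:120]
            p1 = np.mean(subim1)
            t = get_time(...)  # placeholder for however you obtain the time
            y1.append(p1)
            x.append(t)

        # Return the collected values so the caller can use them
        return x, y1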
    

    You might also consider clearer variable names, etc. I'm sure this isn't exactly what you want to do but maybe this will help give you a little better structure to play with.