python dictionary list-comprehension

Obtain group boolean test result by iterating through and testing individual elements of a list object in the values part of a dictionary


I have a question about a nested list comprehension that is looking through a dictionary: I am comparing a single value to a list of values in a dictionary, all under one key. I need to look at each value individually and make a boolean assessment, but I am unable to look into the list part of the dictionary values. I've looked at a number of related posts and tried the code but couldn't make it work.

I'm using a list comprehension to perform the first passthrough of the dictionary:

    var_dic = {'vars': ['junk', 'WDIR', 'WSPD', 'WDIR', 'WSPD']}
    [x for x in var_dic.values()]  

which returns Out[157]: [['junk', 'WDIR', 'WSPD', 'WDIR', 'WSPD']] and len(var_dic.values()) returns 1

And so when I run (assigning c='WSPD' for example) [c==i for i in [x for x in var_dic.values()]] it returns a single False.

When I instead run [c==i for i in x for x in var_dic.values()], it complains that x doesn't exist (see below; x is reset before the loop).

If x does exist, it returns a list of booleans, one test for each variable.

Hopefully this is enough to see what is going on. There is more info below, but I am trying a more sophisticated (for me) use of a dictionary and so I'm sure there are more appropriate approaches out there, or perhaps this approach simply needs the tweak I've been unable to find. I suspect someone will quickly be able to say "oh you can't do XY!". I checked through the posts stack suggested and tried many that seem similar, but couldn't quite get to the right place.

Thanks everyone.

************* More info section ****************

I have a series of variables that are to be read from the clipped data below.

0.20000E+01WDIR 0.12500E+02WSPD 0.00000E+00WDIR 0.00000E+00WSPD 0.10303E+04ATMS
0.10303E+04ATMS 0.58000E+01DRYT 0.74000E+01SSTP 0.15200E+02GSPD 0.00000E+00GSPD
0.54367E+02LTG$ 0.13244E+03LNG$ 0.73800E+01HAT$ 0.33000E-01LCF$ 0.40940E+04QCP$
0.00000E+00QCF$ 0.20170E+08UPD$

The data value and the variable name are read as a single string, then the little function (below) is run to separate the value from the variable name, and the value is stored in a dictionary under the variable name as its key. There are many time steps in the original data, so the dictionary accumulates data under each key. But, as can be seen, WDIR, WSPD, ATMS and GSPD are repeated, with the second occurrence being 0 in most cases. What I'm trying to do is run a test as each variable is read, to see if that variable is already in the list of variables read so far, and if it is, skip the append to the dictionary. That looks like this:

    #Function to read a single value-variable string pair
    def met_pull(xystr):
        val = float(xystr[0:7])
        exp = int(xystr[8:11])  # include the exponent sign, e.g. "-01" in 0.33000E-01LCF$
        outval = str(round(val*10**exp,1))
        charref = xystr[11:16]
        return outval,charref

(param_rows is the number of rows the variables occupy; four in this case)

(datadic is a dictionary with the data in it.)

            var_dic = {'vars':['junk']} #initialize a temporary list of variable IDs 
            for xx in range(param_rows): #(param_rows = all four rows of the data; use "next()" to force to next line
                zz=next(file)
                szz = zz.split()
                for xy in szz: #loop through all values read on a given line (i.e. 3 lines of 5 values and 1 line of two values)
                    v,c = met_pull(xy) # function defined above

                    if 'x' in globals(): # precautionary removal of x otherwise it has a full set of False/True values
                        del x
                        
                    if not any([c==i for i in [x for x in var_dic.values()]]):
                        if dic_flag:
                            datadic[c] = [v]
                        else:
                            datadic[c].append(v)
                    var_dic['vars'].append(c) #accumulate a list of variables that have been read in. 

This part is not working: if not any([c==i for i in [x for x in var_dic.values()]]): i takes the value of the entire list returned by [x for x in var_dic.values()], so c==i is never True because c is a single four-character string.

I was trying this as well: [c==i for i in x for x in var_dic.values()] but it was holding onto x as a full set of variables. When I introduced

    if 'x' in globals():
        del x

to re-initialize x, [c==i for i in x for x in var_dic.values()] gives an error, even though I thought x was supposed to be created on the fly.


Solution

  • If c is a str,

    any([c==i for i in [x for x in var_dic.values()]])
    

    is equivalent to the simplified:

    c in var_dic.values()
    

    And if var_dic is like {'vars': ['junk', 'WDIR', 'WSPD', 'WDIR', 'WSPD']}, then it is also equivalent to:

    c == ['junk', 'WDIR', 'WSPD', 'WDIR', 'WSPD']  # "is c equal to this list?"
    

    Whereas I think you were trying to implement:

    any(c == x for x in var_dic['vars'])
    

    Which, simplified, is:

    c in var_dic['vars']  # "is c one of these values?"
    
    # i.e.
    c in ['junk', 'WDIR', 'WSPD', 'WDIR', 'WSPD']
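As for the error with x: the for clauses in a comprehension run left to right, like nested for loops, so in [c==i for i in x for x in var_dic.values()] the first clause needs x to exist already. It only appeared to work when an x was left over from somewhere else. A minimal sketch of the difference, using the var_dic from the question; first_occurrences is a hypothetical helper (not part of the original code) that assumes you only want the first occurrence of each variable name within one time step:

```python
var_dic = {'vars': ['junk', 'WDIR', 'WSPD', 'WDIR', 'WSPD']}
c = 'WSPD'

# Original test: i is the whole list, so there is a single comparison.
outer_only = [c == i for i in [x for x in var_dic.values()]]

# To flatten, the outer loop comes first, then the inner one.
flattened = [c == i for x in var_dic.values() for i in x]

# The membership test does the same job without building a list.
found = c in var_dic['vars']

print(outer_only)  # [False]
print(flattened)   # [False, False, True, False, True]
print(found)       # True


def first_occurrences(pairs):
    """Keep only the first (value, name) pair for each name.

    Hypothetical helper: pairs is assumed to be the (outval, charref)
    tuples returned by met_pull for one time step.
    """
    seen = set()  # names already read in this time step
    kept = []
    for value, name in pairs:
        if name not in seen:
            seen.add(name)
            kept.append((value, name))
    return kept
```

A set makes the "already seen" check cheap and removes the need for the 'junk' placeholder; the kept pairs can then be appended to datadic as before.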