python string conditional-statements logarithm

Python: string comparison with double conditions

I'm trying to search two lists for common strings. The first list is a file with text, while the second is a list of words, each preceded by its logarithmic probability. To match, a word not only needs to be in both lists but must also have a certain minimum log probability (for instance, between -2.123456 and 0.000000, that is, from about negative 2 up to 0). The tab-separated list can look like this:

-0.962890   dog
-1.152454   lol
-2.050454   cat

I got stuck doing something like this:

common = []
for i in list1:
    if i in list2 and re.search("\-[0-1]\.[\d]+", list2):
        common.append(i)

Preprocessing the list to remove lines below a certain threshold is of course a valid approach, but since both the word and its probability are on the same line, isn’t a single condition also possible? (Regexps aren’t necessary, but for comparison, solutions both with and without them would be interesting.)

EDIT: My own answer to this question is below.

Solution

I'm answering my own question after hours of trial and error and reading tips from here and there. It turns out I was thinking in the right direction from the start, but I needed to separate word detection from pattern matching, and instead combine the pattern matching with the log probability check. The code therefore builds a temporary list of items with the required log probability and then simply compares that against the text file.

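    import re

    common = []
    prob = []
    loga, rithmus = -9.87, -0.01  # lower and upper log-probability bounds (exclusive)

    # list2 is the whole tab-separated file read as one string;
    # capture both the number and the word that follows it
    for value, word in re.findall(r"(-?\d+\.\d+)\s+(\S+)", list2):
        if loga < float(value) < rithmus:
            prob.append(word)

    # list1 is an iterable of words from the text file
    for i in list1:
        if i in prob:  # exact word match, not a substring test
            common.append(i)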
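
For comparison, here is a minimal sketch of the same idea without regular expressions, since the question asked for both variants. It assumes each line of the probability file has exactly two whitespace-separated fields (log probability, then word). The helper names `load_log_probs` and `common_words`, the sample sentence, and the inclusive bounds are illustrative assumptions, not part of the original code.

```python
def load_log_probs(lines):
    """Parse 'logprob<TAB>word' lines into a {word: logprob} dict."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        value, word = parts
        try:
            table[word] = float(value)
        except ValueError:
            continue  # first field was not a number
    return table


def common_words(words, table, low, high):
    """Keep words found in the table whose log probability lies in [low, high]."""
    return [w for w in words if w in table and low <= table[w] <= high]


# Sample data taken from the question; the sentence is made up for illustration.
sample = ["-0.962890\tdog", "-1.152454\tlol", "-2.050454\tcat"]
table = load_log_probs(sample)
text_words = "the dog chased the cat lol".split()
result = common_words(text_words, table, -2.0, 0.0)
print(result)  # ['dog', 'lol']
```

Storing the probabilities in a dict lets the membership test and the threshold test sit in a single condition, which is what the question was originally after.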