Conditional statement regarding various regex and length of a list in Python

I have the following lists:

  ['E12.2', 'E16.1', 'E15.1']
  ['E10.1', 'I11.2', 'I10.1_27353757']
  ['E16.1', 'E18.1', 'E17.3']
  ['E1.8', 'I12.1_111682336', 'I12.1_111682195']
  ['E55.1', 'E57.1', 'E56.1','E88.1']
  ['U22.3', 'U22.6_13735517', 'U23.1']

and I want a condition that selects the lists that a) have length equal to 3, b) do not contain '_' in any item, and c) do not contain the letter 'U' in any item. I am trying to implement this in one line. How do I do that? I have the following condition working, and I know the re module can be used to match patterns against list items, but can I combine all the conditions in a single line?

 if(len(fin_list) == 3)

Solution

This is one possible way:

lists = [['E12.2', 'E16.1', 'E15.1'],
         ['E10.1', 'I11.2', 'I10.1_27353757'],
         ['E16.1', 'E18.1', 'E17.3'],
         ['E1.8', 'I12.1_111682336', 'I12.1_111682195'],
         ['E55.1', 'E57.1', 'E56.1','E88.1'],
         ['U22.3', 'U22.6_13735517', 'U23.1']]

for lst in lists:
    # Keep lists of exactly three items in which no item contains '_' or 'U'.
    if len(lst) == 3 and not any('_' in item or 'U' in item for item in lst):
        print(lst)

# Output:
# ['E12.2', 'E16.1', 'E15.1']
# ['E16.1', 'E18.1', 'E17.3']

The interesting bit here is the use of any over a generator expression. To break it down, the generator expression iterates over each item in lst and tests whether _ or U is in it, yielding True or False for each item. any consumes those values one at a time and returns True as soon as it sees the first True, without testing the remaining items. If it never sees one, it returns False.
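To see that short-circuiting in action, here is a minimal sketch. The contains_marker helper and the checked log are illustrative names introduced only for this demonstration; they are not part of the answer's code.

```python
checked = []  # records which items were actually tested


def contains_marker(item):
    """Hypothetical helper: log the item, then test it for '_' or 'U'."""
    checked.append(item)
    return '_' in item or 'U' in item


lst = ['E10.1', 'I10.1_27353757', 'I11.2']
result = any(contains_marker(item) for item in lst)

print(result)   # True
print(checked)  # ['E10.1', 'I10.1_27353757'], so the third item is never tested
```

Because the generator is lazy, any stops pulling items as soon as one test succeeds, which is why the last item never reaches the helper.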

EDIT

Okay, we've clearly moved into "just because you can doesn't mean you should" territory, but here's a solution that incorporates the new condition introduced in the comments:

from collections import Counter
import re

lists = [['E12.2', 'E16.1', 'E15.1'],
         ['E10.1', 'I11.2', 'I10.1_27353757'],
         ['E16.1', 'E18.1', 'E17.3'],
         ['E1.8', 'I12.1_111682336', 'I12.1_111682195'],
         ['E55.1', 'E57.1', 'E56.1','E88.1'],
         ['U22.3', 'U22.6_13735517', 'U23.1'],
         ['E7.2', 'E9.5', 'E9.3']]

for lst in lists:
    if (len(lst) == 3 and not any('_' in item or 'U' in item for item in lst) and
            (Counter(match.group(1) for match in [re.match(r'E(\d+)\.', item) for item in lst] if match is not None)
             .most_common(1) or [(None, 1)])[0][1] == 1):
        print(lst)

# Output:
# ['E12.2', 'E16.1', 'E15.1']
# ['E16.1', 'E18.1', 'E17.3']

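Counter counts things, re.match tries to find the number that follows a leading E in each item, and the .most_common(1) or [(None, 1)] makes sure that even if there are no matching elements, we can still index into the result and read the greatest number of occurrences. The whole check passes only when that greatest count is 1, that is, when no E number appears more than once in the list.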
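The pieces of that expression are easier to follow when unpacked step by step. The sketch below is only an illustration: max_e_repeats is a hypothetical name, and the sample lists are taken from or modeled on the data above.

```python
from collections import Counter
import re


def max_e_repeats(lst):
    """Return how often the most frequent E number occurs in lst (1 if no item matches)."""
    # One match object (or None) per item; the group captures the digits after 'E'.
    matches = [re.match(r'E(\d+)\.', item) for item in lst]
    counts = Counter(m.group(1) for m in matches if m is not None)
    # most_common(1) returns [] for an empty Counter, so fall back to a dummy pair.
    top = counts.most_common(1) or [(None, 1)]
    return top[0][1]


print(max_e_repeats(['E7.2', 'E9.5', 'E9.3']))     # 2, because '9' occurs twice
print(max_e_repeats(['E12.2', 'E16.1', 'E15.1']))  # 1
print(max_e_repeats(['I11.2', 'I10.1']))           # 1, the fallback for no E items
```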

Although the earlier code was okay, this is now terrible code, and the condition should be moved out into a separate function instead. :-)
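One way that refactoring might look is sketched here. The function names are made up for illustration, the length test is the len(...) == 3 condition stated in the question, and the repeat check swaps the Counter lookup for a simpler comparison of list size against set size, so treat it as one possible arrangement rather than the definitive version.

```python
import re

E_NUMBER = re.compile(r'E(\d+)\.')  # digits that follow a leading 'E'


def has_forbidden_chars(lst):
    """True if any item contains '_' or 'U'."""
    return any('_' in item or 'U' in item for item in lst)


def has_repeated_e_number(lst):
    """True if the same E number appears in more than one item."""
    numbers = [m.group(1) for m in map(E_NUMBER.match, lst) if m]
    return len(numbers) != len(set(numbers))


def keep(lst):
    """Combine the three conditions into one readable predicate."""
    return (len(lst) == 3
            and not has_forbidden_chars(lst)
            and not has_repeated_e_number(lst))


lists = [['E12.2', 'E16.1', 'E15.1'],
         ['E10.1', 'I11.2', 'I10.1_27353757'],
         ['E16.1', 'E18.1', 'E17.3'],
         ['E1.8', 'I12.1_111682336', 'I12.1_111682195'],
         ['E55.1', 'E57.1', 'E56.1', 'E88.1'],
         ['U22.3', 'U22.6_13735517', 'U23.1'],
         ['E7.2', 'E9.5', 'E9.3']]

kept = [lst for lst in lists if keep(lst)]
print(kept)
# [['E12.2', 'E16.1', 'E15.1'], ['E16.1', 'E18.1', 'E17.3']]
```

With the predicate factored out, the filtering itself does fit on a single line (the list comprehension), while each condition stays individually testable.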