I have a for loop that compares a substring of each element in a list of strings to the elements in another list of strings:
mylist = []
for x in list1:
    mat = False
    for y in list2:
        if x[:-14] in y:
            mat = True
    if not mat:
        mylist.append(x)
However, I would like to write it as a list comprehension (for loops aren't concise enough for my taste), but I can't find a way to do it with the calculation of mat.
I have tried variations on:
mylist = [x for x in list1 if x[:-14] in list2]
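But this is not the same logic as the original loop: it tests whether x[:-14] is equal to an element of list2 rather than a substring of one, and it keeps the matches instead of dropping them. Is there a way to rewrite the original for loop as a list comprehension?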
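A small sketch with made-up data shows the mismatch. The strings and the 14-character suffix below are assumptions chosen only for illustration; they are not from my real lists.

```python
# Hypothetical sample data: every entry in list1 ends with a 14-character suffix.
SUFFIX = "20200101120000"
list1 = ["alpha" + SUFFIX, "beta" + SUFFIX]
list2 = ["see alpha here", "gamma"]

# Original loop: keep x only if x[:-14] is NOT a substring of any element of list2.
loop_result = []
for x in list1:
    mat = False
    for y in list2:
        if x[:-14] in y:
            mat = True
    if not mat:
        loop_result.append(x)

# Attempted comprehension: keeps x if x[:-14] EQUALS some element of list2.
attempt = [x for x in list1 if x[:-14] in list2]

print(loop_result)  # only the "beta" entry survives
print(attempt)      # empty, because no element of list2 equals "alpha" or "beta"
```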
As it is written, no, you cannot directly write it as a list comprehension, because there is no place in a comprehension for intermediate variables like mat.
However, you can refactor the computation of the intermediate variable into a separate function:
def check_match(x, seq):
    "checks if text excluding the last 14 characters of string `x` is a substring of any elements of the list `seq`"
    mat = False
    for y in seq:
        if x[:-14] in y:
            mat = True
    return mat

mylist = []
for x in list1:
    mat = check_match(x, list2)
    if not mat:
        mylist.append(x)
Now we can move the function call directly into the if not condition. Also, in your particular case you are applying the logic of the built-in any() function, so you can simplify your function:
def check_match(x, seq):
    "checks if text excluding the last 14 characters of string `x` is a substring of any elements of the list `seq`"
    return any((x[:-14] in y) for y in seq)

mylist = []
for x in list1:
    if not check_match(x, list2):
        mylist.append(x)
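One practical difference between the flag loop and any() is that any() stops at the first match, while the flag version always scans the whole sequence. The sketch below illustrates this; the names check_match_flag, check_match_any and recording, as well as the sample strings, are hypothetical and exist only for this demonstration.

```python
def check_match_flag(x, seq):
    """Flag-based version: always scans the whole sequence."""
    mat = False
    for y in seq:
        if x[:-14] in y:
            mat = True
    return mat


def check_match_any(x, seq):
    """any()-based version: stops as soon as one element matches."""
    return any(x[:-14] in y for y in seq)


def recording(seq, seen):
    """Yield items from seq, recording each one that is actually consumed."""
    for item in seq:
        seen.append(item)
        yield item


SUFFIX = "20200101120000"  # hypothetical 14-character suffix
x = "alpha" + SUFFIX
candidates = ["alpha one", "beta two", "alpha three"]

seen_flag, seen_any = [], []
result_flag = check_match_flag(x, recording(candidates, seen_flag))
result_any = check_match_any(x, recording(candidates, seen_any))

print(result_flag, result_any)        # both versions report a match
print(len(seen_flag), len(seen_any))  # the any() version consumed fewer items
```

Both versions return the same answer, so the simplification does not change which elements end up in mylist.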
Now that we don't have any intermediate variables, it is pretty straightforward to convert:
mylist = [x for x in list1 if not any((x[:-14] in y) for y in list2)]
# or if your case can't be easily refactored into a single expression just leave it as a separate function
# mylist = [x for x in list1 if not check_match(x, list2)]
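As a quick sanity check, the following sketch runs the original nested loop and the comprehension side by side and compares the results. The sample data is invented for illustration; substitute your own list1 and list2.

```python
# Hypothetical sample data with a 14-character suffix on every list1 entry.
SUFFIX = "20200101120000"
list1 = ["alpha" + SUFFIX, "beta" + SUFFIX, "gamma" + SUFFIX]
list2 = ["see alpha here", "gamma rays"]

# Original nested loop with the intermediate flag.
loop_result = []
for x in list1:
    mat = False
    for y in list2:
        if x[:-14] in y:
            mat = True
    if not mat:
        loop_result.append(x)

# Comprehension using any(), as derived above.
comp_result = [x for x in list1 if not any(x[:-14] in y for y in list2)]

print(comp_result)
print(loop_result == comp_result)
```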
Alternatively, if you keep it as a separate function that returns True for the elements to keep and takes only the element as an argument, you could then use filter() directly:
def is_not_in_list2(x):
    "checks if x[:-14] is *not* a substring of any elements of `list2`"
    return not any((x[:-14] in y) for y in list2)

myiterator = filter(is_not_in_list2, list1)
# you can loop over `myiterator` once and it will be very memory efficient, or if you need it as a proper list just pass it to the list constructor
mylist = list(myiterator)
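To illustrate the "loop over it once" caveat, here is a minimal sketch assuming Python 3, where filter() returns a lazy iterator. The sample data is made up; once the iterator has been consumed, a second pass yields nothing.

```python
# Hypothetical sample data with a 14-character suffix on every list1 entry.
SUFFIX = "20200101120000"
list1 = ["alpha" + SUFFIX, "beta" + SUFFIX]
list2 = ["see alpha here"]


def is_not_in_list2(x):
    """Return True if x[:-14] is not a substring of any element of list2."""
    return not any(x[:-14] in y for y in list2)


myiterator = filter(is_not_in_list2, list1)

first_pass = list(myiterator)   # consumes the iterator
second_pass = list(myiterator)  # nothing is left to yield

print(first_pass)
print(second_pass)
```

If you need to traverse the result more than once, build the list once and reuse that instead of the iterator.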