I want to compare some lists (e.g. l2, l3) to one big list (l1), based on the number of occurrences, for example:
l1 = ['s1', 's1', 's1', 's2']
l2 = ['s1', 's2']
l3 = ['s1', 's1', 's1']
In my scenario, l1 is closer to l3 because, when the number of occurrences is also considered, the only difference between l1 and l3 is ['s2'].
The usual approach of comparing list elements by converting them to a set and intersecting them does not work here, since the duplicates are removed.
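To illustrate the problem, here is a minimal sketch of the set-based approach using the example lists above (the names common_l2 and common_l3 are only for illustration):

```python
l1 = ['s1', 's1', 's1', 's2']
l2 = ['s1', 's2']
l3 = ['s1', 's1', 's1']

# Sets discard duplicates, so the three 's1' entries collapse into one
# and the occurrence counts are lost before the intersection is taken.
common_l2 = set(l1) & set(l2)
common_l3 = set(l1) & set(l3)

print(sorted(common_l2))  # ['s1', 's2']
print(sorted(common_l3))  # ['s1']
```

With sets, l2 appears to share more elements with l1 than l3 does, which is the opposite of what I want.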
I would like to have an output like this:
compare(l1,l2) = ['s1', 's2']
("These two elements of l2 were found in l1")
compare(l1,l3) = ['s1', 's1', 's1']
Is there an operator / a function to do so or a better data structure than a list?
You can use the intersection operator & of the collections.Counter class:
from collections import Counter
def compare(l1, l2):
    # & keeps each element with the minimum of its two counts
    return list((Counter(l1) & Counter(l2)).elements())
So that compare(l1, l2) returns:
['s1', 's2']
and compare(l1, l3) returns:
['s1', 's1', 's1']
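If you also want the leftover elements that the question describes as the "difference" between the lists, Counter subtraction may help. The following is only a sketch: the helper name difference is hypothetical, and it assumes the difference should be the elements of the big list that are not matched in the smaller one.

```python
from collections import Counter

def difference(big, small):
    # Counter subtraction subtracts counts per element and drops
    # anything whose count falls to zero or below.
    return list((Counter(big) - Counter(small)).elements())

l1 = ['s1', 's1', 's1', 's2']
l2 = ['s1', 's2']
l3 = ['s1', 's1', 's1']

print(difference(l1, l2))  # ['s1', 's1']
print(difference(l1, l3))  # ['s2']
```

A shorter result here means the list is closer to l1, so l3 (one leftover element) is closer than l2 (two leftover elements).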