python list comparison

Check how many elements of one list are contained in a second list, including duplicates


I want to compare some lists (e.g. l2, l3) to one big list (l1), based on the number of occurrences, for example:

l1 = ['s1', 's1', 's1', 's2']
l2 = ['s1', 's2']
l3 = ['s1', 's1', 's1']

In my scenario, l1 is closer to l3 because, when the number of occurrences is also considered, the difference between l1 and l3 is only ['s2'].

The usual approach of comparing list elements by converting them to a set and intersecting them does not work here, since the duplicates are removed.

I would like to have an output like this:

compare(l1, l2) = ['s1', 's2']  ("These two elements of l2 were found in l1")
compare(l1, l3) = ['s1', 's1', 's1']

Is there an operator / a function to do so or a better data structure than a list?


Solution

  • You can use the intersection operator & of the collections.Counter class:

    from collections import Counter

    def compare(l1, l2):
        # Counter & Counter keeps each element with the minimum of its two counts
        return list((Counter(l1) & Counter(l2)).elements())
    

    With the lists from the question, compare(l1, l2) returns:

    ['s1', 's2']

    and compare(l1, l3) returns:

    ['s1', 's1', 's1']
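
The question also describes closeness in terms of what is left over ("the difference between l1 and l3 is only ['s2']"). As a minimal sketch, the same Counter arithmetic could express that with the subtraction operator, which drops elements whose count falls to zero or below. The helper name `missing` is hypothetical and not part of the original answer.

```python
from collections import Counter

def compare(big, small):
    """Multiset intersection: shared elements, duplicates included."""
    return list((Counter(big) & Counter(small)).elements())

def missing(big, small):
    """Hypothetical helper: elements of big that small does not account for."""
    return list((Counter(big) - Counter(small)).elements())

l1 = ['s1', 's1', 's1', 's2']
l2 = ['s1', 's2']
l3 = ['s1', 's1', 's1']

print(compare(l1, l2))  # ['s1', 's2']
print(missing(l1, l2))  # ['s1', 's1']
print(missing(l1, l3))  # ['s2']
```

Under this reading, l3 is closer to l1 than l2 is, because fewer elements of l1 are left unmatched (one versus two).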