While this question may seem to be related to previous ones (like this one: Python, compute list difference), it is not exactly the same, and even the best rated answer containing two suggestions will not exactly answer the following one.
I have a main (unordered) list L
containing values with duplicates; take for instance a list of integers:
L = [3, 1, 4, 1, 5, 9, 2, 6, 5]
I have a smaller list contaning a choice of values from L, for instance:
x = [4, 1, 3]
The order of the elements in x
is not related in any way to the order of the elements in L
.
Now, I would like to compute the difference L-x
in such a way that concatenating x
and this difference would give the same list than L
(except for the order); to be more precise:
list(sorted(x + D(L,x))) == list(sorted(L))
The first bad idea is obviously to use sets, since duplicated will not be handled correctly.
The second bad idea is to use some list comprehension with a filter like:
[ e for e in L if e not in x ]
since the value 1
in my example will be discarded though one instance of this value should occur in the expected difference.
As far as I can see, the most efficient way of doing it would be to sort both lists then iterate on both lists (an iterator could be helpful) and carefully take duplicates into account; this would be a O(n log n) solution.
I am not looking for speed; I rather wonder if some concise pythonic syntax could do it; even O(n²) or worse could be acceptable if it could do the expected task in one or two lines.
You want the multiset operations provided by collections.Counter
:
>>> L = [3, 1, 4, 1, 5, 9, 2, 6, 5]
>>> x = [4, 1, 3]
>>> list((Counter(L) - Counter(x)).elements())
[1, 5, 5, 9, 2, 6]
This is O(n). You can also preserve order and maintain O(n) using an OrderedCounter
if necessary.
from collections import Counter, OrderedDict
class OrderedCounter(Counter, OrderedDict):
pass