I have two lists old
and new
, with the same number of elements.
I'm trying to write an efficient function that takes n
as a parameter, compares the elements of two lists at the same locations (by index), finds n
largest differences, and returns the indices of those n
elements.
I was thinking this would be best solved by a value-sorted dictionary, but one isn't available in Python (and I'm not aware of any libraries that offer it). Perhaps there's a better solution?
Whenever you think "n largest", think heapq
.
>>> import heapq
>>> import random
>>> l1 = [random.randrange(100) for _ in range(100)]
>>> l2 = [random.randrange(100) for _ in range(100)]
>>> heapq.nlargest(10, (((a - b), a, b) for a, b in zip(l1, l2)))
[(78, 99, 21), (75, 86, 11), (69, 90, 21), (69, 70, 1), (60, 86, 26), (55, 95, 40), (52, 56, 4), (48, 98, 50), (46, 80, 34), (44, 81, 37)]
This will find the x largest items in O(n log x) time, where n is the total number of items in the list; sorting does it in O(n log n) time.
It just occurred to me that the above doesn't actually do what you asked for. You want an index! Still very easy. I'll also use abs
here in case you want the absolute value of the difference:
>>> heapq.nlargest(10, xrange(len(l1)), key=lambda i: abs(l1[i] - l2[i]))
[91, 3, 14, 27, 46, 67, 59, 39, 65, 36]