I have two long lists in the following format:
List1 = [(316, 187),
(316, 188),
(316, 189),
(316, 190),
(316, 191),
(316, 192),
(316, 193),
(317, 186),
(317, 187),
(317, 188),
(317, 189),
(317, 190)]
and so on till 1000 records
List2 = [(180, 118),
(180, 119),
(180, 120),
(180, 121),
(180, 122),
(180, 123),
(180, 124),
(180, 125),
(180, 126),
(180, 127),
(180, 128),
(180, 129),
(180, 130),]
and so on till 100,000 records.
I need to compare List1 with List2, check which tuples of List1 are present in List2, and return the index in List2 of every matching tuple.
I tried a for loop that goes through all the tuples and returns the index of each matching tuple, but that took too long: 30 seconds.
So I tried converting both lists into sets and taking the intersection to find the common tuples:
set(List1) & set(List2)
Here the comparison is very fast and takes just a second to return all the matching tuples. However, my requirement is to get the index of each matching tuple in List2, and adding another for loop to find those indexes would put me back at square one. Note also that almost every time, all tuples of List1 will be in List2, so I just need to find the indexes in the quickest possible time.
Please suggest a method or algorithm for this; I would be really grateful. Thanks in advance!
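Instead of using a set, you can build a dictionary that maps each tuple in List2 to its index. Dictionary lookups take constant time on average, just like set lookups.
Building the dictionary is a single pass over List2, and looking up every item of List1 is a single pass over List1, so the whole operation is O(len(List1) + len(List2)) rather than the O(len(List1) * len(List2)) of a nested loop.
If a tuple occurs more than once in List2, the dictionary keeps the index of its last occurrence: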
# maps tuples to indexes
lookup = {t: i for i, t in enumerate(List2)}
# get indexes from List2 for tuples in List1
indexes = [lookup[t] for t in List1 if t in lookup]
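If List2 can contain duplicate tuples, it may matter which index you get back. The following is a minimal sketch, not part of the original answer, that compares three variants on small stand-in data: the last index (what the comprehension above gives), the first index, and every index. The sample values and the names last_index, first_index and all_indexes are only illustrative assumptions.

```python
from collections import defaultdict

# Small stand-in data (assumed); the real lists are much longer.
List1 = [(316, 187), (316, 188), (999, 999)]
List2 = [(180, 118), (316, 188), (316, 187), (316, 188)]

# Last index of each tuple: later duplicates overwrite earlier ones.
last_index = {t: i for i, t in enumerate(List2)}

# First index of each tuple: setdefault only stores a key the first time.
first_index = {}
for i, t in enumerate(List2):
    first_index.setdefault(t, i)

# Every index of each tuple.
all_indexes = defaultdict(list)
for i, t in enumerate(List2):
    all_indexes[t].append(i)

# Tuples of List1 that are missing from List2 are skipped.
matches_last = [last_index[t] for t in List1 if t in last_index]
matches_first = [first_index[t] for t in List1 if t in first_index]
matches_all = {t: all_indexes[t] for t in List1 if t in all_indexes}

print(matches_last)   # [2, 3]
print(matches_first)  # [2, 1]
print(matches_all)    # {(316, 187): [2], (316, 188): [1, 3]}
```

All three variants still make one pass over List2 and one pass over List1, so the running time stays linear in the combined length of the lists.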