I have two elements, x
and Y
. x
is a very long list of characters. Y
is an even longer list, such that every element of it is a set. For every element in x
, I would like to find the index of the set in Y
that contains it.
x
are covered by the sets in Y
.y
in Y
such that it contains an element that is not on x
.Y = [{"a", "b"}, {"c", "d"}, {"e", "f"}]
x = ["a", "b", "c", "d", "e", "f", "a", "f"]
# Desired results:
[('a', 0),
('b', 0),
('c', 1),
('d', 1),
('e', 2),
('f', 2),
('a', 0),
('f', 2)]
Frankly, I am ok with any shape of such results, as long as the computation is efficient.
You can create an index first to speed up the search:
Y = [{"a", "b"}, {"c", "d"}, {"e", "f"}]
x = ["a", "b", "c", "d", "e", "f", "a", "f"]
index = {v: i for i, s in enumerate(Y) for v in s}
out = [(v, index[v]) for v in x]
print(out)
Prints:
[("a", 0), ("b", 0), ("c", 1), ("d", 1), ("e", 2), ("f", 2), ("a", 0), ("f", 2)]