I have a dataset like below -
List((X,Set(" 1", " 7")), (Z,Set(" 5")), (D,Set(" 2")), (E,Set(" 8")), ("F ",Set(" 5", " 9", " 108")), (G,Set(" 2", " 11")), (A,Set(" 7", " 5")), (M,Set(108)))
Here X is related to A as 7 is common between them
Z is related to A as 5 is common between them
F is related to A as 5 is common between them
M is related to F as 108 is common between them
So, X, Z, A, F and M are related
D and G are related as 2 is common between them
E is not related to anybody
So, the output would be ((X, Z, A, F, M), (D,G), (E))
Order doesn't matter here.
I have used Scala here, but solution in Scala/Python or a pseudocode would work for me.
Ultimately using a simpler solution in python as below:
data=[
["X",{"1", "7"}],
["Z",{"5",}],
["D",{"2",}],
["E",{"8",}],
["F",{"5", "9", "108"}],
["G",{"2", "11"}],
["A",{"7", "5"}],
["M",{"108"}]
]
for i in range(len(data)):
for j in range(len(data)):
if(data[i][1].intersection(data[j][1])):
if(data[i][0]!=data[j][0] ):
data[i][1] = data[j][1] = (data[i][1]).union(data[j][1])
for k, g in groupby(sorted([[sorted(tuple(d[1])),d[0]] for d in data]), key=lambda x: x[0]):
print(list(l[1] for l in g))
Getting output as :
['A', 'F', 'M', 'X', 'Z']
['D', 'G']
['E']
Tested for few more datasets and it seems to be working fine.