Search code examples
pythonalgorithmscalagraph

Discover relationship between the entities


I have a dataset like below -

List((X,Set(" 1", " 7")), (Z,Set(" 5")), (D,Set(" 2")), (E,Set(" 8")), ("F ",Set(" 5", " 9", " 108")), (G,Set(" 2", " 11")), (A,Set(" 7", " 5")), (M,Set(108)))

Here X is related to A as 7 is common between them

Z is related to A as 5 is common between them

F is related to A as 5 is common between them

M is related to F as 108 is common between them

So, X, Z, A, F and M are related

D and G are related as 2 is common between them

E is not related to anybody

So, the output would be ((X, Z, A, F, M), (D,G), (E))

Order doesn't matter here.

I have used Scala here, but solution in Scala/Python or a pseudocode would work for me.


Solution

  • Ultimately using a simpler solution in python as below:

    data=[
    ["X",{"1", "7"}],
    ["Z",{"5",}],
    ["D",{"2",}],
    ["E",{"8",}],
    ["F",{"5", "9", "108"}],
    ["G",{"2", "11"}],
    ["A",{"7", "5"}],
    ["M",{"108"}]
    ]
    
    for i in range(len(data)):
        for j in range(len(data)):
            if(data[i][1].intersection(data[j][1])):
                if(data[i][0]!=data[j][0] ):
                    data[i][1] = data[j][1] = (data[i][1]).union(data[j][1])
    for k, g in groupby(sorted([[sorted(tuple(d[1])),d[0]] for d in data]), key=lambda x: x[0]):
        print(list(l[1] for l in g))
    

    Getting output as :

    ['A', 'F', 'M', 'X', 'Z']
    ['D', 'G']
    ['E']
    

    Tested for few more datasets and it seems to be working fine.