Search code examples
algorithmgraphclassificationknnvertices

Algorithm for classification of vertices in a graph based on their weight


I have the following complete weighted graph, where each weight represents the probability of a vertice belonging to the same category as the next. I know a priori the category for which some of the vertices belong to; how would I be able to classify every other vertice?

image

In a more detailed manner I can describe the problem as following; From all the vertices N and clusters C, we have a set where we know for sure the specific cluster which a node belongs: P(v_n|C_n)=1. From the graph given we also know for each node, the probability of every other belonging to the same cluster as it: P(v_n1∩C_n2). From this, how can we estimate the cluster for every other node?


Solution

  • Let w_i be a vector where w_i[j] is the probability of node j, being in the cluser, at iteration i.

    We define w_i:

    w_0[j] =  1       j is given node in the class
              0       otherwise
    w_{i}[j] = P(j | w_{i-1})
    

    Where: P(j | w_{i-1}) is the probability j being in the cluster, assuming we know the probabilities for each other node k to be in it, as w_{i-1}[k].

    We can calculate the above probability:

    P(j | w_{i-1}) = 1- (1- w_{i-1}[0]*c(0,j))*(1- w_{i-1}[1]*c(1,j))*...*(1- w_{i-1}[n-1]*c(n-1,j))
    

    in here:

    • w_{i-1} is the output of last iteration.
    • c(x,y) is the weight of edge (x,y)
    • c(x,x) = 1

    Repeat until convergence, and in the converged vector (let it be w), the probability of j being in the cluster is w[j]


    Explanation for the probability function:

    In order for a node NOT to be in the set, it needs all the other nodes will "decide" not to share it.
    So, the probability for that happening is:

    (1- w_{i-1}[0]*c(0,j))*(1- w_{i-1}[1]*c(1,j))*...*(1- w_{i-1}[n-1]*c(n-1,j))
          ^                            ^                       ^
    node 0 doesn't share      node 1 doesn't share     node n-1 doesn't share
    

    In order to be in the class, at least one node need to "share", so the probability for that happening is the complemantory, which is the formula we derived for P(j | w_{i-1})