Search code examples
javagraphpattern-matchingsubgraph

Any library or proposed solution for subgraph (path) matching within a graph?


For example, there is a graph, which can be represented as an adjacency matrix as

G = {{ 0, 1, 0 }, { 1, 0, 1 }, { 1, 0, 0 }}

Thus, there are four directed edges:

node_1 to node_2, node_2 to node_3, node_2 to node_1 and node_3 to node_1.

What I want is to calculate the similarity between subgraph (path) {node_2 to node_3} and subgraph (path) {node_2 to node_3 to node_1}.

What I can found the most is the subgraph isomorphism problem, which trying to determine if a subgraph matches (is a part of) a larger graph. This is not my desire.

My major task is to determine how similar two subgraphs (path) are, which both exist within a graph that I known.

Any existing approaches you could recommend? Papers? Example code?

Thanks in advance.


Solution

  • The Levenshtein distance measures the difference between two sequences by counting the number of single-element editions needed to change one sequence into the other.

    If you store the vertex sequence of each path in a list or array you can calculate the distance using the following implementation (assuming you have a Vertex type):

    int levenshteinDistance(List<Vertex> path, int lenPath, List<Vertex> other, int lenOther) {
        if (path.equals(other)) return 0;
        if (lenPath == 0) return lenOther;
        if (lenOther == 0) return lenPath;
    
        int cost;
        if (path.get(lenPath - 1).equals(other.get(lenOther - 1))) {
            cost = 0;
        } else {
            cost = 1;
        }
    
        int dist1 = levenshteinDistance(path, lenPath - 1, other, lenOther) + 1;
        int dist2 = levenshteinDistance(path, lenPath, other, lenOther - 1) + 1;
        int dist3 = levenshteinDistance(path, lenPath - 1, other, lenOther - 1) + cost;
        return Math.min(dist1, Math.min(dist2, dist3));
    }
    

    This is inefficient, though, as it recomputes the distance for many subsequences. A more efficient implementation can be found at http://rosettacode.org/wiki/Levenshtein_distance#Java. Note that it uses String as input, but it should be straightforward to reimplement it to use arrays.