Search code examples
javaalgorithmknuth-morris-pratt

Java: KMP matcher algorithm


enter image description here

I am trying to implement the above algorithm in Java. However I am getting an out of bounds exception and I don't know how to fix this.

I am just translating the psuedocode pretty much line by line.

Here is the code:

public static int[] computePrefixFunction(String input) 
    {
        int[] pi = new int[input.length()];
        int k = 0;
        for (int q = 1; q < input.length(); q++) {
            char target = input.charAt(q);
        while (k > 0 && input.charAt(k) != target) k = pi[k - 1];
        if (input.charAt(k) == target) k++;
        pi[q] = k;
    }
    return pi;
}

public static Queue<Integer> KMPMatcher(String T, String P) 
{
    int n = T.length();
    int m = P.length();
    int[] pi = computePrefixFunction(P);
    int q = 0;
    Queue<Integer> Q = new LinkedList<>();
    for(int i = 0; i < n; i++) 
    {
        while(q > 0 && P.charAt(q+1) != T.charAt(i)) 
            q = pi[q];
        if(P.charAt(q+1) == T.charAt(i)) 
            q = q + 1;
        if(q == m-1) // you match it when q reaches size of pattern -1. :)
        {
            Q.add(i-m+1); // Change it as well.
            q = pi[q];
        }     
    }
    return Q;
}

public static void main(String[] args) {
    System.out.println(KMPMatcher("bdacabdacb","bda"));
}

Edit: I have updated the code with piyush implementation below which corrected a few of my problems. However there is another problem.

I tested KMPMatcher using these:

1) System.out.println(KMPMatcher("bacabab","bab")); // returned[2,4]

2) System.out.println(KMPMatcher("bdacabdacb","bab")); // returned[3]

Number 1 should only return 4 and number 2 should only return an empty list. Why is this happening? I am trying to draw at the trace with these inputs and compare it with the psuedocode. I think it something to do with the indexing in the if(q==m-1)(because its not comparing the right thing compared to the psuedocode version?) and I'm not sure how to fix it. Any help please?


Solution

  • The problem is in if statement. It should not be if (q == m-1).

    public static int[] computePrefixFunction(String input) 
        {
            int[] pi = new int[input.length()];
            int k = 0;
            for (int q = 1; q < input.length(); q++) {
                char target = input.charAt(q);
            while (k > 0 && input.charAt(k) != target) k = pi[k - 1];
            if (input.charAt(k) == target) k++;
            pi[q] = k;
        }
        return pi;
    }
    
       public static Queue<Integer> KMPMatcher(String T, String P) 
    {
        int n = T.length();
        int m = P.length();
        int[] pi = computePrefixFunction(P);
        int q = 0;
        Queue<Integer> Q = new LinkedList<>();
        for(int i = 0; i < n; i++) 
        {
            while(q > 0 && P.charAt(q) != T.charAt(i)) 
                q = pi[q-1];
            if(P.charAt(q) == T.charAt(i)) 
                q++;
            {
                Q.add(i-q+1); // Change it.
                q = pi[q-1];
            }     
        }
        return Q;
    }
    
    public static void main(String[] args) {
        System.out.println(KMPMatcher("bdacabdacb","bda"));
    }