algorithm, pattern-matching, string-matching, suffix-array

How do we construct the LCP-LR array from the LCP array?


To find the number of occurrences of a given string P (length m) in a text T (length N), we can use binary search against the suffix array of T.

The issue with using standard binary search (without the LCP information) is that each of the O(log N) steps compares P to the current entry of the suffix array, which means a full string comparison of up to m characters. So the complexity is O(m*log N).
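For reference, the plain binary search described above might look like the sketch below. The names `buildSuffixArray` and `countOccurrences` are hypothetical, the suffix array is built naively for brevity, and suffixes are assumed to be stored as 0-based start positions in T.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

// Naive suffix array construction (sorts suffixes by direct comparison).
// Good enough for a demonstration; real code would use a faster method.
std::vector<int> buildSuffixArray(const std::string& text) {
    std::vector<int> sa(text.size());
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return text.compare(a, std::string::npos, text, b, std::string::npos) < 0;
    });
    return sa;
}

// Counts occurrences of pattern in text using two plain binary searches.
// Every probe compares up to m characters, which gives O(m * log N) overall.
int countOccurrences(const std::string& text, const std::vector<int>& sa,
                     const std::string& pattern) {
    assert(sa.size() == text.size());
    const std::size_t m = pattern.size();
    // First suffix whose m-character prefix is >= pattern.
    auto lo = std::lower_bound(sa.begin(), sa.end(), pattern,
        [&](int suffix, const std::string& p) {
            return text.compare(suffix, m, p) < 0;
        });
    // First suffix whose m-character prefix is > pattern.
    auto hi = std::upper_bound(sa.begin(), sa.end(), pattern,
        [&](const std::string& p, int suffix) {
            return text.compare(suffix, m, p) > 0;
        });
    return static_cast<int>(hi - lo);
}
```

With T = "banana", the suffixes that start with "ana" form one contiguous block in the suffix array, so `countOccurrences(text, sa, "ana")` returns 2.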

The LCP-LR array helps improve this to O(m+log N).

How do we precompute the LCP-LR array from the LCP array?

And how does LCP-LR help in finding the number of occurrences of a pattern?

Please explain the algorithm with an example.

Thank you


Solution

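  • // Note that arrSize is O(N).
    // int arrSize = 2 * (1 << (floor(log2(N)) + 1)) + 1; // tree nodes are indexed from 1
    
    // LCP = new int[N + 1]; // positions 1..N are used, to match the calls below
    // fill the LCP...
    // LCP_LR = new int[arrSize];
    // std::fill(LCP_LR, LCP_LR + arrSize, INT_MAX); // memset would set bytes, not ints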
    
    // init: buildLCP_LR(1, 1, N);
    // LCP_LR[1] covers [1..N]
    // LCP_LR[2] covers [1..(N+1)/2]
    // LCP_LR[3] covers [(N+1)/2+1 .. N]
    
    // If LCP_LR[i] covers a range, its children cover the two halves:
    //   left half:  LCP_LR[2 * i]
    //   right half: LCP_LR[2 * i + 1]
    // ...and so on, down to single positions.
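    void buildLCP_LR(int index, int low, int high)
    {
        // Leaf: the range is a single position, so copy its LCP value.
        if(low == high)
        {
            LCP_LR[index] = LCP[low];
            return;
        }
    
        int mid = (low + high) / 2;
    
        // Build both halves first, then combine them.
        buildLCP_LR(2*index, low, mid);
        buildLCP_LR(2*index+1, mid + 1, high);
    
        // The value for [low..high] is the minimum over the two halves.
        LCP_LR[index] = min(LCP_LR[2*index], LCP_LR[2*index + 1]);
    }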
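A self-contained variant of the same recursion, as a rough sketch rather than a definitive implementation. It assumes 0-based indexing and the convention that `lcp[i]` is the LCP of the suffixes at ranks i and i+1 in the suffix array, so the minimum over `lcp[a..b-1]` is the LCP of the suffixes at ranks a and b. The struct `LcpTree` and the helpers `query` and `lcpOfRanks` are hypothetical names added for illustration; only `buildLCP_LR` mirrors the function above.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <utility>
#include <vector>

struct LcpTree {
    std::vector<int> lcp;    // lcp[i] = LCP of suffixes at ranks i and i+1 (assumed convention)
    std::vector<int> lcpLR;  // heap-style tree of range minima, root at index 1

    explicit LcpTree(std::vector<int> adjacentLcp)
        : lcp(std::move(adjacentLcp)),
          lcpLR(4 * std::max<std::size_t>(lcp.size(), 1), INT_MAX) {
        if (!lcp.empty()) buildLCP_LR(1, 0, static_cast<int>(lcp.size()) - 1);
    }

    // Same recursion as in the answer: a node stores the minimum of its range.
    void buildLCP_LR(int index, int low, int high) {
        if (low == high) {
            lcpLR[index] = lcp[low];
            return;
        }
        int mid = low + (high - low) / 2;
        buildLCP_LR(2 * index, low, mid);
        buildLCP_LR(2 * index + 1, mid + 1, high);
        lcpLR[index] = std::min(lcpLR[2 * index], lcpLR[2 * index + 1]);
    }

    // Minimum of lcp[from..to], answered from the precomputed nodes.
    int query(int index, int low, int high, int from, int to) const {
        if (to < low || high < from) return INT_MAX;          // no overlap
        if (from <= low && high <= to) return lcpLR[index];   // fully covered
        int mid = low + (high - low) / 2;
        return std::min(query(2 * index, low, mid, from, to),
                        query(2 * index + 1, mid + 1, high, from, to));
    }

    // LCP of the suffixes at ranks a < b in the suffix array.
    int lcpOfRanks(int a, int b) const {
        assert(0 <= a && a < b && b <= static_cast<int>(lcp.size()));
        return query(1, 0, static_cast<int>(lcp.size()) - 1, a, b - 1);
    }
};
```

For T = "banana" the sorted suffixes are a, ana, anana, banana, na, nana, so the adjacent LCP values are {1, 3, 0, 0, 2}. Under the convention above, `lcpOfRanks(0, 2)` returns 1 (the LCP of "a" and "anana") and `lcpOfRanks(4, 5)` returns 2 (the LCP of "na" and "nana"). An accelerated search can look up such values instead of re-comparing characters of P that are already known to match.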
    

    Reference: https://stackoverflow.com/a/28385677/1428052