Search code examples
algorithmsubstringlis

Longest increasing substring with gap


I've encountered a problem specified as follows:

Let A be a sequence of positive integers.
Let B be a substring of A.
Let C be a sequence created by removing B from A.
For given A, find the length of the longest increasing (strictly) substring of C, where B can be chosen arbitrarily.

For example let A = [3 2 5 7 1 2 8 1]. If we set B = [1 2], then C = [3 2 5 7 8 1] and its longest increasing substring is [2 5 7 8], which length is 4. 4 is the answer since there exist no other B which would lead to a better solution.

I can't find an algorithm to solve the problem (in polynomial time, of course :) ), but I belive it would be some variation of the longest increasing subsequence problem.
Please help me to find a good algorithm or give me some hints or references.


Solution

  • While doing a single iteration through the input array:

    • Set up an array smallest[n], where smallest[i] represents the smallest element which an increasing substring of length i can end with (e.g. if smallest[3] = 5, that means there's a substring of length 3 ending with a 5, and there is no substring of length 3 ending with a 4, otherwise smallest[3] will be 4).

      We can keep track of the longest substring i so far, and simply replace smallest[i] if that element is bigger than the current element.

      An important notes about this array: the elements in this array will be in strictly increasing order, that is to say, if a substring of length i ending in element x exists in the array, there is no longer substring containing an element equal to or less than x (this is because the longer substring will contain an substring of length i ending in an element less than x, thus smallest[i] will be that element instead of x).

    • In addition to this array, keep a binary search tree (BST) that maps elements to substring lengths (essentially the opposite of the array).

      When updating smallest, also remove the old element from the BST and insert the new one.

      (All of this so far were about substrings in the original array A, not the array post-removal C)

    • Using this, we can find the longest substring longestSSAfterB in C ending at any element (directly following some B) by looking up the biggest element smaller than that element in the BST and adding 1 to that length.

    • The longest substring in C ending at any given element will simply be the maximum of 1 + the longest substring ending in the previous element (if it's smaller, 0 otherwise) and longestSSAfterB.

      The longest substring in C will simply be the longest substring we found above.

    All of this will take O(n log n).


    Example:

    A = [3 2 5 7 1 2 8 1]
                       BST.floor(i)+1
            currentSS  longestSSAfterB  longestSSinC  smallest BST
    A[0]=3  1          0+1=1            max(1,0+1)=1  [3]      [(3→1)]
    A[1]=2  1          0+1=1            max(1,0+1)=1  [2]      [(2→1)]
    A[2]=5  2          (2→1)->1+1=2     max(2,1+1)=2  [2,5]    [(2→1), (5→2)]
    A[3]=7  3          (5→2)->2+1=3     max(3,2+1)=2  [2,5,7]  [(2→1), (5→2), (7→3)]
    A[4]=1  1          0+1=1            max(1,0+1)=1  [1,5,7]  [(1→1), (5→2), (7→3)]
    A[5]=2  2          (1→1)->1+1=2     max(2,1+1)=2  [1,2,7]  [(1→1), (2→2), (7→3)]
    A[6]=8  3          (7→3)->3+1=4     max(4,2+1)=4  [1,2,7]  [(1→1), (2→2), (7→3)]
    A[7]=1  1          0+1=1            max(1,0+1)=1  [1,5,7]  [(1→1), (5→2), (7→3)]
    
    Longest substring = max(longestSSinC) = 4