Search code examples
algorithmrecursionsearchdivide-and-conquer

Longest substring of alternating characters in a string


I've been going around this Divide and conquer problem of given a string made of lowcase characters, find the longest substring made of contiguous {consonant, vowel} and it's position on the string.

Example.

string = "ehbabebaehehehbe"

solution = "babeba" position: 3

I've tried a lot of things and i've found these problems.

  1. I don't know where to split the problem, cause i may split the solution, and merging the subsolutions, the solution could disappear
  2. When merging the subsolutions, i dont know if i should combine them or choose one of them because i dont know if they are contiguous

Solution

  • I have given some time to your problem. I have come up with an algorithm. Here are the details.


    Concept

    We will use two methods:

    1: int[] maxSubstringOfAlternatingCharacters(String string, int low, int high)
    

    In this method, we simply divide the string into two parts. We will be dividing the string from the middle. This method returns an array of the form {startIndex, substringLength}.

    2: int[] maxCrossingSubstring(String string, int low, int mid, int high)
    

    In this method, we find the substring of maximum length having alternate characters between low and high. This method returns an array of the form {startIndex, substringLength}.


    Algorithm

    Method 1:

    1:  int[] maxSubstringOfAlternatingCharacters(String string, int low, int high)
    2:
    3:      if (low == high) return {low, 1}
    4:      
    5:      mid = (low + high)/2
    6:      leftSubstring = maxSubstringOfAlternatingCharacters(string, low, mid)
    7:      rightSubstring = maxSubstringOfAlternatingCharacters(string, mid+1, high)
    8:      crossingSubstring = maxCrossingSubstring (string, low, mid, high)
    9:
    10:     if leftSubstring[1] >= rightSubstring[1] and leftSubstring[1] >= crossingSubstring[1]
    11:         return leftSubstring
    12:     else if rightSubstring[1] >= leftSubstring[1] and rightSubstring[1] >= crossingSubstring[1]
    13:         return rightSubstring
    14:     else
    15:         return crossingSubstring
    

    Method 2:

    1:  int[] maxCrossingSubstring (String string, int low, int mid, int high)
    2:
    3:      if char[mid] and char[mid+1] are not alternating characters
    4:          return {low, 0};
    5:
    6:      leftMaxSubstring = 1;
    7:      //Loop from mid to low to find leftMaxSubstring of alternating characters*/
    8:      rightMaxSubstring = 1;
    9:      //Loop from mid+1 to high to find rightMaxSubstring of alternating characters*/
    10:     return {mid-leftMaxSubstring+1, leftMaxSubstring+rightMaxSubstring}
    

    Time complexity

    The divide step takes O(1) time. The combine step, that is, finding the max crossing substring takes at worst O(n) time where n is high-low+1.

    The recurrence relation will be:

    T(n) = 2T(n/2) + O(n)
    

    which results in a time complexity of O(n*logn).


    Final Thoughts

    I did not provide the complete implementation of maxCrossingSubstring method. You should try to do it on your own. If you face any problem, do comment and I will complete it too. If you face any trouble understanding any part of the algorithm, do comment and I will update my answer to help you.

    I think there might be a faster algorithm to solve your problem which will run in O(n). It will be similar to the Kadane's algorithm and hence it will not be a Divide and Conquer algorithm.

    I hope I have helped you. I did implement this algorithm myself and it did work for me against many inputs. If anyone find anything wrong with it. Do comment.