Search code examples
algorithmdivide-and-conquerlongest-substring

Longest Substring with no Y Divide and Conquer


Before I carry on to the problem, I should note that I know there are much easier ways to solve this problem without using divide and conquer; however, the point of solving this problem under this restriction is that I actually want to learn how to tackle problems with divide and conquer. I am good at recognizing correct solutions, but implementing my own D&C strategy is not a skill I currently have.

The problem is this: given a string, find the longest substring that does not contain the letter 'y'. For example, longestNoY("abydefyhi") should return "def".

My first approach to tackle this problem was to determine the base cases. If we had a string of length 2, we would want to return the non-y components (or empty string if both characters were 'y'). If we had a string of length 1, we would return it if it is not a 'y'.

So the first part should look like this:

def longestNoY(string, start, end):
     #Conquer
     if start == end:
          if string == 'y': return ''
          return string
     if start + 1 == end:
          if string == "yy": return ''
          if string[0] == 'y': return string[1]
          return string[0]
     ....

Next, I knew that I would need to recursively call the function for each half of the parent string. I also knew that I wanted the function to return the longer of the two children, except if the sum of the lengths of the two children was equal to the length of the parent, then the function should return the parent because there were no 'y's in the children.

    #Divide and Partial Implementation of Rejoin
    ....
    middle = (start + end) // 2
    leftString = longestNoY(string, start, middle)
    rightString = longestNoY(string, middle, end)
    if len(leftString) + len(rightString) == len(string): return string
    ....

The part I am having trouble with now would best be explained with an example:

0 1 2     3 4   5 6   7 8 
a b   y   d e | f y   h i
a b   y | d e | f y | h i
a b | y | d e | f y | h i 

The longest substring in the left side is either "ab" or "de", but we know that "de" is adjacent to an 'f' which would make "def" the longest. I don't know exactly how to carry on with this problem. Please do not give me a program to solve this problem.


Solution

  • I think the best way to put the problem is to find the positions just before and just after y, not being y. This way you will find left and right ends of intervals. I do not give you the code, since you specifically asked as not to solve the problem for you, just point to the right direction, so:

    • In trivial cases (length of interval is 0) determine whether the item you have is a valid left end or right and of an interval
    • In non-trivial cases always halve the set to left and right (no problem if the number of items is odd, just put the middle somewhere) and issue the divide and conquer for them as well
    • In non-trivial cases always consider the best interval the left and right sub-problem gives you
    • In non-trivial cases make sure that if an interval happens to start in the left and end in the right, you take that into account
    • from such intervals, the one which has a greater length is better

    These are the ideas you need to employ in order to implement the divide and conquer you desire. Happy coding!