Search code examples
pythonstringalphabetical

Finding the longest alphabetical substring in a longer string


This code finds the longest alphabetical substring in a string (s).

letter = s[0]      
best = ''          
for n in range(1, len(s)):    
    if len(letter) > len(best):
        best = letter
    if s[n] >= s[n-1]:
        letter += s[n]    
    else:                      
        letter = s[n]

It works most of the time, but occasionally it gives wrong answers and I am confused why it only works sometimes. for example when:

s='maezsibmhzxhpprvx'

It said the answer was "hpprv" while it should have been "hpprvx".

In another case, when

s='ysdxvkctcpxidnvaepz'

It gave the answer "cpx", when it should have been "aepz".

Can anyone make sense of why it does this?


Solution

  • Your routine was almost ok, here's a little comparison between yours, the fixed one and another possible solution to your problem:

    def buggy(s):
        letter = s[0]
        best = ''
        for n in range(1, len(s)):
            if len(letter) > len(best):
                best = letter
            if s[n] >= s[n - 1]:
                letter += s[n]
            else:
                letter = s[n]
    
        return best
    
    
    def fixed(s):
        letter = s[0]
        best = ''
        for n in range(1, len(s)):
            if s[n] >= s[n - 1]:
                letter += s[n]
            else:
                letter = s[n]
    
            if len(letter) > len(best):
                best = letter
    
        return best
    
    
    def alternative(s):
        result = ['' for i in range(len(s))]
        index = 0
    
        for i in range(len(s)):
            if (i == len(s) - 1 and s[i] >= s[i - 1]) or s[i] <= s[i + 1]:
                result[index] += s[i]
            else:
                result[index] += s[i]
                index += 1
    
        return max(result, key=len)
    
    
    for sample in ['maezsibmhzxhpprvx', 'ysdxvkctcpxidnvaepz']:
        o1, o2, o3 = buggy(sample), fixed(sample), alternative(sample)
        print "buggy={0:10} fixed={1:10} alternative={2:10}".format(o1, o2, o3)
    

    As you can see in your version the order of the inner loop conditional is not good, you should move the first conditional to the end of loop.