pythonpython-3.xloopstime-complexitybig-o

How is it possible for Python to take more time for a single loop than multiple?


I have 2 codes which do similar task:

class Solution1:
    def f(self, s: str) -> bool:
        def calculate(s, i, digit, length, maxi):
            count = 1
            while i < length:
                if s[i] == digit: count += 1
                else: break
                i += 1
            
            return max(maxi, count), i

        
        maxi1 = 0
        maxi0 = 0
        length = len(s)
        i = 0
        while i<length:
            if s[i] == '0': maxi0, i = calculate(s,i,'0', length, maxi0)
            else: maxi1, i = calculate(s,i,'1', length, maxi1)
        
        return maxi1 > maxi0


class Solution3:
    def f(self, s: str) -> bool:
        s1 = s.split('0')
        s0 = s.split('1')
        r1 = max([len(i) for i in s1])
        r0 = max([len(i) for i in s0])
        return r1>r0

Running the following codes with %timeit,

x = '1100011111100111010111'*10

sol1 = Solution1().f
sol3 = Solution3().f

% timeit -r 30 -n 3000 sol1(x,)
% timeit -r 30 -n 3000 sol3(x,)

it gives me,

3000 loops, best of 30: 77.5 µs per loop
3000 loops, best of 30: 29.6 µs per loop

it is weird that the second code takes almost half time.

split() iterates over the loop once and has been used twice, then len takes the O(N) time for each subpart and has been called for every sub string meaning a total of O(N) time and on top of that and max works exactly the same for both. Can someone explain how this might be happening? Is it the cpython doing something under the hood for inbuilt functions?

and to my surprise:

class Solution4:
    def f(self, s: str) -> bool:
        return len(max(s.split('0'))) > len(max(s.split('1')))

takes 7x less time given max is used on strings + split() twice.


Solution

  • The answer to the question how is it possible that one code is slower than the other is already provided in the comments by kindall. Here for the sake of completeness once again:

    Built-in functions like max() are written in C and are much faster than the equivalent Python code. Then your Solution example doesn't strictly do the same as the other two (it checks for the largest string, not the string with the largest length). In this case they are functionally equivalent (as a string consisting of five zeroes is "greater" than a string consisting of three) but comparing two strings is (as you can tell) faster than two function calls and a comparison of their results. – kindall

    There a two ways of getting things done faster:

    • optimize the code of the used method
    • choose a totally different method

    So instead of wondering how it comes that doing same things using another functions/methods/modules can significant cut execution time down, let's be surprised that choosing another approach/method for solving same problem can do it too. Below the code of such another approach and explanation about the core of the idea behind the solution:

    There are often the most obvious things one don't think about searching for a solution. The discussion of the same task here Counting longest occurrence of repeated sequence in Python does surprisingly not mention the below provided approach.

    The 'Solution 5' called approach goes the other way around as to search for a maximum substring length in a generated list of all substrings of a string.

    It searches for the first occurrence of a substring iterating over increasing substring length. Usage of highly optimized search for a substring and looking only for its first occurrence results in taking less time compared to Solution 4 in case of longer strings and extremely less time for very long strings (for very short strings f4() seems to be faster).

    So let's compare the fastest solution f4() to the another approach to the same problem f5().

    class Solution4:
        def f4(self, s: str) -> bool:
            return len(max(s.split('0'))) > len(max(s.split('1')))
    
    def f4(s: str) -> bool: # cut out time for class overhead
        return len(max(s.split('0'))) > len(max(s.split('1')))
    
    def f5(s):
        foundMax0=False; foundMax1=False
        for l in range(1,len(s)+1):
            if not foundMax0: 
                # print(f'{l=}, {s.find("0"*l)=}') 
                if s.find("0"*l) is -1: 
                    max0=l-1; foundMax0=True
            if not foundMax1: 
                # print(f'{l=}, {s.find("1"*l)=}') 
                if s.find("1"*l) is -1: 
                    max1=l-1; foundMax1=True
            if foundMax0 and foundMax1:
                break
        return max1 > max0
    
    from time import perf_counter as T
    s = '1100011111100111010111'*10 # *1000
    sT=T(); sol4 =Solution4().f4(s); eT=T(); print(f'{eT-sT:11.8f}')
    sT=T(); sol4f=            f4(s); eT=T(); print(f'{eT-sT:11.8f}')
    sT=T(); sol5 =            f5(s); eT=T(); print(f'{eT-sT:11.8f}')
    print( f5(s) )
    # Times vary, but the last solution is more than ten times faster than the up to now fastest one on a large string and still 1.5 times faster on a small one:  
    # s*10   : 0.00002643
    # s*10   : 0.00001687
    # s*10   : 0.00000914
    # s*1000 : 0.00116592
    # s*1000 : 0.00104705
    # s*1000 : 0.00007862
    

    There is much room for optimizing the code above. Two approaches, f(6) and f(7) to such optimizing were proposed in the comments by Kelly Bundy:

    def f5(s):
        foundMax0=False; foundMax1=False
        for l in range(1,len(s)+1):
            if not foundMax0:
                if s.find("0"*l) is -1: 
                    max0=l-1; foundMax0=True
            if not foundMax1:
                if s.find("1"*l) is -1: 
                    max1=l-1; foundMax1=True
            if foundMax0 and foundMax1:
                break
        return max1 > max0
        
    def f6(s):
        s0, s1 = '0', '1'
        while s1 in s:
            if s0 not in s:
                return True
            s0 += '0'
            s1 += '1'
        return False
    
    def f7(s):
        n = 1
        while '1'*n in s:
            if '0'*n not in s:
                return True
            n += 1
        return False
    
    import random
    print(all(f7(s)==f6(s)==f7(s) for _ in range(1000) for s in [''.join(random.choices('01', k=1000))]))
    
    from time import perf_counter as T
    #s = '1100011111100111010111'*10 # *1000
    s=''.join(random.choices('01', k=1000000))
    sT=T(); sol5 =            f5(s); eT=T(); print(f't5={eT-sT:11.8f}')
    sT=T(); sol6 =            f6(s); eT=T(); print(f't6={eT-sT:11.8f}')
    sT=T(); sol7 =            f7(s); eT=T(); print(f't7={eT-sT:11.8f}')
    

    The speedup of code using f6(s) and f7(s) is achieved by not obtaining the maximum values for both zeroes and ones run lengths stopping the search when only one maximum was found.

    Anyway, the general message about any kind of further speed optimization is as follows:

    Wondering about compared timings of loops consider that changing the data these loops work on can have significant impact on the final runtime, too.