Search code examples
python-3.xexecution-time

does sum() take the execution time of a loop in python3


I used sum() in a loop and it take more than 4s to execute, is the sum() equal to nested loop here?

def arrayMaxConsecutiveSum(inputArray, k):
    cop=k
    lis=[]
    i=0
    while i < len(inputArray):
        inpu=sum(inputArray[i:cop])
        lis.append(inpu)
        cop+=1
        i+=1
    return max(lis)

Solution

  • sum is implemented as a C++ library, so it is still going to be faster than a Python for-loop, although you're still running nested loops either way. To illustrate, I timed your code here, as well as a modified code that uses a Python loop instead:

    def arrayMaxConsecutiveSumLoop(inputArray, k):
        cop=k
        lis=[]
        i=0
        while i < len(inputArray) - k:
            inpu = 0
            for j in range(i, cop):  # for-loop in place of sum
                inpu += inputArray[j]
            lis.append(inpu)
            cop += 1
            i += 1
        return max(lis)
    

    The timings were:

    arrayMaxConsecutiveSum(np.random.rand(10000), 100)
    111 ms ± 1.42 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
    
    arrayMaxConsecutiveSumLoop(np.random.rand(10000), 100)
    198 ms ± 4.16 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
    

    so the implementation with sum (your version) is faster by about a factor of two!

    Better implementation

    I rewrote your function using numpy to make it way faster! This implementation has no nested loops and is O(n), as well as being implemented with the very fast numpy libraries.

    import numpy as np
    def array_max_consecutive_sum(input_array, k):
        result = np.cumsum(input_array)
        return max(result[k:] - result[:-k])
    
    array_max_consecutive_sum(np.random.rand(10000), 100)
    688 µs ± 10.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    

    That's more than 150 times faster!

    Other options

    Incidentally, the accepted answer (as of this writing) gives timing:

    6.46 ms ± 147 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    

    Fast, but still only a tenth as fast as the numpy implementation above. Also, the result is wrong.