Altering numpy function output array in place


I'm trying to write a function that performs a mathematical operation on an array and returns the result. A simplified example could be:

def original_func(A):
    return A[1:] + A[:-1]

To speed this up and avoid allocating a new output array on every call, I would like to pass the output array as an argument and modify it in place:

def inplace_func(A, out):
    out[:] = A[1:] + A[:-1]

However, when calling these two functions in the following manner,

import numpy

A = numpy.random.rand(1000,1000)
out = numpy.empty((999,1000))

C = original_func(A)

inplace_func(A, out)

the original function seems to be twice as fast as the in-place function. How can this be explained? Shouldn't the in-place function be quicker since it doesn't have to allocate memory?
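The comparison described above can be reproduced with a small timing script. This is only a sketch: the use of `timeit`, the repeat count of 20, and the variable names `t_original` and `t_inplace` are assumptions, not part of the original measurement, and the absolute numbers will vary by machine.

```python
import timeit

import numpy as np


def original_func(A):
    # Allocates and returns a new result array on every call.
    return A[1:] + A[:-1]


def inplace_func(A, out):
    # The right-hand side is evaluated first, then copied into out.
    out[:] = A[1:] + A[:-1]


A = np.random.rand(1000, 1000)
out = np.empty((999, 1000))

# Average time per call over 20 runs (repeat count chosen arbitrarily).
runs = 20
t_original = timeit.timeit(lambda: original_func(A), number=runs) / runs
t_inplace = timeit.timeit(lambda: inplace_func(A, out), number=runs) / runs

print(f"original_func: {t_original * 1e3:.2f} ms per call")
print(f"inplace_func:  {t_inplace * 1e3:.2f} ms per call")
```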


Solution

  • I think that the answer is the following:

    In both cases, you compute A[1:] + A[:-1], and in both cases NumPy allocates a new intermediate array to hold the result. The right-hand side of an assignment is fully evaluated before the assignment itself happens.

    In the second case, you then explicitly copy that whole newly allocated array into the preallocated out array. Copying an array of this size takes about as long as the addition itself, so you roughly double the total time.

    To sum up, in the first case, you do:

    compute A[1:] + A[:-1] (~10ms)

    In the second case, you do:

    compute A[1:] + A[:-1] (~10ms)
    copy the result into out (~10ms)
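
Following from the breakdown above, one possible way to skip the intermediate array is to let the ufunc write directly into the preallocated buffer via its `out` argument. The sketch below assumes the operation can be expressed as a single `numpy.add` call; the function name `inplace_func_no_temp` is hypothetical and not from the original post.

```python
import numpy as np


def inplace_func_no_temp(A, out):
    # Hypothetical variant: np.add writes the sum straight into out,
    # so no temporary result array is allocated and no extra copy is made.
    np.add(A[1:], A[:-1], out=out)


A = np.random.rand(1000, 1000)
out = np.empty((999, 1000))

inplace_func_no_temp(A, out)
```

A[1:] and A[:-1] are views of A, so the slicing itself does not copy data. Whether this ends up faster than the original function depends on the machine and array size, so it is worth timing.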