python performance numpy memory

Should I preallocate a numpy array?


I have a class with a method that is called many times during execution. The method uses a numpy array as a temporary buffer, and I don't need to keep the buffer's contents between calls. Should I make the array an instance attribute so that no time is spent on memory allocation each time the method runs? I know that local variables are generally preferred, but is Python smart enough to allocate the memory for the array only once?

import numpy

class MyClass:
    def __init__(self, n):
        # Buffer allocated once and reused by every call
        self.temp = numpy.zeros(n)
    def method(self):
        # do some stuff using self.temp
        pass

Or

import numpy

class MyClass:
    def __init__(self, n):
        self.n = n
    def method(self):
        # A new buffer is allocated on every call
        temp = numpy.zeros(self.n)
        # do some stuff using temp

Update: replaced np.empty with np.zeros


Solution

  • Yes, preallocating large arrays is worthwhile. Whether it actually pays off depends on how you then use those arrays.

    This allocates new arrays for the intermediate and final results, and rebinds self.temp to the new array instead of reusing the preallocated buffer:

    self.temp = a * b + c
    

    This will not, provided self.x and self.temp are both preallocated with a matching shape and dtype:

    numpy.multiply(a, b, out=self.x)
    numpy.add(c, self.x, out=self.temp)
    

    For such cases (large arrays in non-trivial formulae), it is usually better to use numexpr, or numpy.einsum for matrix-style calculations.
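As a minimal sketch of the out= pattern above, wrapped in a class like the one in the question: the class name Accumulator, the argument names, and the array sizes are made up for illustration and are not from the original post. It checks that the preallocated buffer is reused rather than replaced, and that the values match the plain a * b + c expression. The last lines show that numpy.einsum also accepts an out= argument.

```python
import numpy as np


class Accumulator:
    """Hypothetical class that reuses two scratch buffers across calls."""

    def __init__(self, n):
        # Allocated once; contents are overwritten on every call.
        self.x = np.empty(n)
        self.temp = np.empty(n)

    def method(self, a, b, c):
        # a * b is written into self.x, then c + self.x into self.temp.
        # Both results land in the existing buffers.
        np.multiply(a, b, out=self.x)
        np.add(c, self.x, out=self.temp)
        return self.temp


n = 1_000
rng = np.random.default_rng(0)
a, b, c = rng.random(n), rng.random(n), rng.random(n)

obj = Accumulator(n)
buffer_before = obj.temp
result = obj.method(a, b, c)

# The returned array is the very same object that was preallocated.
same_buffer = result is buffer_before
# The values agree with the straightforward (allocating) expression.
matches_naive = np.allclose(result, a * b + c)

# einsum can also write into a preallocated array of the right shape.
p, q = rng.random((3, 4)), rng.random((4, 3))
product = np.empty((3, 3))
np.einsum("ij,jk->ik", p, q, out=product)
```

One consequence of this design: the array returned by method is overwritten on the next call, so a caller that needs to keep the values has to copy them first.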