Tags: python, arrays, numpy, 2d

Pythonic way of creating 2D numpy array


I have a function gen() which returns a numpy array of nElements floats. I'm looking for a more Pythonic (one-liner?) way to do the following:

a = zeros((nSamples, nElements))
for i in xrange(nSamples):
     a[i,:] = gen()

This is one way to do it:

a = array([gen() for i in xrange(nSamples)]).reshape((nSamples, nElements))

But it understandably is a bit slower on account of not pre-allocating the numpy array:

import time
from numpy import *

nSamples  = 100000
nElements = 100

start = time.time()
a = array([gen() for i in xrange(nSamples)]).reshape((nSamples, nElements))
print (time.time() - start)

start = time.time()
a = zeros((nSamples, nElements))
for i in xrange(nSamples):
    a[i,:] = gen()
print (time.time() - start)

Output:

1.82166719437
0.502261161804

Is there a way to achieve the same one-liner while keeping the preallocated array for speed?


Solution

  • I believe this will do what you want:

    a = vstack([ gen() for _ in xrange(nSamples) ])

    As I don't have access to your gen function, I can't run timing tests. Also, this (like your one-liner) is not as memory-friendly as your for-loop version: the one-liners store all gen() outputs in a list and then construct the array, whereas the for loop only needs one gen() output in memory at a time (along with the numpy array).
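
One possible way to get a one-liner without first building a list of all gen() outputs is numpy's fromiter with an explicit count, which lets numpy allocate the output once and fill it as values arrive. The sketch below is not from the original answer. The gen() here is a hypothetical stand-in (random floats), the small sizes are only for illustration, and range is used in place of xrange so it also runs on Python 3. It makes no claim about speed, since fromiter consumes the values one element at a time.

```python
from itertools import chain

import numpy as np

nSamples, nElements = 4, 3


def gen():
    # Hypothetical stand-in for the question's gen(): nElements floats.
    return np.random.rand(nElements)


# Loop version from the question: preallocate, then fill row by row.
np.random.seed(0)
a_loop = np.zeros((nSamples, nElements))
for i in range(nSamples):
    a_loop[i, :] = gen()

# One-liner: the generator expression is consumed lazily, and because
# count is given, fromiter can allocate the flat output array up front.
np.random.seed(0)
a_iter = np.fromiter(
    chain.from_iterable(gen() for _ in range(nSamples)),
    dtype=float,
    count=nSamples * nElements,
).reshape(nSamples, nElements)

# vstack version from the answer above, for comparison.
np.random.seed(0)
a_stack = np.vstack([gen() for _ in range(nSamples)])
```

With the same seed, all three arrays hold the same values, so the choice between them comes down to readability, memory use, and whatever timings you measure with your real gen().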