python, matlab, numpy, initialization, variable-assignment

Initialize Multiple Numpy Arrays (Multiple Assignment) - Like MATLAB deal()


I was unable to find anything describing how to do this, which leads me to believe I'm not doing this in the proper idiomatic Python way. Advice on the 'proper' Python way to do this would also be appreciated.

I have a bunch of variables for a datalogger I'm writing (arbitrary logging length, with a known maximum length). In MATLAB, I would initialize them all as 1-D arrays of zeros of length n, n bigger than the number of entries I would ever see, assign each individual element variable(measurement_no) = data_point in the logging loop, and trim off the extraneous zeros when the measurement was over. The initialization would look like this:
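For illustration, the preallocate, fill, and trim workflow described above might look like this in NumPy. This is only a minimal sketch: the value of `n`, the `readings` list, and the `count` variable are hypothetical stand-ins for the real logger.

```python
import numpy as np

n = 1000  # known upper bound on the number of measurements (hypothetical value)
d_data = np.zeros(n)  # preallocated log buffer

# Stand-in for the real logging loop: only 4 readings arrive here.
readings = [0.5, 1.25, 0.0, 2.0]
count = 0
for value in readings:
    d_data[count] = value
    count += 1

# Trim by the number of entries written, not by searching for zeros,
# so that genuine zero readings are kept.
d_data = d_data[:count].copy()
print(d_data.shape)  # (4,)
```

Tracking the number of entries written is a design choice in this sketch; it avoids discarding a measurement that happens to be exactly zero.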

[dData gData cTotalEnergy cResFinal etc] = deal(zeros(n,1));

Is there a way to do this in Python/NumPy so that I don't have to put each variable on its own line:

dData = np.zeros(n)
gData = np.zeros(n)
etc.

I would also prefer not to just make one big matrix, because keeping track of which column is which variable is unpleasant. Perhaps the solution is to make the (length x numvars) matrix, and assign the column slices out to individual variables?
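The matrix-and-column-slices idea can be sketched as follows. The value of `n` and the name `log` are assumptions for illustration. Each unpacked name is a view into the matrix, so writing through the name also updates the matrix.

```python
import numpy as np

n = 5  # log length (hypothetical value)
names = ["dData", "gData", "cTotalEnergy"]  # variable names from the question

# One (n, numvars) matrix; each column is one logged variable.
log = np.zeros((n, len(names)))

# Unpacking the transpose iterates over the columns; each name is a view.
dData, gData, cTotalEnergy = log.T

gData[0] = 3.0  # writes through to column 1 of the matrix
print(log[0, 1])  # 3.0
print(np.shares_memory(gData, log))  # True
```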

EDIT: Assume I'm going to have a lot of vectors of the same length by the time this is over; e.g., my post-processing takes each log file, calculates a bunch of separate metrics (>50), stores them, and repeats until the logs are all processed. Then I generate histograms, means/maxes/sigmas/etc. for all the various metrics I computed. Since initializing 50+ vectors is clearly not easy in Python, what's the best (cleanest code and decent performance) way of doing this?
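For the many-vectors case described in the edit, one possible approach is to keep the arrays in a dictionary keyed by name instead of in 50+ separate variables. This is an assumption for illustration, not something taken from the thread; the names `metric_names`, `metrics`, and `means` are hypothetical.

```python
import numpy as np

n = 1000  # maximum log length (hypothetical value)
metric_names = ["dData", "gData", "cTotalEnergy", "cResFinal"]

# One independent zero array per name; np.zeros is called once per key.
metrics = {name: np.zeros(n) for name in metric_names}

metrics["gData"][0] = 1.5
print(metrics["dData"][0], metrics["gData"][0])  # 0.0 1.5

# Post-processing can then loop over all metrics uniformly.
means = {name: values.mean() for name, values in metrics.items()}
```

This keeps each vector addressable by name while letting summary statistics be computed in a single loop.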


Solution

  • If you're really motivated to do this in a one-liner you could create an (n_vars, ...) array of zeros, then unpack it along the first dimension:

    a, b, c = np.zeros((3, 5))
    print(a is b)
    # False
    

    Another option is to use a list comprehension or a generator expression:

    a, b, c = [np.zeros(5) for _ in range(3)]   # list comprehension
    d, e, f = (np.zeros(5) for _ in range(3))   # generator expression
    print(a is b, d is e)
    # False False
    

    Be careful, though! You might think that using the * operator on a list or tuple containing your call to np.zeros() would achieve the same thing, but it doesn't:

    h, i, j = (np.zeros(5),) * 3
    print(h is i)
    # True
    

    This is because the expression inside the tuple gets evaluated first. np.zeros(5) therefore only gets called once, and each element in the repeated tuple ends up being a reference to the same array. This is the same reason why you can't just use a = b = c = np.zeros(5).

    Unless you really need to assign a large number of zero-filled array variables and you really care deeply about making your code compact (!), I would recommend initialising them on separate lines for readability.
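The aliasing pitfall described above can be made concrete by modifying one array and inspecting another. This is a minimal sketch that reuses the variable names from the answer:

```python
import numpy as np

# Tuple repetition: np.zeros(5) runs once, so all three names alias one array.
h, i, j = (np.zeros(5),) * 3
h[0] = 1.0
print(i[0])  # 1.0, because i is the same object as h

# Comprehension: np.zeros(5) runs once per iteration, giving separate arrays.
a, b, c = [np.zeros(5) for _ in range(3)]
a[0] = 1.0
print(b[0])  # 0.0, because b is independent of a
```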