Tags: python, arrays, list, loops, emcee

Creating list of individual list items multiplied n times


I'm fairly new to Python, and think this should be a fairly common problem, but can't find a solution. I've already looked at this page and found it helpful for one item, but I'm struggling to extend the example to multiple items without using a 'for' loop. I'm running this bit of code for 250 walkers through Emcee, so I'm looking for the fastest way possible.

I have a list of numbers, a = [x,y,z] that I want to repeat b = [1,2,3] times (for example), so I end up with a list of lists:

[
 [x],
 [y,y],
 [z,z,z]
]

The 'for' loop I have is:

c = [ ]
for i in range (0,len(a)):
    c.append([a[i]]*b[i])

Which does exactly what I want, but means my code is excruciatingly slow. I've also tried naively turning a and b into arrays and doing [a]*b in the hopes that it would multiply element by element, but no joy.


Solution

  • You can use zip and a list comprehension here:

    >>> a = ['x','y','z']
    >>> b = [1,2,3]
    >>> [[x]*y for x,y in zip(a,b)]
    [['x'], ['y', 'y'], ['z', 'z', 'z']]
    

    or:

    >>> [[x for _ in xrange(y)] for x,y in zip(a,b)]
    [['x'], ['y', 'y'], ['z', 'z', 'z']]
    

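    In Python 2, zip builds the whole list in memory first; to get an iterator instead, use itertools.izip. In Python 3, zip already returns an iterator and itertools.izip no longer exists.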
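
The snippets in this answer are written for Python 2 (xrange, izip). As a minimal sketch, assuming Python 3, the same zip-based comprehension could look like the following; the helper name repeat_each is hypothetical and not part of the original answer.

```python
# Python 3 sketch; repeat_each is a hypothetical helper name, not from the answer.
def repeat_each(values, counts):
    """Return one list per value, holding that value repeated count times."""
    # zip is already lazy in Python 3, so itertools.izip is not needed.
    return [[v] * n for v, n in zip(values, counts)]


a = ['x', 'y', 'z']
b = [1, 2, 3]
result = repeat_each(a, b)
print(result)  # [['x'], ['y', 'y'], ['z', 'z', 'z']]
```

Note that zip stops at the shorter input, so a and b are assumed to have the same length.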

    If a contains mutable objects such as lists or lists of lists, you may need copy.deepcopy here, because [x]*y repeats references to the same object, so modifying one copy changes the other copies as well:

    >>> from copy import deepcopy as dc
    >>> a = [[1 ,4],[2, 5],[3, 6, 9]]
    >>> f = [[dc(x) for _ in xrange(y)] for x,y in zip(a,b)]
    
    # now all objects are unique
    >>> [[id(z) for z in x] for x in f]
    [[172880236], [172880268, 172880364], [172880332, 172880492, 172880428]]
    
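
To make the aliasing issue concrete, here is a small sketch (assuming Python 3, so range replaces xrange) that mutates one repeated element under each approach. The variable names shared and independent are chosen for illustration only.

```python
from copy import deepcopy

a = [[1, 4], [2, 5], [3, 6, 9]]
b = [1, 2, 3]

# [x] * y stores y references to the same inner list.
shared = [[x] * y for x, y in zip(a, b)]
shared[1][0].append(99)
print(shared[1])  # [[2, 5, 99], [2, 5, 99]]
print(a[1])       # [2, 5, 99] (the source list changed too)

# Start again from fresh data and copy each element explicitly.
a = [[1, 4], [2, 5], [3, 6, 9]]
independent = [[deepcopy(x) for _ in range(y)] for x, y in zip(a, b)]
independent[1][0].append(99)
print(independent[1])  # [[2, 5, 99], [2, 5]]
print(a[1])            # [2, 5] (the source list is untouched)
```

If the elements are immutable (numbers, strings, tuples of immutables), the sharing is harmless and the cheaper [x]*y form is sufficient.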

    timeit comparisons (imports omitted: izip, imap and repeat come from itertools, mul from operator):

    >>> a = ['x','y','z']*10**4
    >>> b = [100,200,300]*10**4
    
    >>> %timeit [[x]*y for x,y in zip(a,b)]
    1 loops, best of 3: 104 ms per loop
    
    >>> %timeit [[x]*y for x,y in izip(a,b)]
    1 loops, best of 3: 98.8 ms per loop
    
    >>> %timeit map(lambda v: [v[0]]*v[1], zip(a,b))
    1 loops, best of 3: 114 ms per loop
    
    >>> %timeit map(list, map(repeat, a, b))
    1 loops, best of 3: 192 ms per loop
    
    >>> %timeit map(list, imap(repeat, a, b))
    1 loops, best of 3: 211 ms per loop
    
    >>> %timeit map(mul, [[x] for x in a], b)
    1 loops, best of 3: 107 ms per loop
    
    >>> %timeit [[x for _ in xrange(y)] for x,y in zip(a,b)]
    1 loops, best of 3: 645 ms per loop
    
    >>> %timeit [[x for _ in xrange(y)] for x,y in izip(a,b)]
    1 loops, best of 3: 680 ms per loop
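
The timings above come from Python 2 and IPython's %timeit. As a rough sketch for repeating the comparison on Python 3 with the standard timeit module, something like the following could be used. It assumes a smaller input than the original (10**3 repetitions instead of 10**4) to keep the run short, wraps map in list because map is lazy in Python 3, and leaves out the izip and imap variants because those functions no longer exist. Absolute numbers will differ by machine and interpreter, so no expected output is shown.

```python
import timeit
from itertools import repeat
from operator import mul

# Smaller than the original benchmark input (assumption, to keep the run short).
a = ['x', 'y', 'z'] * 10**3
b = [100, 200, 300] * 10**3

candidates = {
    "[x]*y with zip": lambda: [[x] * y for x, y in zip(a, b)],
    "map(mul, ...)": lambda: list(map(mul, [[x] for x in a], b)),
    "map(list, map(repeat, ...))": lambda: list(map(list, map(repeat, a, b))),
    "nested comprehension": lambda: [[x for _ in range(y)] for x, y in zip(a, b)],
}

# Every variant should build the same list of lists.
reference = candidates["[x]*y with zip"]()

timings = {}
for name, func in candidates.items():
    assert func() == reference
    # Best of 3 repeats, 3 calls each, reported per call.
    timings[name] = min(timeit.repeat(func, number=3, repeat=3)) / 3
    print(f"{name}: {timings[name] * 1e3:.1f} ms per call")
```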