Opposite of numpy.reduce

NumPy ufuncs have a reduce method (for example np.add.reduce) that removes one dimension of an array by applying a function to the elements across that dimension. Is there also an inverse of this operation that would take a single element and expand it into a whole new dimension?

Let's say I have these two arrays:

import numpy as np

class A:
    def __init__(self, a, b, c):
        self.value = a, b, c

a = np.array([A(1,2,3), A(4,5,6)])
b = np.array([1<<8 | 2<<4 | 3, 4<<8 | 5<<4 | 6])

And I would like to transform either of these two arrays into

np.array([[1,2,3],[4,5,6]])

What I'm currently doing is the following:

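def expand(arr):
    # One extra trailing axis of length 3 holds the expanded components
    a = np.empty((*arr.shape, 3), dtype=int)
    # Iterate over full index tuples so this works for any number of dimensions
    for idx in np.ndindex(arr.shape):
        if isinstance(arr[idx], A):
            a[idx] = arr[idx].value
        else:
            a[idx] = arr[idx] >> 8, arr[idx] >> 4 & 0xF, arr[idx] & 0xF
    return a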

This works, but I'd like to avoid (slow?) iteration since it goes against the spirit of numpy.

I've also tried a solution using np.vectorize, but it doesn't work as I'd want it to:

def expanded(element):
    if isinstance(element, A):
        return element.value
    return element >> 8, element >> 4 & 0xF, element & 0xF
f = np.vectorize(expanded)
f(a)  # Returns a tuple of arrays instead of the desired single array

Is there a better way to expand a single value to a new dimension, either via some mathematical operation or via object attribute access?

Solution

Your array of A instances:

In [56]: a
Out[56]: 
array([<__main__.A object at 0x000001BE0C814220>,
       <__main__.A object at 0x000001BE138B8610>], dtype=object)

Define a __repr__ to get a prettier display.
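
As a minimal sketch of that suggestion, the class from the question could gain a __repr__ like the one below. The exact format string is my own choice for illustration, not something the question specifies.

```python
import numpy as np

class A:
    def __init__(self, a, b, c):
        self.value = a, b, c

    def __repr__(self):
        # Show the stored tuple instead of the default memory address
        return f"A{self.value}"

a = np.array([A(1, 2, 3), A(4, 5, 6)])
print(a)  # each element is now displayed via A.__repr__
```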

Iterate on the instances, returning the value:

In [59]: [np.array(i.value) for i in a]
Out[59]: [array([1, 2, 3]), array([4, 5, 6])]

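That list can be turned into one array with either np.vstack or np.array (in IPython, _ is the previous output and __ is the one before it):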

In [60]: np.vstack(_)
Out[60]: 
array([[1, 2, 3],
       [4, 5, 6]])

In [61]: np.array(__)
Out[61]: 
array([[1, 2, 3],
       [4, 5, 6]])

When the function returns a tuple, np.vectorize treats each item as a separate output and makes one array per item, here 3 arrays:

In [62]: np.vectorize(lambda i: i.value)(a)
Out[62]: (array([1, 4]), array([2, 5]), array([3, 6]))

which np.array turns into the transpose of the previous result:

In [63]: np.array(_)
Out[63]: 
array([[1, 4],
       [2, 5],
       [3, 6]])

Returning an array (instead of tuple), and specifying otypes, gives another object dtype array:

In [64]: np.vectorize(lambda i: np.array(i.value), otypes=[object])(a)
Out[64]: array([array([1, 2, 3]), array([4, 5, 6])], dtype=object)
In [65]: np.vstack(_)
Out[65]: 
array([[1, 2, 3],
       [4, 5, 6]])

The signature parameter can produce the numeric array directly, though I don't think this is any faster:

In [66]: np.vectorize(lambda i: np.array(i.value), signature='()->(n)')(a)
Out[66]: 
array([[1, 2, 3],
       [4, 5, 6]])

A slightly more primitive version of vectorize, np.frompyfunc, returns an object dtype array directly:

In [68]: np.frompyfunc(lambda i: np.array(i.value), 1,1)(a)
Out[68]: array([array([1, 2, 3]), array([4, 5, 6])], dtype=object)
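
The question's expand loop indexes with two coordinates, which suggests the real data may be 2-D. Here is a hedged sketch of how the frompyfunc result could be stacked in that case. The 2x2 array a2 and its values are hypothetical, made up for illustration.

```python
import numpy as np

class A:
    def __init__(self, a, b, c):
        self.value = a, b, c

# Hypothetical 2-D object array of A instances
a2 = np.empty((2, 2), dtype=object)
a2[0, 0], a2[0, 1] = A(1, 2, 3), A(4, 5, 6)
a2[1, 0], a2[1, 1] = A(7, 8, 9), A(10, 11, 12)

get_value = np.frompyfunc(lambda i: np.array(i.value), 1, 1)
nested = get_value(a2)                       # shape (2, 2), dtype=object
flat = np.stack(list(nested.ravel()))        # shape (4, 3), numeric dtype
expanded = flat.reshape(*a2.shape, -1)       # shape (2, 2, 3)
```

Flattening before stacking keeps the stacking step the same for any number of input dimensions. The final reshape restores the original shape with the new axis appended.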

Comparative timings on this small example won't tell us much. You need to explore a more realistically sized a.
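
One way to run such a comparison is sketched below with timeit. The array size n and the helper names are assumptions chosen for illustration. No timings are quoted, since they depend on the machine and the NumPy version.

```python
import timeit
import numpy as np

class A:
    def __init__(self, a, b, c):
        self.value = a, b, c

n = 2_000  # hypothetical size; increase it to match your real data
big = np.array([A(i, i + 1, i + 2) for i in range(n)])

def with_list(arr):
    # Plain list comprehension, then one conversion to a numeric array
    return np.array([i.value for i in arr])

def with_signature(arr):
    # np.vectorize with a signature returns the numeric array directly
    return np.vectorize(lambda i: np.array(i.value), signature='()->(n)')(arr)

def with_frompyfunc(arr):
    # frompyfunc returns an object array of small arrays; stack them
    return np.vstack(list(np.frompyfunc(lambda i: np.array(i.value), 1, 1)(arr)))

for f in (with_list, with_signature, with_frompyfunc):
    t = timeit.timeit(lambda: f(big), number=3)
    print(f"{f.__name__}: {t / 3:.4f} s per call")
```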

The fast numpy code works with numeric dtypes. With object dtype, all of these approaches run at roughly the speed of a list comprehension.
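
The integer array b from the question is the numeric case, so the bit operations can be applied to the whole array at once. This is a sketch that mirrors the shifts and masks in the question's own expand function, and it assumes the same 4-bit packing.

```python
import numpy as np

b = np.array([1 << 8 | 2 << 4 | 3, 4 << 8 | 5 << 4 | 6])

# Each expression operates on the whole array; stacking along a new
# last axis puts the three extracted fields side by side.
expanded = np.stack([b >> 8, (b >> 4) & 0xF, b & 0xF], axis=-1)
print(expanded)
# [[1 2 3]
#  [4 5 6]]
```

Because axis=-1 appends the new dimension at the end, the same line should also work unchanged for a b with more than one dimension.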