I have two arrays of arrays:
array1 = np.array([np.array([1, 2, 3]), np.array([4, 5, 6]), np.array([7, 8, 9])],dtype=object)
array2 = np.array([np.array([9, 8]), np.array([0]), np.array([12])],dtype=object)
I need to create a new array of arrays by appending each array in array2 to the corresponding array in array1. I should get:
array_final=np.array([np.array([1,2,3,9,8]),np.array([4,5,6,0]),np.array([7,8,9,12])],dtype=object)
Which is the fastest way? I need to do this operation millions of times.
Your 2 arrays:
In [2]: array1 = np.array([np.array([1, 2, 3]), np.array([4, 5, 6]), np.array([7, 8, 9])],dtype=object)
...: array2 = np.array([np.array([9, 8]), np.array([0]), np.array([12])],dtype=object)
But look at the first:
In [3]: array1
Out[3]:
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]], dtype=object)
In [4]: array1.shape
Out[4]: (3, 3)
Because the subarrays are all the same length, the result is (3,3), not (3,); the object
dtype didn't change that.
The other is (3,), since the subarrays differ in shape:
In [5]: array2
Out[5]: array([array([9, 8]), array([0]), array([12])], dtype=object)
In [6]: array2.shape
Out[6]: (3,)
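If you really do want array1 to be a (3,) object array of subarrays, one option is to pre-allocate the object array and fill it slot by slot. This is only a minimal sketch; the helper name `to_object_array` is made up for illustration and is not a NumPy function.

```python
import numpy as np

def to_object_array(arrays):
    """Pack a sequence of arrays into a 1-D object array, one array per slot."""
    out = np.empty(len(arrays), dtype=object)
    for i, a in enumerate(arrays):
        # Integer indexing stores the array itself as a single element,
        # so equal-length subarrays are not merged into a 2-D array.
        out[i] = a
    return out

array1 = to_object_array([np.array([1, 2, 3]), np.array([4, 5, 6]), np.array([7, 8, 9])])
print(array1.shape)  # (3,)
```

Each element of this array1 is still an ordinary integer array, so iterating over it behaves like iterating over a list of arrays.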
Let's make lists instead of arrays:
In [7]: list1 = [np.array([1, 2, 3]), np.array([4, 5, 6]), np.array([7, 8, 9])]
...: list2 = [np.array([9, 8]), np.array([0]), np.array([12])]
Now we can do a list comprehension, joining each pair of subarrays:
In [8]: [np.hstack((a,b)) for a,b in zip(list1, list2)]
Out[8]: [array([1, 2, 3, 9, 8]), array([4, 5, 6, 0]), array([ 7, 8, 9, 12])]
And if necessary make an array from that (the result is (3,) here only because the joined arrays differ in length):
In [9]: np.array(_, object)
Out[9]:
array([array([1, 2, 3, 9, 8]), array([4, 5, 6, 0]),
array([ 7, 8, 9, 12])], dtype=object)
If I try the same thing with the original arrays, the result is similar but not identical. Iterating over array1 yields the rows of the 2d array, which are themselves object dtype, so the joined arrays are object dtype too:
In [10]: [np.hstack((a,b)) for a,b in zip(array1, array2)]
Out[10]:
[array([1, 2, 3, 9, 8], dtype=object),
array([4, 5, 6, 0], dtype=object),
array([7, 8, 9, 12], dtype=object)]
Or if I first convert the object dtype array1 to int:
In [11]: [np.hstack((a,b)) for a,b in zip(array1.astype(int), array2)]
Out[11]: [array([1, 2, 3, 9, 8]), array([4, 5, 6, 0]), array([ 7, 8, 9, 12])]
Object dtype arrays are little more than glorified (or debased?) lists: they contain pointers to objects stored elsewhere in memory. So iterating over one is basically the same as iterating over a list in a list comprehension, and even a bit slower.
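To check the speed difference on your own machine, a rough timing sketch along these lines may help. The `join` helper and the repeat count are arbitrary choices for illustration, and the numbers will vary with hardware and NumPy version, so no timings are quoted here.

```python
import timeit
import numpy as np

list1 = [np.array([1, 2, 3]), np.array([4, 5, 6]), np.array([7, 8, 9])]
list2 = [np.array([9, 8]), np.array([0]), np.array([12])]

# 1-D object arrays holding the same subarrays, filled slot by slot
obj1 = np.empty(len(list1), dtype=object)
obj2 = np.empty(len(list2), dtype=object)
for i, (a, b) in enumerate(zip(list1, list2)):
    obj1[i] = a
    obj2[i] = b

def join(xs, ys):
    """Join corresponding subarrays; works for lists and 1-D object arrays."""
    return [np.hstack((a, b)) for a, b in zip(xs, ys)]

# Time the same comprehension over lists and over object arrays
t_list = timeit.timeit(lambda: join(list1, list2), number=2000)
t_obj = timeit.timeit(lambda: join(obj1, obj2), number=2000)
print(f"lists: {t_list:.4f} s, object arrays: {t_obj:.4f} s")
```

With only three short subarrays, the cost of the `np.hstack` calls is likely to dominate, so any gap between the two containers should be small.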