We noticed that mixing fancy indexing and slicing on multi-dimensional arrays is confusing and appears to be undocumented. For example:
In [114]: x = np.arange(720).reshape((2,3,4,5,6))
In [115]: x[:,:,:,0,[0,1,2,4,5]].shape
Out[115]: (2, 3, 4, 5)
In [116]: x[:,:,0,:,[0,1,2,4,5]].shape
Out[116]: (5, 2, 3, 5)
I have read about fancy indexing at https://numpy.org/doc/stable/user/basics.indexing.html, and I understand that x[:,0,:,[1,2]] = [x[:,0,:,1], x[:,0,:,2]]. However, I cannot understand why the results of In [115] and In [116] above differ in the first dimension. Can someone point me to where these broadcasting rules are documented?
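As a quick check of that reading of x[:,0,:,[1,2]], here is a minimal sketch using the same array as above. The names `mixed` and `stacked` are only illustrative.

```python
import numpy as np

x = np.arange(720).reshape((2, 3, 4, 5, 6))

# Mixed indexing: an integer and a list, separated by a slice.
mixed = x[:, 0, :, [1, 2]]

# The same data built by hand: stack the two single-index results
# along a new leading axis.
stacked = np.stack([x[:, 0, :, 1], x[:, 0, :, 2]])

print(mixed.shape)                     # (2, 2, 4, 6)
print(np.array_equal(mixed, stacked))  # True
```

So the list index does behave like stacking the per-element results, with the stacking axis placed first.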
Thanks!
I have tried searching the documentation for fancy indexing, as well as posting issues to the NumPy repo on GitHub.
Some additional insight into why there is ambiguity:
In the second case in the question, the 3rd and 5th axes are indexed and thus disappear from the new array. A new axis (with shape equal to the broadcast shape of the indices) has to be added somewhere. If I were NumPy and had to insert a shape (5,) axis into an array with "shape" (2, 3, -, 5, -), would I put it in place of the first missing dimension, or the second?
Precisely because a slice separates the two advanced indices, NumPy cannot replace a consecutive set of axes, and so it cannot know whether to insert the new axis before or after the separating slice(s). As a result, the new axis is inserted at the front:
(5, 2, 3, 5)
 ^  ^^^^^^^--- old dimensions
 |
 new dimension
Only in the first case, where the disappearing axes are all adjacent ("shape" (2, 3, 4, -, -)), can the new axis be inserted unambiguously in their place, which here is the end.
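Both cases can be reproduced side by side. The sketch below also shows two ways of getting the new axis where one might have expected it: moving it afterwards with `np.moveaxis`, or indexing in two steps so that only one advanced index is applied at a time. The variable names are illustrative, not anything NumPy prescribes.

```python
import numpy as np

x = np.arange(720).reshape((2, 3, 4, 5, 6))
idx = [0, 1, 2, 4, 5]

# Advanced indices on axes 3 and 4: adjacent, so the new axis replaces them.
adjacent = x[:, :, :, 0, idx]
print(adjacent.shape)   # (2, 3, 4, 5)

# Advanced indices on axes 2 and 4: separated by a slice, so the new axis
# goes to the front.
separated = x[:, :, 0, :, idx]
print(separated.shape)  # (5, 2, 3, 5)

# Option 1: move the leading axis to the end after indexing.
moved = np.moveaxis(separated, 0, -1)
print(moved.shape)      # (2, 3, 5, 5)

# Option 2: apply the integer first (basic indexing), then the list.
two_step = x[:, :, 0][..., idx]
print(np.array_equal(moved, two_step))  # True
```

The two-step form avoids the ambiguity altogether, because each indexing operation contains at most one advanced index.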
Note: Behind the scenes, NumPy always inserts the new axes at the start. It then transposes the array to move the new axes into place when that is unambiguous (mostly for convenience, I believe).
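The placement rule described in this answer can be written down as a small shape predictor. This is only a sketch of the rule, not NumPy's implementation. `predict_shape` is a hypothetical helper that assumes one index entry per axis, that entries are slices, integers, or lists, and that at least one entry is a list. It uses `np.broadcast_shapes`, which requires NumPy 1.20 or later.

```python
import numpy as np

def predict_shape(shape, index):
    """Predict x[index].shape for a tuple of slices, ints and lists.

    Hypothetical helper: assumes len(index) == len(shape) and that at
    least one entry is a list (otherwise ints are plain basic indexing).
    """
    # Positions of the advanced indices (ints and lists).
    adv = [i for i, ix in enumerate(index) if not isinstance(ix, slice)]
    # Ints and lists broadcast together into one block of new axes.
    new_axes = list(np.broadcast_shapes(*(np.shape(index[i]) for i in adv)))
    # Sliced axes survive, in order, with the length the slice selects.
    kept = {i: len(range(*ix.indices(shape[i])))
            for i, ix in enumerate(index) if isinstance(ix, slice)}
    before = [n for i, n in kept.items() if i < adv[0]]
    after = [n for i, n in kept.items() if i > adv[0]]
    # Adjacent advanced indices: new axes take their place.
    # Separated advanced indices: new axes go to the front.
    if adv == list(range(adv[0], adv[-1] + 1)):
        return tuple(before + new_axes + after)
    return tuple(new_axes + before + after)

x = np.arange(720).reshape((2, 3, 4, 5, 6))
s = slice(None)
idx = [0, 1, 2, 4, 5]

print(predict_shape(x.shape, (s, s, s, 0, idx)))  # (2, 3, 4, 5)
print(predict_shape(x.shape, (s, s, 0, s, idx)))  # (5, 2, 3, 5)
```

Both predictions match the shapes reported in the question.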
Also interesting is NumPy Enhancement Proposal 21 (NEP 21).