
numpy indexing: shouldn't trailing Ellipsis be redundant?


While trying to properly understand numpy indexing rules I stumbled across the following. I used to think that a trailing Ellipsis in an index does nothing. Trivial, isn't it? Except it's not actually true:

Python 3.5.2 (default, Nov 11 2016, 04:18:53) 
[GCC 4.8.5] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> 
>>> D2 = np.arange(4).reshape((2, 2))
>>>
>>> D2[[1, 0]].shape; D2[[1, 0], ...].shape
(2, 2)
(2, 2)
>>> D2[:, [1, 0]].shape; D2[:, [1, 0], ...].shape
(2, 2)
(2, 2)
>>> # so far so expected; now
... 
>>> D2[[[1, 0]]].shape; D2[[[1, 0]], ...].shape
(2, 2)
(1, 2, 2)
>>> # ouch!
...
>>> D2[:, [[1, 0]]].shape; D2[:, [[1, 0]], ...].shape
(2, 1, 2)
(2, 1, 2)

Now could someone in the know advise me as to whether this is a bug or a feature? And if the latter, what's the rationale?

Thanks in advance, Paul


Solution

  • Evidently there's some ambiguity in the interpretation of the [[1, 0]] index. Possibly the same thing discussed here:

    Advanced slicing when passed list instead of tuple in numpy

    I'll try a different array, to see if it makes things any clearer:

    In [312]: D2=np.array([[0,0],[1,1],[2,2]])
    In [313]: D2
    Out[313]: 
    array([[0, 0],
           [1, 1],
           [2, 2]])
    
    In [316]: D2[[[1,0,0]]]
    Out[316]: 
    array([[1, 1],
           [0, 0],
           [0, 0]])
    In [317]: _.shape
    Out[317]: (3, 2)
    

    Adding : or ..., or making the index list an array, causes it to be treated as a (1,3) index in each case, and the dimensions of the result expand accordingly:

    In [318]: D2[[[1,0,0]],:]
    Out[318]: 
    array([[[1, 1],
            [0, 0],
            [0, 0]]])
    In [319]: _.shape
    Out[319]: (1, 3, 2)
    In [320]: D2[np.array([[1,0,0]])]
    Out[320]: 
    array([[[1, 1],
            [0, 0],
            [0, 0]]])
    In [321]: _.shape
    Out[321]: (1, 3, 2)
    

    Note that if I apply transpose to the indexing array I get a (3,1,2) result

    In [323]: D2[np.array([[1,0,0]]).T,:]
    ...
    In [324]: _.shape
    Out[324]: (3, 1, 2)
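
    The shapes above follow one rule once the index is an explicit array: the result shape is the shape of the index array followed by the remaining axes of D2. A minimal sketch, assuming the same 3x2 D2 as above; idx is a name introduced here for illustration, and explicit arrays are used so the outcome should not depend on how a particular NumPy version reads a bare nested list:

```python
import numpy as np

D2 = np.array([[0, 0], [1, 1], [2, 2]])
idx = np.array([[1, 0, 0]])        # illustrative name; shape (1, 3)

plain = D2[idx]                    # array index on axis 0
with_slice = D2[idx, :]            # same index followed by a slice
with_ellipsis = D2[idx, ...]       # same index followed by Ellipsis
transposed = D2[idx.T, :]          # (3, 1) index instead of (1, 3)

# In every case: result shape == index shape + remaining axes of D2.
for result, index in [(plain, idx), (with_slice, idx),
                      (with_ellipsis, idx), (transposed, idx.T)]:
    assert result.shape == index.shape + D2.shape[1:]

print(plain.shape, transposed.shape)
```

    With an array index, the trailing : or ... really is redundant; the surprise in the question only arises for bare nested lists.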
    

    Without : or ..., it appears to strip off one layer of [] before applying it to the 1st axis:

    In [330]: D2[[1,0,0]].shape
    Out[330]: (3, 2)
    In [331]: D2[[[1,0,0]]].shape
    Out[331]: (3, 2)
    In [333]: D2[[[[1,0,0]]]].shape
    Out[333]: (1, 3, 2)
    In [334]: D2[[[[[1,0,0]]]]].shape
    Out[334]: (1, 1, 3, 2)
    In [335]: D2[np.array([[[[1,0,0]]]])].shape
    Out[335]: (1, 1, 1, 3, 2)
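
    Since the number of [] layers that get stripped is easy to miscount, one way to sidestep the ambiguity is to spell out the intent: pass an explicit tuple when the list is meant as the index for axis 0, and an explicit array when the extra dimension is wanted. A small sketch under that assumption (rows, per_axis and as_array are illustrative names, not from the session above):

```python
import numpy as np

D2 = np.array([[0, 0], [1, 1], [2, 2]])
rows = [1, 0, 0]

# Explicit one-element tuple: rows is the advanced index for axis 0.
per_axis = D2[(rows,)]

# Explicit 2-d array: the index itself has shape (1, 3).
as_array = D2[np.asarray([rows])]

print(per_axis.shape)   # (3, 2)
print(as_array.shape)   # (1, 3, 2)
```

    Neither spelling relies on the top-level list being reinterpreted, so both should give the same shapes regardless of how a given NumPy release handles D2[[[1,0,0]]].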
    

    I think there's a backward compatibility issue here. We know that the tuple layer is 'redundant': D2[(1,2)] is the same as D2[1,2]. But for compatibility with early versions of numpy (Numeric), that first [] layer may be treated in the same way.
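
    The redundancy of the tuple layer is easy to check directly. A short sketch, using in-bounds indices for the 3x2 D2 from this answer (the particular indices are chosen here only for illustration):

```python
import numpy as np

D2 = np.array([[0, 0], [1, 1], [2, 2]])

# Comma-separated indices are packed into a tuple before indexing,
# so writing the tuple explicitly selects the same element.
scalar_plain = D2[2, 1]
scalar_tuple = D2[(2, 1)]

# The same holds when the tuple contains a slice.
column_plain = D2[:, 0]
column_tuple = D2[(slice(None), 0)]

print(scalar_plain == scalar_tuple)
print(np.array_equal(column_plain, column_tuple))
```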

    In that November question, I noted:

    So at the top level a list and a tuple are treated the same, if the list can't be interpreted as an advanced indexing list.

    Adding a ... is another way to keep D2[[[0,1]]] from being read as D2[([0,1],)].

    In @eric's pull request, seberg explains:

     The tuple normalization is a rather small thing (it basically checks for a non-array sequence of length <= np.MAXDIMS, and if it contains another sequence, slice or None consider it a tuple).

    [[1,2]] is a one-element list containing a list, so it is treated as a tuple, i.e. ([1,2],). [[1,2]],... is already a tuple, so no layer is stripped and [[1,2]] remains a (1,2) index.
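
    To make the quoted rule concrete, here is a pure-Python sketch of the normalization as the quote describes it. This is not NumPy's actual code: legacy_index_as_tuple is a hypothetical helper, MAXDIMS = 32 is assumed to match np.MAXDIMS in the NumPy releases discussed here, and only the cases named in the quote (nested sequence, slice, None) are handled.

```python
import numpy as np
from collections.abc import Sequence

MAXDIMS = 32  # assumption: value of np.MAXDIMS in the releases discussed


def legacy_index_as_tuple(index):
    """Hypothetical sketch of the quoted rule; not NumPy's implementation."""
    if isinstance(index, tuple):
        return index  # already a tuple: nothing is stripped
    # ndarray is not a Sequence, so arrays fall through to the last line.
    plain_sequence = (isinstance(index, Sequence)
                      and not isinstance(index, (str, bytes)))
    if plain_sequence and len(index) <= MAXDIMS:
        for item in index:
            nested = (isinstance(item, Sequence)
                      and not isinstance(item, (str, bytes)))
            if nested or item is None or isinstance(item, slice):
                return tuple(index)  # the outer list acts like a tuple
    return (index,)  # otherwise the whole object is one index for axis 0


D2 = np.array([[0, 0], [1, 1], [2, 2]])

print(legacy_index_as_tuple([[1, 0, 0]]))
print(D2[legacy_index_as_tuple([[1, 0, 0]])].shape)
print(D2[legacy_index_as_tuple(([[1, 0, 0]], Ellipsis))].shape)
```

    Under this sketch the bare [[1,0,0]] loses its outer layer and yields a (3, 2) result, while the version with a trailing Ellipsis is already a tuple and keeps the (1, 3) index, which reproduces the difference the question asks about.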