Search code examples
pythonnumpymultidimensional-arraymatrix-indexing

Numpy advanced selection not working


Can someone please help me to understand why sometimes the advanced selection doesn't work and what I can do to get it to work (2nd case)?

>>> import numpy as np
>>> b = np.random.rand(5, 14, 3, 2)

# advanced selection works as expected
>>> b[[0,1],[0,1]]
array([[[ 0.7575555 ,  0.18989068],
        [ 0.06816789,  0.95760398],
        [ 0.88358107,  0.19558106]],

       [[ 0.62122898,  0.95066355],
        [ 0.62947885,  0.00297711],
        [ 0.70292323,  0.2109297 ]]])

# doesn't work - why?
>>> b[[0,1],[0,1,2]]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: shape mismatch: objects cannot be broadcast to a single shape

# but this seems to work
>>> b[:,[0,1,2]]
array([[[[  7.57555496e-01,   1.89890676e-01],
         [  6.81678915e-02,   9.57603975e-01],
         [  8.83581071e-01,   1.95581063e-01]],

        [[  2.24896112e-01,   4.77818599e-01],
         [  4.29313861e-02,   8.61578045e-02],
         [  4.80092364e-01,   3.66821618e-01]],
...

Update

Breaking up the selection seems to resolve the problem, but I am unsure why this is necessary (or if there's a better way to achieve this).

>>> b.shape
(5, 14, 3, 2)
>>> b[[0,1]].shape
(2, 14, 3, 2)

# trying to separate indexing by dimension.
>>> b[[0,1]][:,[0,1,2]]
array([[[[ 0.7575555 ,  0.18989068],
         [ 0.06816789,  0.95760398],
         [ 0.88358107,  0.19558106]],

        [[ 0.22489611,  0.4778186 ],
         [ 0.04293139,  0.0861578 ],

Solution

  • You want

    b[np.ix_([0, 1], [0, 1, 2])]
    

    You also need to do the same thing for b[[0, 1], [0, 1]], because that's not actually doing what you think it is:

    b[np.ix_([0, 1], [0, 1])]
    

    The problem here is that advanced indexing does something completely different from what you think it does. You've made the mistake of thinking that b[[0, 1], [0, 1, 2]] means "take all parts b[i, j] of b where i is 0 or 1 and j is 0, 1, or 2". This is a reasonable mistake to make, considering that it seems to work that way when you have one list in the indexing expression, like

    b[:, [1, 3, 5], 2]
    

    In fact, for an array A and one-dimensional integer arrays I and J, A[I, J] is an array where

    A[I, J][n] == A[I[n], J[n]]
    

    This generalizes in the natural way to more index arrays, so for example

    A[I, J, K][n] == A[I[n], J[n], K[n]]
    

    and to higher-dimensional index arrays, so if I and J are two-dimensional, then

    A[I, J][m, n] == A[I[m, n], J[m, n]]
    

    It also applies the broadcasting rules to the index arrays, and converts lists in the indexes to arrays. This is much more powerful than what you expected to happen, but it means that to do what you were trying to do, you need something like

    b[[[0],
       [1]], [[0, 1, 2]]]
    

    np.ix_ is a helper that will do that for you so you don't have to write a dozen brackets.