
Weird "too many indices for array" error in python


Let's create a large np array 'a' with 10,000 entries

import numpy as np
a = np.arange(0, 10000)

Let's index the array with n index arrays covering positions 0->9, 1->10, 2->11, etc.

n = 32
b = list(map(lambda x:np.arange(x, x+10), np.arange(0, n)))
c = a[b]

The weird thing is that if n is smaller than 32, I get the error "IndexError: too many indices for array". If n is greater than or equal to 32, the code works perfectly. The error occurs regardless of the size of the initial array or of the individual slices, but the threshold is always 32. Note that if n == 1, the code also works.

Any idea what is causing this? Thank you.


Solution

  • Your b is a list of arrays:

    In [84]: b = list(map(lambda x:np.arange(x, x+10), np.arange(0, 5)))            
    In [85]: b                                                                      
    Out[85]: 
    [array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
     array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10]),
     array([ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11]),
     array([ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12]),
     array([ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13])]
    

    When used as an index:

    In [86]: np.arange(1000)[b]                                                     
    /usr/local/bin/ipython3:1: FutureWarning: Using a non-tuple sequence for multidimensional 
    indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. 
    In the future this will be interpreted as an array index, `arr[np.array(seq)]`, 
    which will result either in an error or a different result.
      #!/usr/bin/python3
    ---------------------------------------------------------------
    IndexError: too many indices for array
    

    A[1,2,3] is the same as A[(1,2,3)]. That is, the comma-separated indices form a tuple, which is then passed on to the indexing function. To put it another way, a multidimensional index should be a tuple (and that includes ones containing slices).
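
    As a small illustration of that equivalence, the sketch below builds a 3d array (the name A and its shape are chosen here only for the example) and compares the comma form with an explicit tuple, including one that contains a slice:

```python
import numpy as np

# Illustrative 3d array; the shape is an arbitrary choice for this sketch.
A = np.arange(24).reshape(2, 3, 4)

# Comma-separated indices are packed into a tuple before NumPy sees them.
idx = (1, 2, 3)
same_element = bool(A[1, 2, 3] == A[idx])

# Slices behave the same way: A[0, :, 1] is A[(0, slice(None), 1)].
same_slice = np.array_equal(A[0, :, 1], A[(0, slice(None), 1)])

print(same_element, same_slice)
```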

    Up to now numpy has been a bit sloppy, and allowed us to use a list of indices in the same way. The warning tells us that the developers are in the process of tightening up those restrictions.

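    The error means NumPy is trying to interpret each array in your list as the index for a separate dimension, and a 1d array has only one dimension to index. With n == 1 that interpretation happens to be valid, since a single index array fits the single dimension. An array in this NumPy version can have at most 32 dimensions, so a list of 32 or more items cannot be a valid tuple index. Evidently for the longer list it doesn't try to treat it as a tuple, and instead creates a 2d array for indexing.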
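
    The two interpretations can be made explicit, which avoids depending on how a bare list is handled. This is a minimal sketch: it spells out the tuple form and the array form by hand rather than passing the list directly, and the exact wording of the error message may vary between NumPy versions.

```python
import numpy as np

a = np.arange(1000)
b = [np.arange(x, x + 10) for x in range(5)]

# Explicit tuple: five index arrays, one per dimension, but `a` is 1d.
try:
    a[tuple(b)]
    tuple_error = None
except IndexError as exc:
    tuple_error = str(exc)

# Explicit array: a single 2d integer index into the one dimension of `a`.
c = a[np.array(b)]

print(tuple_error)
print(c.shape)  # (5, 10)
```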

    There are various ways we can use your b to index a 1d array:

    In [87]: np.arange(1000)[np.hstack(b)]                                          
    Out[87]: 
    array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  1,  2,  3,  4,  5,  6,  7,
            8,  9, 10,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11,  3,  4,  5,  6,
            7,  8,  9, 10, 11, 12,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13])
    
    In [89]: np.arange(1000)[np.array(b)]    # or np.vstack(b)                                       
    Out[89]: 
    array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
           [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
           [ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
           [ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
           [ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13]])
    
    In [90]: np.arange(1000)[b,]             # 1d tuple containing b                                       
    Out[90]: 
    array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
           [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
           [ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
           [ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
           [ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13]])
    

    Note that if b is a ragged list (one or more of the arrays is shorter than the others), only the hstack version works.
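
    A short sketch of the ragged case follows. The list named ragged is a made-up input for illustration, and the per-array loop is just one possible way to keep the groups separate, not something taken from the question:

```python
import numpy as np

a = np.arange(1000)

# Hypothetical ragged input: the second index array is shorter than the others.
ragged = [np.arange(0, 10), np.arange(1, 4), np.arange(2, 12)]

# hstack concatenates the pieces into one flat 1d index, so lengths may differ.
flat = a[np.hstack(ragged)]
print(flat.shape)  # (23,)

# To keep the groups separate, index with each array in turn.
pieces = [a[idx] for idx in ragged]
print([len(p) for p in pieces])
```

    The flat result loses the grouping, so the list of per-array results is the one to use when each slice has to be processed on its own.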