Tags: performance, numpy, indexing, vectorization, matrix-indexing

Creating a slice of a matrix from a vector in Numpy


Suppose I have a matrix A of size 5 by 4 and a vector b of length 5, where each element of b indicates how many values I need from the corresponding row of A. Each value in b is therefore bounded above by the size of A's second dimension. My problem is how to slice the matrix given such a vector: a generalization of taking the first n elements of a vector by writing vector[:n], with a different n for each row.

For example, this can be implemented with a loop over A's rows:

import numpy
A = numpy.arange(20).reshape((5, 4))
b = numpy.array([0, 3, 3, 2, 3])
# Start with the first b[0] values of row 0, then append each later row's prefix.
output = A[0, :b[0]]
for i in range(1, A.shape[0]):
    output = numpy.concatenate((output, A[i, :b[i]]), axis=0)
# output is array([ 4,  5,  6,  8,  9, 10, 12, 13, 16, 17, 18])

This loop can be quite inefficient when dealing with a very large array. Furthermore, my eventual goal is to apply this in Theano without a scan operation. I want to avoid using a loop to make a slice given a vector.


Solution

  • Another good setup for using NumPy broadcasting! With NumPy imported as np:

    A[b[:,None] > np.arange(A.shape[1])]
    

    Sample run

    1) Inputs :

    In [16]: A
    Out[16]: 
    array([[ 0,  1,  2,  3],
           [ 4,  5,  6,  7],
           [ 8,  9, 10, 11],
           [12, 13, 14, 15],
           [16, 17, 18, 19]])
    
    In [17]: b
    Out[17]: array([0, 3, 3, 2, 3])
    

    2) Use broadcasting to create a mask for selection :

    In [18]: b[:,None] > np.arange(A.shape[1])
    Out[18]: 
    array([[False, False, False, False],
           [ True,  True,  True, False],
           [ True,  True,  True, False],
           [ True,  True, False, False],
           [ True,  True,  True, False]], dtype=bool)
    

    3) Finally use boolean indexing to select the elements from A :

    In [19]: A[b[:,None] > np.arange(A.shape[1])]
    Out[19]: array([ 4,  5,  6,  8,  9, 10, 12, 13, 16, 17, 18])
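
For anyone who wants to try this outside an interactive session, here is a minimal self-contained sketch of the three steps above. The function names take_row_prefixes and take_row_prefixes_loop are made up for illustration and are not part of NumPy. The loop version is included only as a reference to check the result against. The sketch assumes A is 2-D and b holds one non-negative count per row.

```python
import numpy as np


def take_row_prefixes(A, b):
    """Return the first b[i] entries of each row i of A as one flat array.

    Hypothetical helper name; assumes A is 2-D and len(b) == A.shape[0].
    """
    A = np.asarray(A)
    b = np.asarray(b)
    # b[:, None] has shape (nrows, 1); the column indices have shape (ncols,).
    # Comparing them broadcasts to an (nrows, ncols) boolean mask that is
    # True wherever the column index is smaller than that row's count.
    mask = b[:, None] > np.arange(A.shape[1])
    # Boolean indexing returns the selected elements in row-major order,
    # which matches concatenating the row prefixes one after another.
    return A[mask]


def take_row_prefixes_loop(A, b):
    """Reference version with an explicit Python loop, for comparison only."""
    return np.concatenate([A[i, :b[i]] for i in range(A.shape[0])])


A = np.arange(20).reshape(5, 4)
b = np.array([0, 3, 3, 2, 3])

result = take_row_prefixes(A, b)
print(result)
print(np.array_equal(result, take_row_prefixes_loop(A, b)))
```

The mask costs one boolean per element of A, so this trades some temporary memory for avoiding the Python-level loop.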