python slice theano

Slicing in Theano converts a matrix into a vector


Consider the following code snippet:

import theano
import theano.tensor
import numpy as np

batch_shape = (50, 40, 30, 30)
batch_size = batch_shape[0]
ncols = batch_shape[1]*batch_shape[2]*batch_shape[3]
minibatch = theano.tensor.tensor4(name='minibatch', 
                                  dtype=theano.config.floatX)
xflat = minibatch.reshape((batch_size,ncols))

partition = np.array([1, 2, 3])
xsub1 = xflat[:,partition]

partition = np.array([1])
xsub2 = xflat[:,partition]

print "xsub1.type: ", xsub1.type
print "xsub2.type: ", xsub2.type

If you run it, you get the following output:

xsub1.type: TensorType(float64, matrix)
xsub2.type: TensorType(float64, col)

Apparently, indexing with an array of length 1 turns xsub2 into a col instead of a matrix. How can I make xsub2 a matrix?


Solution

  • A col or "column vector" is the name Theano uses for a symbolic matrix that it knows contains only one column. It should be possible to use it just like a matrix.

    Theano often doesn't know the shape of a particular symbolic tensor, only its dimensionality. However, in some circumstances, such as that given in the question, Theano is able to infer that a tensor has a particular special case of shape and can sometimes use this information to optimize the computation. This is why col (and row) exist as special cases of matrix.

    If you look at the shape rather than the type, you'll see that Theano behaves just the same as NumPy:

    import theano
    import theano.tensor
    import numpy as np
    
    
    def compute(minibatch):
        xflat = minibatch.reshape((minibatch.shape[0], -1))
        partition = np.array([1, 2, 3])
        xsub1 = xflat[:, partition]
        partition = np.array([1])
        xsub2 = xflat[:, partition]
        return xsub1, xsub2
    
    
    def compile_theano_version():
        minibatch = theano.tensor.tensor4(name='minibatch', dtype=theano.config.floatX)
        xsub1, xsub2 = compute(minibatch)
        print xsub1.type, xsub2.type
        return theano.function([minibatch], [xsub1, xsub2])
    
    
    def numpy_version(minibatch):
        return compute(minibatch)
    
    
    def main():
        batch_shape = (50, 40, 30, 30)
        minibatch = np.random.standard_normal(size=batch_shape).astype(theano.config.floatX)
    
        xsub1, xsub2 = numpy_version(minibatch)
        print xsub1.shape, xsub2.shape
    
        theano_version = compile_theano_version()
        xsub1, xsub2 = theano_version(minibatch)
        print xsub1.shape, xsub2.shape
    
    
    main()
    

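    This prints the NumPy shapes first, then the Theano types (printed when the function is compiled), then the shapes returned by the compiled Theano function: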

    (50L, 3L) (50L, 1L)
    TensorType(float64, matrix) TensorType(float64, col)
    (50L, 3L) (50L, 1L)
    

    So a col is indeed a matrix with one column and not a vector.
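
For comparison, here is a minimal NumPy-only sketch of what would actually drop the column dimension. The small `xflat` array is a made-up stand-in for the flattened minibatch, not the data from the question. Indexing with a length-1 array keeps two dimensions, while indexing with a plain integer yields a one-dimensional result. Theano's symbolic indexing should follow the same rule, but only the NumPy side is shown here.

```python
import numpy as np

# Hypothetical stand-in for the flattened minibatch: 5 rows, 4 columns.
xflat = np.arange(20).reshape((5, 4))

# Indexing with a length-1 integer array keeps both dimensions,
# so the result is a matrix with a single column.
by_array = xflat[:, np.array([1])]

# Indexing with a plain integer drops the column dimension,
# so the result is a true one-dimensional vector.
by_int = xflat[:, 1]

print(by_array.ndim)  # 2
print(by_int.ndim)    # 1
```

The two results hold the same values. Only the number of dimensions differs, which is the distinction between a col and a vector.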