
Theano row/column wise subtraction


X is an n by d matrix and W is an m by d matrix. For every row in X, I want to compute the squared Euclidean distance to every row in W, so the result will be an n by m matrix.
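To pin down the target, here is a minimal pure-Python sketch of the n by m result using explicit loops. The helper name `pairwise_sq_dists` is hypothetical and only for illustration; it is a reference for checking a vectorized version, not a suggested implementation.

```python
def pairwise_sq_dists(X, W):
    """Return an n-by-m nested list of squared Euclidean distances.

    X is a sequence of n rows and W a sequence of m rows, each of length d.
    (Hypothetical helper name, used only for illustration.)
    """
    return [
        [sum((xi - wi) ** 2 for xi, wi in zip(x_row, w_row)) for w_row in W]
        for x_row in X
    ]


# Two rows in X against a single row in W: one column per row of W.
print(pairwise_sq_dists([[1, 2, 3], [2, 2, 2]], [[2, 2, 2]]))
# [[2], [0]]
```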

If there's only one row in W, this is easy:

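import theano
from theano import tensor

x = tensor.TensorType("float64", [False, False])()  # n by d matrix
w = tensor.TensorType("float64", [False])()         # single row of length d
z = tensor.sum((x-w)**2, axis=1)                    # w broadcasts across the rows of x
fn = theano.function([x, w], z)
print(fn([[1,2,3], [2,2,2]], [2,2,2]))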
# [ 2.  0.]

What do I do when W is a matrix (in Theano)?


Solution

  • Short answer: use scipy.spatial.distance.cdist (pass metric='sqeuclidean' for squared distances).

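    Long answer, if you don't have scipy: broadcast the subtraction and then take the norm along the last axis (the d axis). With X of shape (n, d) and W of shape (m, d), the difference has shape (n, m, d).

    np.linalg.norm(X[:,None,:]-W[None,:,:], axis=2)

    Really long answer, if you have an ancient version of numpy without a vectorizable linalg.norm (e.g. you're using Abaqus):

    np.sum((X[:,None,:]-W[None,:,:])**2, axis=2)**0.5

    Both expressions return the plain Euclidean distance; square the result (or drop the final **0.5) to get the squared distance.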

    Edit by OP
    In Theano we can make X and W both 3D tensors and mark the corresponding axes as broadcastable:

    x = tensor.TensorType("float64", [False, True, False])()
    w = tensor.TensorType("float64", [True, False, False])()
    
    z = tensor.sum((x-w)**2, axis=2)
    
    fn = theano.function([x, w], z)
    print(fn([[[0,1,2]], [[1,2,3]]], [[[1,1,1], [2,2,2]]]))
    # [[ 2.  5.]
    #  [ 5.  2.]]
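For comparison, the same broadcast pattern can be sketched in plain NumPy, which is handy for checking the Theano result without compiling a function. This is a minimal sketch that assumes X and W are ordinary 2-D arrays of shape (n, d) and (m, d); the variable names `diff` and `sq_dists` are illustrative only.

```python
import numpy as np

# Same data as the Theano example, but as plain 2-D arrays.
X = np.array([[0, 1, 2], [1, 2, 3]], dtype=np.float64)  # shape (n, d)
W = np.array([[1, 1, 1], [2, 2, 2]], dtype=np.float64)  # shape (m, d)

# Insert length-1 axes so the shapes are (n, 1, d) and (1, m, d);
# broadcasting then yields an (n, m, d) array of differences.
diff = X[:, None, :] - W[None, :, :]

# Sum the squared differences over the feature axis to get (n, m).
sq_dists = np.sum(diff ** 2, axis=2)

print(sq_dists.tolist())
# [[2.0, 5.0], [5.0, 2.0]]
```

Note that the intermediate `diff` array holds n * m * d values, so for large inputs this trades memory for simplicity.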