I suspect that this has already been asked, although in my searching, many of the other problems had specific unique issues that don't seem as applicable to my situation (or maybe the solutions were over my head).
I have a standard feed-forward neural net in TensorFlow, which behaves correctly with a rank-2 input tensor of shape [None, n_features] and weights of shape [n_features, n_neurons], so that the hidden layer tf.matmul(inputs, weight) has shape [None, n_neurons].
However, I would like to add one dimension to both the inputs and the outputs. For example, I want to have
inputs = tf.placeholder("float", shape=[None, n_type, n_features])
weight= tf.Variable(FNN_weight_initializer([n_type, n_features, n_neurons]))
Hidden1 = tf.matmul(inputs, weight)
And my end goal here is to have Hidden1 of shape [None, n_type, n_neurons].
However, instead of the desired shape, I get a tensor of shape [n_type, n_type, n_neurons]. I'm not an expert at linear algebra, and I've tried a few combinations of dimension order with no success. Is it even possible to multiply rank-3 tensors with tf.matmul? Should I be doing a reshape or transpose operation somewhere?
EDIT (following OP's comment)
You could flatten the input feature vectors to shape [-1, n_type * n_features], apply a well-chosen matrix multiplication, and reshape the output from [-1, n_type * n_neurons] to [-1, n_type, n_neurons]. The operation tensor would be a block-diagonal [n_type * n_features, n_type * n_neurons] matrix, each block being one of the n_type matrices in weights.
To build a block-diagonal matrix, I used another answer (from here)
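That helper isn't reproduced here, but for intuition, here is a minimal NumPy sketch of what a block-diagonal builder does (the function name and NumPy formulation are my own, not taken from the referenced answer):

```python
import numpy as np

def block_diagonal(blocks):
    """Assemble a block-diagonal matrix from a list of 2-D arrays.

    Each block is placed on the diagonal; everything else is zero.
    NumPy sketch only -- the TensorFlow helper referenced above
    does the equivalent with tensor ops.
    """
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols), dtype=blocks[0].dtype)
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out
```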
This would look like
inputs = tf.placeholder("float", shape=[None, n_type, n_features])
inputs = tf.reshape(inputs, shape=[-1, n_type * n_features])
weights = tf.Variable(FNN_weight_initializer([n_type, n_features, n_neurons]))
split_weights = tf.split(weights, num_or_size_splits=n_type, axis=0)
# each element of split_weights is a tensor of shape [1, n_features, n_neurons] -> squeeze out the leading 1
split_weights = [tf.squeeze(w, axis=0) for w in split_weights]
block_matrix = block_diagonal(split_weights)  # from the abovementioned reference
Hidden1 = tf.matmul(inputs, block_matrix)
# shape : [None, n_type * n_neurons]
Hidden1 = tf.reshape(Hidden1, [-1, n_type, n_neurons])
# shape : [None, n_type, n_neurons]
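To convince yourself that the flatten-and-block-diagonal route computes the same thing as n_type independent matrix products, here is a small NumPy check (the sizes are arbitrary placeholders, and the block matrix is assembled inline rather than via the helper above):

```python
import numpy as np

batch, n_type, n_features, n_neurons = 4, 3, 5, 2
rng = np.random.default_rng(0)
x = rng.standard_normal((batch, n_type, n_features))
w = rng.standard_normal((n_type, n_features, n_neurons))

# Direct per-type products: for each t, x[:, t, :] @ w[t].
direct = np.einsum('btf,tfn->btn', x, w)

# Block-diagonal route: flatten, one big matmul, reshape back.
block = np.zeros((n_type * n_features, n_type * n_neurons))
for t in range(n_type):
    block[t * n_features:(t + 1) * n_features,
          t * n_neurons:(t + 1) * n_neurons] = w[t]
flat = x.reshape(batch, n_type * n_features) @ block
via_block = flat.reshape(batch, n_type, n_neurons)

assert np.allclose(direct, via_block)
```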
Original answer
According to the documentation of tf.matmul (reference), the tensors you are multiplying must be of the same rank. When the rank is greater than 2, only the last two dimensions need to be compatible for matrix multiplication; all the preceding dimensions must match exactly.
So, to the question "Is it even possible to multiply rank-3 tensors with tf.matmul?", the answer is "Yes, it is possible, but conceptually it is still a rank-2 multiplication, applied independently over the leading dimensions".
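A quick NumPy illustration of those shape rules (np.matmul follows the same batched semantics as tf.matmul here):

```python
import numpy as np

# Rank-3 matmul: the leading (batch) dimensions must match exactly;
# only the trailing two dimensions follow matrix-multiplication rules.
a = np.ones((7, 2, 3))   # batch of 7 matrices of shape [2, 3]
b = np.ones((7, 3, 4))   # batch of 7 matrices of shape [3, 4]
c = np.matmul(a, b)      # each pair multiplied independently
print(c.shape)           # (7, 2, 4)
```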
Therefore, some reshaping is necessary:
inputs = tf.placeholder("float", shape=[None, n_type, n_features])
inputs = tf.reshape(inputs, shape=[-1, n_type, 1, n_features])
# shape : [None, n_type, 1, n_features]
weights = tf.Variable(FNN_weight_initializer([n_type, n_features, n_neurons]))
weights = tf.expand_dims(weights, 0)
# shape : [1, n_type, n_features, n_neurons]
weights = tf.tile(weights, [tf.shape(inputs)[0], 1, 1, 1])
# shape : [None, n_type, n_features, n_neurons]
Hidden1 = tf.matmul(inputs, weights)
# shape : [None, n_type, 1, n_neurons]
Hidden1 = tf.reshape(Hidden1, [-1, n_type, n_neurons])
# shape : [None, n_type, n_neurons]
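As an aside, if your TensorFlow version has tf.einsum, the whole expand/tile/matmul/reshape sequence can likely be collapsed into a single contraction, something like Hidden1 = tf.einsum('btf,tfn->btn', inputs, weights). Here is a NumPy check of that equivalence (NumPy stands in for TensorFlow; the sizes are arbitrary placeholders):

```python
import numpy as np

batch, n_type, n_features, n_neurons = 4, 3, 5, 2
rng = np.random.default_rng(1)
x = rng.standard_normal((batch, n_type, n_features))
w = rng.standard_normal((n_type, n_features, n_neurons))

# One einsum contraction: h[b, t, n] = sum_f x[b, t, f] * w[t, f, n].
h = np.einsum('btf,tfn->btn', x, w)

# Reference: the expand/tile/matmul/reshape route from the answer above.
w_tiled = np.broadcast_to(w[None], (batch, n_type, n_features, n_neurons))
ref = np.matmul(x[:, :, None, :], w_tiled).reshape(batch, n_type, n_neurons)

assert np.allclose(h, ref)
```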