Tags: tensorflow, embedding, eager, dynet

Gradient error when computing two embeddings in eager mode


When I tried to rewrite a DyNet project in TensorFlow eager mode, the following error occurred:

tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute ConcatV2 as input #1 was expected to be a float tensor but is a int32 tensor [Op:ConcatV2] name: concat

I tried to locate the error and simplify the code, and found that the error occurs whenever two embeddings are computed in one dynamic graph in eager mode.

There is no error when the two embeddings are added in static graph mode:

import tensorflow as tf

with tf.Graph().as_default():
    emb = tf.keras.layers.Embedding(10000, 50)
    emb2 = tf.keras.layers.Embedding(10000, 50)
    # Embed the same index with both layers and add the results.
    y_ = emb(tf.constant(100)) + emb2(tf.constant(100))
    y = tf.ones((1, 50))
    loss = tf.reduce_sum(y - y_)
    optimizer = tf.train.MomentumOptimizer(0.2, 0.5).minimize(loss)
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        sess.run(fetches=[loss, optimizer])

But when I run the following code in eager mode, the error occurs:

import tensorflow as tf
import tensorflow.contrib.eager as tfe

tfe.enable_eager_execution()

def loss(y):
    emb = tf.keras.layers.Embedding(10000, 50)
    emb2 = tf.keras.layers.Embedding(10000, 50)
    # Embed the same index with both layers and add the results.
    y_ = emb(tf.constant(100)) + emb2(tf.constant(100))
    return tf.reduce_sum(y - y_)

y = tf.ones((1, 50))
grads = tfe.implicit_gradients(loss)(y)
tf.train.MomentumOptimizer(0.2, 0.5).apply_gradients(grads)

What's wrong with the code in eager mode, and how can I compute two embeddings in eager mode?


Solution

  • Two things going on here:

    1. I think this is a bug introduced with eager execution; I've filed https://github.com/tensorflow/tensorflow/issues/18180 for it. I don't think the bug exists in release 1.6, so perhaps you could try that in the interim.

    2. That said, I noticed that you're defining the Embedding layer objects inside your loss function. This means that each invocation of loss creates new Embedding layers (with freshly initialized parameters), which is probably not what you want. Instead, you'd probably want to restructure your code as:

      emb = tf.keras.layers.Embedding(10000, 50)
      emb2 = tf.keras.layers.Embedding(10000, 50)

      def loss(y):
          y_ = emb(tf.constant(100)) + emb2(tf.constant(100))
          return tf.reduce_sum(y - y_)

    With eager execution, parameter ownership is more "Pythonic": the parameters associated with an Embedding object (emb and emb2 here) have the lifetime of the object that created them, so they persist across calls to loss as long as the layer objects are alive. A complete end-to-end sketch is included at the end of this answer.

    Hope that helps.
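
    For completeness, here's a minimal end-to-end sketch combining the restructured loss with the training step from the question. It assumes the TF 1.x tf.contrib.eager APIs used above, and it will only run cleanly on a release where the ConcatV2 bug is fixed (e.g. 1.6):

      import tensorflow as tf
      import tensorflow.contrib.eager as tfe

      print(tf.VERSION)  # check you're on a release without the bug, e.g. 1.6
      tfe.enable_eager_execution()

      # Create the layers once; their parameters persist across calls to loss.
      emb = tf.keras.layers.Embedding(10000, 50)
      emb2 = tf.keras.layers.Embedding(10000, 50)

      def loss(y):
          y_ = emb(tf.constant(100)) + emb2(tf.constant(100))
          return tf.reduce_sum(y - y_)

      y = tf.ones((1, 50))

      # implicit_gradients tracks the variables used inside loss and returns
      # (gradient, variable) pairs that apply_gradients accepts directly.
      grads = tfe.implicit_gradients(loss)(y)
      tf.train.MomentumOptimizer(0.2, 0.5).apply_gradients(grads)

      # The embedding matrices are owned by the layer objects themselves:
      print([v.name for v in emb.variables])

    Since the layers now outlive loss, repeated gradient steps update the same two embedding matrices instead of re-initializing them on every call.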