
Using Subtract layer in Keras


I'm implementing in Keras the LSTM architecture described here. I think I am really close, though I still have a problem with combining the shared and language-specific layers. Here is the formula (approximately): y = g * y^s + (1 - g) * y^u
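As a quick sanity check of the formula (with made-up toy values, not the model's actual outputs), the sigmoid gate g interpolates elementwise between the language-specific output y^s and the shared output y^u: g close to 1 selects y^s, g close to 0 selects y^u.

```python
import numpy as np

# Toy vectors: y_s stands in for the language-specific output y^s,
# y_u for the shared (universal) output y^u.
y_s = np.array([1.0, 1.0, 1.0])
y_u = np.array([0.0, 0.0, 0.0])

# Gate values spanning the [0, 1] range of a sigmoid.
g = np.array([1.0, 0.5, 0.0])

# y = g * y^s + (1 - g) * y^u, applied elementwise.
y = g * y_s + (1 - g) * y_u
# -> [1.0, 0.5, 0.0]: full y_s, an even mix, full y_u.
```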

And here is the code I tried:

from keras.layers import Dense, Multiply, Subtract, Add
from keras import backend as K

### Linear Layers ###
univ_linear = Dense(50, activation=None, name='univ_linear')
univ_linear_en = univ_linear(en_encoded)
univ_linear_es = univ_linear(es_encoded)
print(univ_linear_en)

# Gate >> g
gate_en = Dense(50, activation='sigmoid', name='gate_en')(en_encoded)
gate_es = Dense(50, activation='sigmoid', name='gate_es')(es_encoded)
print(gate_en)
print(gate_es)

# EN >> y^s
spec_linear_en = Dense(50, activation=None, name='spec_linear_en')(en_encoded)
print(spec_linear_en)

# g * y^s
gated_spec_linear_en = Multiply()([gate_en, spec_linear_en])
print(gated_spec_linear_en)

# ES >> y^s
spec_linear_es = Dense(50, activation=None, name='spec_linear_es')(es_encoded)
print(spec_linear_es)

# g * y^s
gated_spec_linear_es = Multiply()([gate_es, spec_linear_es])
print(gated_spec_linear_es)

# 1 - Gate >> (1 - g)
only_ones_en = K.ones(gate_en.shape)
univ_gate_en = Subtract()([only_ones_en, gate_en])
print(univ_gate_en)

only_ones_es = K.ones(gate_es.shape)
univ_gate_es = Subtract()([only_ones_es, gate_es])
print(univ_gate_es)

# (1 - g) * y^u
gated_univ_linear_en = Multiply()([univ_gate_en, univ_linear_en])
print(gated_univ_linear_en)
gated_univ_linear_es = Multiply()([univ_gate_es, univ_linear_es])
print(gated_univ_linear_es)

out_en = Add()([gated_spec_linear_en, gated_univ_linear_en])
print(out_en)

out_es = Add()([gated_spec_linear_es, gated_univ_linear_es])
print(out_es)

When I compile my model, I get this error:

AttributeError: 'NoneType' object has no attribute '_inbound_nodes'

However, my model compiles without error when I replace (1 - g) * y^u with g * y^u:

# g * y^u  (instead of (1 - g) * y^u)
gated_univ_linear_en = Multiply()([gate_en, univ_linear_en])
print(gated_univ_linear_en)
gated_univ_linear_es = Multiply()([gate_es, univ_linear_es])
print(gated_univ_linear_es)

Consequently, I think the problem comes from the code under the comment # 1 - Gate >> (1 - g), and specifically from the subtraction (1 - g).

Does anyone have any clue about what exactly the problem is and how I can solve it?


Solution

  • The inputs of a Keras layer must be Keras tensors, i.e. the outputs of previous layers. When you write only_ones_en = K.ones(gate_en.shape), only_ones_en is not a Keras tensor but a plain backend tensor (e.g. a TensorFlow tensor). Keras cannot trace the computation graph through it, which is why it fails with the '_inbound_nodes' error.

    As for your specific example, you can do this much more easily using a Lambda layer:

    univ_gate_en = Lambda(lambda x: 1. - x)(gate_en)
    

    Or maybe in a less efficient way:

    univ_gate_en = Lambda(lambda x: K.ones_like(x) - x)(gate_en)
    

    Or in a much more verbose and maybe less efficient way:

    only_ones_en = Lambda(lambda x: K.ones_like(x))(gate_en)
    univ_gate_en = Subtract()([only_ones_en, gate_en])
    

    The same applies anywhere else you use a K.* tensor directly as the input of a layer.
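Putting it together, here is a minimal sketch of the gated combination for one language branch with the Lambda fix applied. The input dimension (10), output width (5), and layer names are illustrative stand-ins, not the asker's actual model:

```python
from tensorflow.keras.layers import Input, Dense, Lambda, Multiply, Add
from tensorflow.keras.models import Model

# Hypothetical stand-in for the encoder output en_encoded.
en_encoded = Input(shape=(10,), name='en_encoded')

univ_linear = Dense(5, activation=None, name='univ_linear')(en_encoded)  # y^u
spec_linear = Dense(5, activation=None, name='spec_linear')(en_encoded)  # y^s
gate = Dense(5, activation='sigmoid', name='gate')(en_encoded)           # g

# (1 - g) computed inside a Lambda layer, so the result is a Keras tensor.
univ_gate = Lambda(lambda x: 1. - x, name='one_minus_gate')(gate)

# y = g * y^s + (1 - g) * y^u
out = Add()([Multiply()([gate, spec_linear]),
             Multiply()([univ_gate, univ_linear])])

model = Model(inputs=en_encoded, outputs=out)
```

The same pattern repeats for the second language branch, sharing the univ_linear layer as in the question.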