Embedding inside the model vs outside the model

What is the difference between using the embedding layer inside the model and outside the model? I can build the embedding layer into the model:

import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(input_dim=1000, output_dim=64, input_length=10))
..., target ...)

I can also use the embedding outside the model to generate embedded data and then feed it into the model:

embedding_encoder = tf.keras.layers.Embedding(input_dim=1000, output_dim=64)
embedded_features = embedding_encoder(features)
..., target ...)

Does it mean that if I use embedding outside the model, embedding parameters are not learned during the training?


  • Does it mean that if I use embedding outside the model, embedding parameters are not learned during the training?

    The dense vector representations assigned by an Embedding layer are trainable only when the layer has trainable=True, which is the default. If you call the layer outside the model and feed its output in as data, its weights are not among the model's variables, so training the model does not update them. How much preprocessing you do yourself and how much you leave to the Embedding layer is entirely up to you. Usually, if you are working on an NLP task, you can add a StringLookup or TextVectorization layer before the Embedding layer, which lets you preprocess your texts and train on them without any "manual" steps.
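As a rough illustration of the TextVectorization step, here is a minimal sketch that turns raw strings into integer ids and then into embedded vectors. The two example sentences, the vocabulary size of 1000, and the sequence length of 10 are assumptions made up for this example; the layers are called directly rather than stacked in a model to keep the sketch short.

```python
import tensorflow as tf

# Hypothetical toy corpus, only for illustration.
texts = tf.constant(["the cat sat", "the dog barked loudly"])

# Learn a vocabulary from the texts and map each string to 10 integer ids
# (shorter sentences are padded with 0).
vectorizer = tf.keras.layers.TextVectorization(
    max_tokens=1000, output_sequence_length=10
)
vectorizer.adapt(texts)
token_ids = vectorizer(texts)

# input_dim must be at least the vocabulary size used by the vectorizer.
embedding = tf.keras.layers.Embedding(input_dim=1000, output_dim=64)
embedded = embedding(token_ids)

print(token_ids.shape)  # one row of 10 ids per sentence
print(embedded.shape)   # one 64-dimensional vector per id
```

The integer ids produced by the vectorizer are what the Embedding layer expects as input, so no manual tokenization is needed.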


    Each integer value fed to an Embedding layer is mapped to a unique N-dimensional vector representation, where N is chosen by you. By default, these vectors are initialized from a uniform distribution. The Embedding layer inherits from tf.keras.layers.Layer, which provides the trainable parameter.
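The lookup behaviour and the trainable flag can be checked with a small sketch. The ids and the output dimension of 4 are arbitrary values chosen for this example.

```python
import numpy as np
import tensorflow as tf

embedding = tf.keras.layers.Embedding(input_dim=1000, output_dim=4)

# One sequence containing the id 3 twice and the id 7 once.
ids = np.array([[3, 7, 3]])
vectors = embedding(ids).numpy()  # the first call also creates the weights

# The layer holds a single (input_dim, output_dim) lookup table.
table = embedding.get_weights()[0]

# The same id always maps to the same row of that table.
same_id_same_vector = np.array_equal(vectors[0, 0], vectors[0, 2])
row_matches_table = np.array_equal(vectors[0, 1], table[7])

# trainable is inherited from tf.keras.layers.Layer and defaults to True.
trainable_by_default = embedding.trainable

# Freezing the layer removes its table from the trainable weights.
embedding.trainable = False
num_trainable_after_freeze = len(embedding.trainable_weights)
```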

    I think it could make sense to generate embedding data outside your model if you are, for example, using pretrained context-sensitive vectors and you do not want to update their values during training. But again, it’s all up to you.
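To make the difference between the two setups concrete, here is a minimal sketch that trains one small model on pre-embedded data and another with the Embedding layer inside it, then compares the embedding weights before and after training. The random features and targets, the pooling and dense layers, and the optimizer settings are all assumptions picked for this example, not something taken from the question.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: 32 sequences of 10 token ids with binary targets.
features = np.random.randint(0, 1000, size=(32, 10))
target = np.random.randint(0, 2, size=(32, 1)).astype("float32")

def make_head():
    # Small classifier placed on top of the embedded sequences.
    return [
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ]

# Case 1: embedding applied outside the model; the model only sees its output.
embedding_encoder = tf.keras.layers.Embedding(input_dim=1000, output_dim=64)
embedded_features = embedding_encoder(features).numpy()
outer_before = embedding_encoder.get_weights()[0].copy()

outer_model = tf.keras.Sequential(make_head())
outer_model.compile(optimizer="adam", loss="binary_crossentropy")
outer_model.fit(embedded_features, target, epochs=2, verbose=0)
outside_unchanged = np.array_equal(
    outer_before, embedding_encoder.get_weights()[0]
)

# Case 2: embedding is the first layer of the model.
inner_embedding = tf.keras.layers.Embedding(input_dim=1000, output_dim=64)
inner_model = tf.keras.Sequential([inner_embedding] + make_head())
inner_model.compile(optimizer="adam", loss="binary_crossentropy")
inner_model(features[:1])  # one forward pass so the weights exist
inner_before = inner_embedding.get_weights()[0].copy()
inner_model.fit(features, target, epochs=2, verbose=0)
inside_changed = not np.array_equal(
    inner_before, inner_embedding.get_weights()[0]
)

print("outside embedding unchanged:", outside_unchanged)
print("inside embedding updated:", inside_changed)
```

In the first case the embedded data is just a fixed array as far as the model is concerned, which is why it suits frozen pretrained vectors.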