Tags: python, machine-learning, nlp, huggingface-transformers, transformer-model

Is it possible to access the Hugging Face transformer embedding layer?


I want to use a pretrained Hugging Face transformer language model as the encoder in a sequence-to-sequence model.

The task is grammatical error correction, so both the input and the output are in the same language.

Therefore I was wondering: is it possible to access the embedding layer of the Hugging Face transformer encoder and use it as the embedding layer for the decoder?

Or maybe there is some other approach that you'd recommend?


Solution

  • Taking BERT as an example:

    If you load the bare BertModel, the embeddings are exposed directly as model.embeddings:

    from transformers import BertModel
    model = BertModel.from_pretrained("bert-base-uncased")
    print(model.embeddings)
    
    # output is
    BertEmbeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    

    If you load BERT wrapped in a task head (e.g., BertForPreTraining or BertForSequenceClassification), the base model sits under the bert attribute, so the embeddings are at model.bert.embeddings:

    from transformers import BertForSequenceClassification
    model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
    print(model.bert.embeddings)
    
    # output is the same BertEmbeddings module shown above
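
  • To reuse this embedding layer in your decoder (the second part of the question), you can fetch it with get_input_embeddings() and tie your decoder's embedding weights to it. A minimal PyTorch sketch, assuming a custom decoder (not shown here) that takes an nn.Embedding:

    import torch.nn as nn
    from transformers import BertModel
    
    encoder = BertModel.from_pretrained("bert-base-uncased")
    
    # The same module as encoder.embeddings.word_embeddings: nn.Embedding(30522, 768)
    shared = encoder.get_input_embeddings()
    
    # Create a decoder-side embedding of the same shape and tie the weights,
    # so encoder and decoder share one table and gradients update both.
    decoder_embedding = nn.Embedding(
        shared.num_embeddings, shared.embedding_dim, padding_idx=0
    )
    decoder_embedding.weight = shared.weight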
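
  • As for other approaches: the transformers library also provides EncoderDecoderModel, which can warm-start a sequence-to-sequence model from a pretrained BERT checkpoint on both the encoder and decoder side (the decoder gets causal masking plus freshly initialized cross-attention layers). That is a common setup for same-language tasks like grammatical error correction. A sketch, assuming bert-base-uncased on both sides:

    from transformers import BertTokenizer, EncoderDecoderModel
    
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    
    # Encoder and decoder are both initialized from the same checkpoint.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
        "bert-base-uncased", "bert-base-uncased"
    )
    
    # Generation needs to know the start and padding tokens.
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.pad_token_id = tokenizer.pad_token_id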