Tags: python, machine-learning, nlp, huggingface-transformers, transformer-model

Is it possible to access the Hugging Face transformer embedding layer?


I want to use a pretrained Hugging Face transformer language model as the encoder in a sequence-to-sequence model.

The task is grammatical error correction, so both the input and the output are in the same language.

Therefore I was wondering: is it possible to access the embedding layer of the Hugging Face transformer encoder and use it as the embedding layer for the decoder?

Or maybe there is some other approach that you'd recommend?


Solution

  • Taking BERT as an example:

    If you load the bare BertModel, the embeddings are exposed directly as model.embeddings:

    from transformers import BertModel
    model = BertModel.from_pretrained("bert-base-uncased")
    print(model.embeddings)
    
    # output is
    BertEmbeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    

    If you load BERT wrapped in a task head (e.g., BertForPreTraining or BertForSequenceClassification), the base model sits under the bert attribute, so the embeddings are at model.bert.embeddings:

    from transformers import BertForSequenceClassification
    model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
    print(model.bert.embeddings)
    
    # output is the same BertEmbeddings module shown above
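
  • To reuse this embedding layer in your decoder (the second part of the question), you can fetch it with get_input_embeddings() and tie your decoder's embedding weights to it. A minimal PyTorch sketch, assuming a custom decoder (not shown here) that takes an nn.Embedding:

    import torch.nn as nn
    from transformers import BertModel
    
    encoder = BertModel.from_pretrained("bert-base-uncased")
    
    # The same module as encoder.embeddings.word_embeddings: nn.Embedding(30522, 768)
    shared = encoder.get_input_embeddings()
    
    # Create a decoder-side embedding of the same shape and tie the weights,
    # so encoder and decoder share one table and gradients update both.
    decoder_embedding = nn.Embedding(
        shared.num_embeddings, shared.embedding_dim, padding_idx=0
    )
    decoder_embedding.weight = shared.weight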
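
  • As for other approaches: the transformers library also provides EncoderDecoderModel, which can warm-start a sequence-to-sequence model from a pretrained BERT checkpoint on both the encoder and decoder side (the decoder gets causal masking plus freshly initialized cross-attention layers). That is a common setup for same-language tasks like grammatical error correction. A sketch, assuming bert-base-uncased on both sides:

    from transformers import BertTokenizer, EncoderDecoderModel
    
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    
    # Encoder and decoder are both initialized from the same checkpoint.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
        "bert-base-uncased", "bert-base-uncased"
    )
    
    # Generation needs to know the start and padding tokens.
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.pad_token_id = tokenizer.pad_token_id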