I am trying to access the output embeddings from several different layers of the pretrained "DistilBERT" model. ("distilbert-base-uncased")
bert_output = model(input_ids, attention_mask=attention_mask)
The bert_output seems to contain only the last layer's hidden states for the input tokens.
If you want the outputs of all the hidden layers, you need to pass the output_hidden_states=True
kwarg to your config.
Your code will look something like
from transformers import DistilBertModel, DistilBertConfig
config = DistilBertConfig.from_pretrained('distilbert-base-uncased', output_hidden_states=True)
model = DistilBertModel.from_pretrained('distilbert-base-uncased', config=config)
The hidden states will be made available as bert_output[1] (unlike BERT, DistilBERT has no pooler output, so the output tuple is (last_hidden_state, hidden_states)). More robustly, you can access them by name as bert_output.hidden_states.