How to get the sentence embeddings with DeBERTa.deberta.pooling?
Hi everyone, I'm using a DeBERTa model to analyze sentences; this is what my code looks like:
from transformers import DebertaTokenizer, DebertaModel
import torch
# downloading the models
tokenizer = DebertaTokenizer.from_pretrained("microsoft/deberta-base")
model = DebertaModel.from_pretrained("microsoft/deberta-base")
# tokenizing the input text and converting it into pytorch tensors
inputs = tokenizer(["The cat cought the mouse", "This is the second sentence"], return_tensors="pt", padding=True)
# pass through the model
outputs = model(**inputs)
I realize that one option to get sentence embeddings is to take the hidden state of the first ([CLS]) token:
outputs.last_hidden_state[:,0,:]
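For the batch above, this yields one vector per sentence:

cls_embeddings = outputs.last_hidden_state[:, 0, :]
print(cls_embeddings.shape)  # torch.Size([2, 768]) for deberta-base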
However, I would prefer to get the pooled output. As far as I can tell, DebertaModel does not return a pooler_output, but there seems to be a pooling implementation in the original DeBERTa package named DeBERTa.deberta.pooling (see https://deberta.readthedocs.io/en/latest/_modules/DeBERTa/deberta/pooling.html). Does anyone know how to use it?
First, you need to import the pooler for DeBERTa; then it is easiest to wrap the backbone and the pooler in a separate class:
from transformers.models.deberta.modeling_deberta import ContextPooler, StableDropout
from transformers import DebertaTokenizer, DebertaModel
import torch
import torch.nn as nn

tokenizer = DebertaTokenizer.from_pretrained("microsoft/deberta-base")
backbone = DebertaModel.from_pretrained("microsoft/deberta-base")

class CustomModel(nn.Module):  # DeBERTa backbone + ContextPooler
    def __init__(self, backbone):
        super().__init__()
        self.model = backbone
        self.config = self.model.config
        self.pooler = ContextPooler(self.config)
        # For classification you could add a head on top of the pooler,
        # as DebertaForSequenceClassification does:
        # output_dim = self.pooler.output_dim
        # self.classifier = nn.Linear(output_dim, num_classes)
        # drop_out = getattr(self.config, "cls_dropout", None)
        # drop_out = self.config.hidden_dropout_prob if drop_out is None else drop_out
        # self.dropout = StableDropout(drop_out)

    def forward(self, input_ids, attention_mask=None):
        outputs = self.model(input_ids=input_ids, attention_mask=attention_mask)
        encoder_layer = outputs[0]                  # last hidden state, (batch, seq_len, hidden)
        pooled_output = self.pooler(encoder_layer)  # pools the first token's hidden state
        return pooled_output

model = CustomModel(backbone)
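You can then get pooled sentence embeddings, for example for the two sentences from the question (a minimal sketch; torch.no_grad() because this is inference only):

inputs = tokenizer(["The cat caught the mouse", "This is the second sentence"],
                   return_tensors="pt", padding=True)
with torch.no_grad():
    embeddings = model(inputs["input_ids"], inputs["attention_mask"])
print(embeddings.shape)  # torch.Size([2, 768]) for deberta-base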
The details depend on your task. The exact ContextPooler implementation can be found in the transformers repository: https://github.com/huggingface/transformers/blob/v4.32.0/src/transformers/models/deberta/modeling_deberta.py#L66. I hope it helps.
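For reference, ContextPooler boils down to roughly the following (a paraphrase, not the exact transformers code, so check the linked source for the authoritative version; the name SimplePooler is made up here):

import torch
import torch.nn as nn

class SimplePooler(nn.Module):
    """Sketch of what DeBERTa's ContextPooler does: take the first
    ([CLS]) token's hidden state, apply dropout, a dense layer,
    and an activation."""
    def __init__(self, hidden_size, dropout=0.0):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.dropout = nn.Dropout(dropout)  # transformers uses StableDropout here

    def forward(self, hidden_states):
        context_token = hidden_states[:, 0]   # (batch, hidden)
        pooled = self.dense(self.dropout(context_token))
        return torch.tanh(pooled)             # transformers uses the configured pooler_hidden_act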