Tags: python, nlp, spyder, bert-language-model

How to get the sentence embeddings with DeBERTa.deberta.pooling?

Hi everyone, I'm applying a DeBERTa model to analyze sentences. This is what my code looks like:

from transformers import DebertaTokenizer, DebertaModel
import torch
# downloading the models
tokenizer = DebertaTokenizer.from_pretrained("microsoft/deberta-base")
model = DebertaModel.from_pretrained("microsoft/deberta-base")
# tokenizing the input text and converting it into pytorch tensors
inputs = tokenizer(["The cat caught the mouse", "This is the second sentence"], return_tensors="pt", padding=True)
# pass through the model 
outputs = model(**inputs)
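For reference, outputs here is a BaseModelOutput with no pooler_output; its last_hidden_state holds one vector per token (hidden size 768 for deberta-base):

# One 768-dim vector per token for each of the two sentences
print(outputs.last_hidden_state.shape)  # torch.Size([2, seq_len, 768])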

I realize that one option to get sentence embeddings is to take the hidden state of the [CLS] token using

outputs.last_hidden_state[:,0,:]

However, I would prefer to get the pooled output. As far as I can tell, pooled_output is not supported for DeBERTa, but there seems to be a pooling implementation in the DeBERTa repository named DeBERTa.deberta.pooling (see https://deberta.readthedocs.io/en/latest/_modules/DeBERTa/deberta/pooling.html). Does anyone know how to use it?


Solution

  • First, import the pooler for DeBERTa; then it is easiest to wrap the backbone in a separate class, as shown below.

    from transformers.models.deberta.modeling_deberta import ContextPooler
    from transformers.models.deberta.modeling_deberta import StableDropout
    from transformers import DebertaTokenizer, DebertaModel
    import torch
    import torch.nn as nn
    
    tokenizer = DebertaTokenizer.from_pretrained("microsoft/deberta-base")
    model = DebertaModel.from_pretrained("microsoft/deberta-base")
    
    class CustomModel(nn.Module):  # DeBERTa backbone + context pooler
        def __init__(self, backbone):
            super().__init__()
            self.model = backbone
            self.config = self.model.config
            self.pooler = ContextPooler(self.config)
            # Optional classification head on top of the pooled output:
            # output_dim = self.pooler.output_dim
            # self.classifier = nn.Linear(output_dim, num_classes)
            # drop_out = getattr(self.config, "cls_dropout", None)
            # drop_out = self.config.hidden_dropout_prob if drop_out is None else drop_out
            # self.dropout = StableDropout(drop_out)
    
        def forward(self, **inputs):
            outputs = self.model(**inputs)
            encoder_layer = outputs[0]                  # last hidden states, (batch, seq_len, hidden)
            pooled_output = self.pooler(encoder_layer)  # (batch, hidden)
            return pooled_output
    
    model = CustomModel(model)
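
    A quick usage sketch with the sentences from the question (the 768 in the shape comment assumes deberta-base):

    inputs = tokenizer(["The cat caught the mouse", "This is the second sentence"],
                       return_tensors="pt", padding=True)
    model.eval()  # disable dropout for inference
    with torch.no_grad():
        embeddings = model(**inputs)  # pooled sentence embeddings, torch.Size([2, 768])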
    

    The details depend on your task; a more complete implementation can be found in the transformers source: https://github.com/huggingface/transformers/blob/v4.32.0/src/transformers/models/deberta/modeling_deberta.py#L66 I hope it helps.
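
    For reference, ContextPooler itself is tiny. A rough paraphrase of its forward pass, based on the v4.32.0 source linked above (not a drop-in copy):

    # ContextPooler pools by taking the first ([CLS]) token, not by averaging:
    context_token = hidden_states[:, 0]          # first-token hidden state
    context_token = self.dropout(context_token)  # StableDropout(config.pooler_dropout)
    pooled_output = self.dense(context_token)    # Linear(pooler_hidden_size, pooler_hidden_size)
    pooled_output = ACT2FN[self.config.pooler_hidden_act](pooled_output)  # "gelu" by default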