python nlp huggingface-transformers sentiment-analysis

Aspect sentiment analysis using Hugging Face


I am new to transformer models and am trying to extract the aspects and their sentiment from a sentence, but I am having issues.

from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "yangheng/deberta-v3-base-absa-v1.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
text = "The food was great but the service was terrible."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)


I am able to get the output tensor. What I need is a way to extract the aspect and sentiment for the overall sentence from it.

I tried this, but I get an error:

sentiment_scores = outputs.logits.softmax(dim=1)
aspect_scores = sentiment_scores[:, 1:-1]

aspects = [tokenizer.decode([x]) for x in inputs["input_ids"].squeeze()][1:-1]
sentiments = ['Positive' if score > 0.5 else 'Negative' for score in aspect_scores.squeeze()]

for aspect, sentiment in zip(aspects, sentiments):
    print(f"{aspect}: {sentiment}")

I am looking for the output below, or something similar.

I am unable to work out the logic for extracting the aspect and sentiment.

text - The food was great but the service was terrible

aspect - food, sentiment positive
aspect - service, sentiment negative


or, at the overall level:

aspect - food, sentiment positive


Solution

  • The model you are using predicts the sentiment for a given aspect of a text, so it needs both the text and the aspect as input. It was not trained to extract aspects from a text. You could use a keyword extraction model to extract the aspects first (compare this SO answer).

    import torch
    import torch.nn.functional as F
    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    
    model_name = "yangheng/deberta-v3-base-absa-v1.1"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    
    aspects = ["food", "service"]
    text = "The food was great but the service was terrible."
    sentiment_aspect = {}
    for aspect in aspects:
        # The model expects the sentence and the aspect as a text pair.
        inputs = tokenizer(text, aspect, return_tensors="pt")

        with torch.inference_mode():
            outputs = model(**inputs)

        # One row of logits per input, with one entry per sentiment class.
        scores = F.softmax(outputs.logits[0], dim=-1)
        label_id = torch.argmax(scores).item()
        sentiment_aspect[aspect] = (model.config.id2label[label_id], scores[label_id].item())
    
    print(sentiment_aspect)
    

    Output:

    {'food': ('Positive', 0.9973154664039612), 'service': ('Negative', 0.9935430288314819)}
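
    A note on why the attempt in the question fails: a sequence-classification head returns one row of logits per input, with one entry per sentiment class, not one score per token. Slicing that row and zipping it against the decoded tokens therefore cannot line up. The sketch below walks through the post-processing step in plain Python, without loading the model, and prints the result in the layout asked for in the question. The logit values are made up for illustration, the helper names (`softmax`, `pick_label`, `format_results`) are hypothetical, and the label order (Negative, Neutral, Positive) is an assumption; check `model.config.id2label` for the actual mapping.

```python
import math

# Assumed label order; verify against model.config.id2label.
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Convert a list of raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_label(logits, id2label=ID2LABEL):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def format_results(text, results):
    """Render {aspect: (label, score)} in the layout asked for in the question."""
    lines = [f"text - {text}"]
    for aspect, (label, _score) in results.items():
        lines.append(f"aspect - {aspect}, sentiment {label.lower()}")
    return "\n".join(lines)


# Illustrative logits only: one row of three class scores per (text, aspect) pair.
fake_logits = {"food": [-3.0, -2.0, 4.0], "service": [4.0, -2.0, -3.0]}
results = {aspect: pick_label(row) for aspect, row in fake_logits.items()}
report = format_results("The food was great but the service was terrible.", results)
print(report)
```

    With the real model, `results` would be the `sentiment_aspect` dictionary built in the loop above, so `format_results(text, sentiment_aspect)` would give the per-aspect lines directly.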