python matplotlib gradient bert-language-model

What do you need for plotting the outcome of a question-answering model

I have been working on a question-answering model that uses BERT to extract the answers to my questions from a context. I would like to plot something like this:

The problem is that I don't know how, and I am really stuck. I don't know how to represent a part of the context in a plot. I have two variables, answer_start and answer_end, which indicate the part of the context the model took its answer from. Can someone please help me out and tell me which variables I need to pass to pyplot?

My code is below:

from transformers import AutoTokenizer, AutoModelForQuestionAnswering
import torch
import numpy as np
import pandas as pd

max_seq_length = 512

tokenizer = AutoTokenizer.from_pretrained("henryk/bert-base-multilingual-cased-finetuned-dutch-squad2")
model = AutoModelForQuestionAnswering.from_pretrained("henryk/bert-base-multilingual-cased-finetuned-dutch-squad2")

questions = [
    "Welke soorten gladiatoren waren er?",
    "Wat is een provocator?"
]
for question in questions:  # for each question, iterate over all lines
    print(f"Question: {question}")
    f = open("test.txt", "r")
    for line in f:
      text = str(line)  # the text must be a string
      # encode the question and the text by tokenizing them
      inputs = tokenizer.encode_plus(question,
                                     text,
                                     add_special_tokens=True,
                                     max_length=max_seq_length,
                                     truncation=True,
                                     return_tensors="pt")
      input_ids = inputs["input_ids"].tolist()[0]

  

      # still need to look up what the ** does
      answer_start_scores, answer_end_scores = model(**inputs, return_dict=False)

      answer_start = torch.argmax(
          answer_start_scores
          )  # position of the token with the highest start score
      answer_end = torch.argmax(
          answer_end_scores) + 1  # same, but for the end token
      answer = tokenizer.convert_tokens_to_string(
          tokenizer.convert_ids_to_tokens(input_ids[answer_start:answer_end]))

      # skip the answers [CLS] and empty strings
      if answer == '[CLS]':
        continue
      elif answer == '':
        continue
      else:
        print(f"Answer: {answer}")
        print(f"Answer start: {answer_start}")
        print(f"Answer end: {answer_end}") 
      f.seek(0)
      break          
    # f.seek(0)
    # break
  
f.close()

And the output:

> Question: Welke soorten gladiatoren waren er?
> Answer: de thraex, de retiarius en de murmillo
> Answer start: 24
> Answer end: 37
> Question: Wat is een provocator?
> Answer: telemachus
> Answer start: 87
> Answer end: 90

Solution

I'm not sure I fully understand your problem, but to make a plot similar to the one in the figure, I would do something like this:

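import numpy as np
import matplotlib.pyplot as plt

# One entry per word; without the commas Python joins the strings into one.
sentence = ['list', 'of', 'words', 'that', 'make', 'up', 'the', 'sentence',
            'in', 'which', 'the', 'answer', 'is', 'found']
x_pos = np.arange(len(sentence))
# Placeholder values, one per word (must be the same length as sentence).
probability = [0.1, 0.2, 0.1, 0.8, 0.6, 0.1, 0.1, 0.2, 0.1, 0.1, 0.3, 0.2, 0.1, 0.1]

plt.bar(x_pos, probability, align='center', alpha=0.5)
plt.xticks(x_pos, sentence, rotation=45)  # rotate labels so they do not overlap
plt.ylabel('Answer probability')
plt.title('Words of the sentence')
plt.tight_layout()

plt.show()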

So, assuming that the answer lies within a larger sentence or paragraph, I would put all the words of that sentence or paragraph on the x axis of a bar plot (the variable sentence, presumably the contents of test.txt) and, on the y axis, the probability that each word is the first or the last word of the answer (the variable probability). The two variables sentence and probability must have the same length: the first entry of sentence corresponds to the first value of probability, and so on.

For instance, answer_start and answer_end are the positions with the highest start and end scores, so the bars at those positions will be the tallest in the bar plot (the highest values in probability).
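To make the answer stand out in such a plot, one option is to give the bars between answer_start and answer_end a different color. The following is only a sketch: the tokens and scores are made-up stand-ins for the tokenizer and model output, and span_colors is a hypothetical helper, not part of matplotlib or transformers.

```python
import numpy as np
import matplotlib.pyplot as plt


def span_colors(n_tokens, answer_start, answer_end,
                inside="tab:orange", outside="tab:blue"):
    """Return one color per token; positions in [answer_start, answer_end) get `inside`."""
    return [inside if answer_start <= i < answer_end else outside
            for i in range(n_tokens)]


# Hypothetical tokens and scores, stand-ins for the real model output.
tokens = ["er", "waren", "de", "thraex", "en", "de", "murmillo"]
scores = np.array([0.02, 0.03, 0.40, 0.20, 0.05, 0.10, 0.20])
answer_start, answer_end = 2, 7  # end is exclusive, as in the question's code

colors = span_colors(len(tokens), answer_start, answer_end)

fig, ax = plt.subplots(figsize=(8, 3))
positions = np.arange(len(tokens))
ax.bar(positions, scores, color=colors)  # one color per bar
ax.set_xticks(positions)
ax.set_xticklabels(tokens, rotation=45, ha="right")
ax.set_ylabel("Score")
fig.tight_layout()
```

Call plt.show() or fig.savefig(...) afterwards to see the result. With the real model, answer_start and answer_end are tensors, so int(answer_start) and int(answer_end) would be needed before comparing them with i.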

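Finally, answer_start_scores and answer_end_scores contain one score per token, indicating how likely that token is to be the start or the end of the answer. These are raw scores (logits) rather than probabilities, so apply a softmax if you want values between 0 and 1 that sum to 1.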
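A minimal sketch of turning such scores into probabilities with a softmax, using NumPy and made-up numbers in place of the model output (start_logits is a hypothetical stand-in for one row of answer_start_scores):

```python
import numpy as np


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


# Hypothetical start scores for four tokens.
start_logits = np.array([-1.0, 0.5, 3.0, 0.0])
start_probs = softmax(start_logits)

# The most likely start position is unchanged by the softmax.
best_start = int(np.argmax(start_probs))
```

With the tensors from the question, something like torch.softmax(answer_start_scores, dim=-1)[0].detach().numpy() should give the same kind of array, assuming the scores have shape (1, sequence_length) as they do for a single input.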

EDIT: You could also make two separate bar plots, one for the first word of the answer and one for the last, and then combine them by adding the two probabilities for each word.
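The idea from the edit could look roughly like this. It is a sketch under assumptions: plot_start_end is a hypothetical helper, the tokens and probabilities are placeholders, and the third panel simply adds the two values per token, so it can exceed 1.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_start_end(tokens, start_probs, end_probs):
    """Draw start, end and summed probabilities as three stacked bar plots."""
    start_probs = np.asarray(start_probs, dtype=float)
    end_probs = np.asarray(end_probs, dtype=float)
    if not (len(tokens) == len(start_probs) == len(end_probs)):
        raise ValueError("tokens, start_probs and end_probs must have the same length")

    combined = start_probs + end_probs  # per-token sum of both probabilities
    positions = np.arange(len(tokens))

    fig, axes = plt.subplots(3, 1, sharex=True, figsize=(8, 6))
    panels = zip(axes,
                 (start_probs, end_probs, combined),
                 ("Start", "End", "Start + end"))
    for ax, values, label in panels:
        ax.bar(positions, values, align="center", alpha=0.5)
        ax.set_ylabel(label)

    # Token labels only on the bottom panel, since the x axis is shared.
    axes[-1].set_xticks(positions)
    axes[-1].set_xticklabels(tokens, rotation=45, ha="right")
    fig.tight_layout()
    return fig, axes, combined


# Placeholder data, not real model output.
tokens = ["de", "thraex", "en", "de", "murmillo"]
start_probs = [0.7, 0.1, 0.05, 0.1, 0.05]
end_probs = [0.05, 0.1, 0.05, 0.1, 0.7]

fig, axes, combined = plot_start_end(tokens, start_probs, end_probs)
```

Call plt.show() afterwards to display the figure. Keeping the three panels on a shared x axis makes it easy to see where the start and end peaks fall relative to each other.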