
Does Hugging Face's model.generate for FLAN-T5 default to summarization?


Given the following code, why does model.generate() return a summary? Where is it told to summarize rather than perform some other task, and where can I find the documentation for that?

from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = 'google/flan-t5-base'
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
dataset_name = 'knkarthick/dialogsum'
dataset = load_dataset(dataset_name)

# example_indices is assumed to be defined elsewhere,
# e.g. a list of indices into the test split
for i in example_indices:
    dialogue = dataset['test'][i]['dialogue']
    inputs = tokenizer(dialogue, return_tensors='pt')

    ground_truth = dataset['test'][i]['summary']

    model_summary = model.generate(inputs['input_ids'], max_new_tokens=50)
    summary = tokenizer.decode(model_summary[0], skip_special_tokens=True)
    print(summary)

Solution

  • Well, it's all in the dataset:

    dataset_name = 'knkarthick/dialogsum'
    

    DialogSum: A Real-life Scenario Dialogue Summarization Dataset
    DialogSum is a large-scale dialogue summarization dataset, consisting of 13,460 dialogues with corresponding manually labeled summaries and topics.

    Transformer-based models like T5, which you are using here, are not explicitly told what to do at inference time. They learn to map an input sequence to an output sequence. During training, the model was repeatedly exposed to a certain pattern (input: dialogue, output: summary). When you provide a similar input at inference time, it is likely to produce a similar output.

    So to summarize (no pun intended): this isn't default behaviour of model.generate. It simply reflects how your training dataset was constructed and used.
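    Since FLAN-T5 is instruction-tuned, you can also make the task explicit in the input itself instead of relying on the input resembling the training data. A minimal sketch of building such prompts before tokenization; the exact template wording below is an illustration, not an official format:

    ```python
    def build_prompt(dialogue: str, task: str = 'summarize') -> str:
        """Wrap a dialogue in an explicit instruction for an
        instruction-tuned model such as FLAN-T5."""
        templates = {
            'summarize': 'Summarize the following conversation.\n\n{d}\n\nSummary:',
            'topic': 'What is the topic of the following conversation?\n\n{d}\n\nTopic:',
        }
        return templates[task].format(d=dialogue)

    dialogue = '#Person1#: Hi, how are you?\n#Person2#: Fine, thanks.'
    prompt = build_prompt(dialogue)
    # The prompt is then tokenized and generated from as before:
    # input_ids = tokenizer(prompt, return_tensors='pt').input_ids
    # model.generate(input_ids, max_new_tokens=50)
    ```

    With an explicit instruction like this, the same model and the same generate() call can perform different tasks; without one, the model falls back on whichever output pattern the bare input most resembles.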