Tags: nlp, training-data, huggingface-transformers, summarization

How to view the changes in a Hugging Face model after training?


I fine-tuned a BART model (facebook-cnn) for summarization and compared its summaries with those of the pretrained model:

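from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainer

# Untouched copy of the pretrained weights, kept for comparison
model_before_tuning_1 = AutoModelForSeq2SeqLM.from_pretrained(model_name)
# Separate instance to fine-tune (assumed to be loaded the same way; not shown originally)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_data,
    eval_dataset=validation_data,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()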

The summaries from model and model_before_tuning_1 are different, but when I compare the model configs and/or print(model), the output is exactly the same for both.

How can I find out exactly which parameters the training has changed?


Solution

  • You can compare the state_dict of the two models, i.e. model.state_dict() and model_before_tuning_1.state_dict().

    The state_dict contains the learnable parameters (weights and biases) that change during training, along with any persistent buffers. For further details see: https://pytorch.org/tutorials/recipes/recipes/what_is_state_dict.html

    Printing the models or their configs, on the other hand, gives the same results because the architecture does not change during training.
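A minimal sketch of such a state_dict comparison is shown below. To stay self-contained it uses a tiny stand-in network instead of BART, and the helper name changed_parameters is hypothetical, not part of PyTorch or transformers. It assumes both state_dicts have the same keys and tensor shapes, which holds when both models come from the same checkpoint.

```python
import copy

import torch
from torch import nn


def changed_parameters(before, after, atol=0.0):
    """Map each tensor name to its max absolute difference, keeping only
    entries that differ by more than atol. Hypothetical helper; assumes
    both state_dicts share the same keys and shapes."""
    changed = {}
    for name, old in before.items():
        new = after[name]
        diff = (new.float() - old.float()).abs().max().item()
        if diff > atol:
            changed[name] = diff
    return changed


torch.manual_seed(0)

# Tiny stand-in for the real model (assumption: any nn.Module works the same way)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
# Independent snapshot taken before training
model_before = copy.deepcopy(model)

# Freeze the first layer so that only the last layer can change
for p in model[0].parameters():
    p.requires_grad = False

# One SGD step on random data stands in for trainer.train()
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.1
)
x, y = torch.randn(16, 4), torch.randn(16, 2)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
optimizer.step()

diffs = changed_parameters(model_before.state_dict(), model.state_dict())
for name, max_diff in diffs.items():
    print(f"{name}: max abs change = {max_diff:.6f}")
```

With the models from the question, the call would be changed_parameters(model_before_tuning_1.state_dict(), model.state_dict()). Tensor names missing from the result were left untouched by training, for example because they were frozen.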