I am trying to change the number of layers in a pretrained Hugging Face Pegasus model to see whether its performance improves or not. I tried updating the config, but it raises an index out of range error.
Following is the code I tried:
from transformers import PegasusConfig

# custom architecture: 14 layers instead of the default, longer positions
config = PegasusConfig(
    encoder_layers=14,
    encoder_attention_heads=16,
    decoder_layers=14,
    decoder_attention_heads=16,
    max_position_embeddings=2048,
)

from transformers import pipeline, PegasusTokenizer, PegasusForConditionalGeneration

model = PegasusForConditionalGeneration.from_pretrained(
    'google/pegasus-pubmed', config=config, ignore_mismatched_sizes=True
)
From what I understand, you are trying to use a pretrained model from Hugging Face for inference. That checkpoint has a fixed architecture: the pretrained model you use (google/pegasus-pubmed) has 16 encoder layers and 16 decoder layers by default.
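You can check this from the checkpoint's configuration, for example (assuming the same checkpoint name as in your snippet):

from transformers import PegasusConfig

cfg = PegasusConfig.from_pretrained('google/pegasus-pubmed')
print(cfg.encoder_layers, cfg.decoder_layers)  # 16 16 for this checkpoint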
If you want to use the internal states of the model (which is equivalent to ignoring some of the final layers), you can load the pretrained model unchanged with

model = PegasusForConditionalGeneration.from_pretrained('google/pegasus-pubmed')

then pass output_hidden_states=True in the inference call and take the embeddings from any internal layer. But you cannot bypass only intermediate layers, since the following layers depend on them.
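A minimal sketch of that approach (the input text and the choice of layer 8 are only for illustration):

from transformers import PegasusTokenizer, PegasusForConditionalGeneration

# load the pretrained checkpoint unchanged (16 encoder / 16 decoder layers)
tokenizer = PegasusTokenizer.from_pretrained('google/pegasus-pubmed')
model = PegasusForConditionalGeneration.from_pretrained('google/pegasus-pubmed')

inputs = tokenizer("Some biomedical abstract to summarize.", return_tensors="pt")

# run only the encoder and ask for every layer's hidden state
encoder_outputs = model.get_encoder()(**inputs, output_hidden_states=True)

# hidden_states is a tuple: the embedding output plus one tensor per encoder layer
print(len(encoder_outputs.hidden_states))        # 17 for a 16-layer encoder
layer8_state = encoder_outputs.hidden_states[8]  # representation after layer 8
print(layer8_state.shape)                        # (batch, seq_len, hidden_size)

The same flag works on the full model call (you then get encoder_hidden_states and decoder_hidden_states), but note that every layer is still computed; you only choose which output to read.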
If you want to change the structure of the network by changing the number of layers, you cannot simply load the pretrained checkpoint, since any layers you add would have random weights. You would have to create your new model from the config and train it on data you have access to. You can initialize some of its layers with weights from the pretrained model, but you won't get good results before training, since the rest of the model still has completely random weights.
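A rough sketch of that warm-start idea (the attribute paths such as model.model.encoder.layers follow the transformers Pegasus implementation, and the 14 layers just mirror your config; treat it as a starting point, not a tested recipe):

import copy
import torch
from transformers import PegasusForConditionalGeneration

# start from the pretrained checkpoint so hidden sizes and vocabulary stay compatible
pretrained = PegasusForConditionalGeneration.from_pretrained('google/pegasus-pubmed')

# build a config with a different depth and create a randomly initialized model from it
new_config = copy.deepcopy(pretrained.config)
new_config.encoder_layers = 14
new_config.decoder_layers = 14
new_model = PegasusForConditionalGeneration(new_config)

# copy whatever lines up with the pretrained weights
with torch.no_grad():
    # the shared token embeddings have identical shapes, so they copy directly
    new_model.model.shared.weight.copy_(pretrained.model.shared.weight)
    # copy as many encoder/decoder layers as both models have in common
    for i in range(min(new_config.encoder_layers, pretrained.config.encoder_layers)):
        new_model.model.encoder.layers[i].load_state_dict(
            pretrained.model.encoder.layers[i].state_dict())
    for i in range(min(new_config.decoder_layers, pretrained.config.decoder_layers)):
        new_model.model.decoder.layers[i].load_state_dict(
            pretrained.model.decoder.layers[i].state_dict())

# everything not copied (extra layers, final layer norms, ...) is still random,
# so new_model needs fine-tuning before its performance is meaningful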