deep-learning, nlp, pytorch, huggingface-transformers

How to use architecture of T5 without pretrained model (Hugging face)


I would like to study the effect of pre-training, so I want to test the T5 model both with and without pre-trained weights. Using pre-trained weights is straightforward, but I cannot figure out how to use the T5 architecture from Hugging Face without the weights. I am using Hugging Face with PyTorch, but I am open to other solutions.


Solution

  • https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Model

    "Initializing with a config file does not load the weights associated with the model, only the configuration."

    To get the architecture without the weights, create a T5Model directly from the config:

    from transformers import AutoConfig, T5Tokenizer, T5Model

    model_name = "t5-small"
    config = AutoConfig.from_pretrained(model_name)
    tokenizer = T5Tokenizer.from_pretrained(model_name)

    # With pretrained weights:
    model = T5Model.from_pretrained(model_name)

    # Architecture only -- randomly initialized, no pretrained weights:
    model_raw = T5Model(config)
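
You don't even need `from_pretrained` for the config: you can build a `T5Config` by hand and instantiate a randomly initialized model from it, with no download at all. A minimal sketch (the hyperparameter values below are illustrative, not the `t5-small` defaults):

```python
import torch
from transformers import T5Config, T5Model

# Tiny custom configuration -- these sizes are just for illustration.
config = T5Config(
    vocab_size=512,
    d_model=64,
    d_kv=16,
    d_ff=256,
    num_layers=2,
    num_heads=4,
)

# Randomly initialized weights; nothing is fetched from the Hub.
model = T5Model(config)

# Forward pass with dummy token ids (T5 is an encoder-decoder,
# so it needs both encoder and decoder inputs).
input_ids = torch.randint(0, config.vocab_size, (1, 8))
decoder_input_ids = torch.randint(0, config.vocab_size, (1, 8))
out = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)

print(out.last_hidden_state.shape)  # (batch, seq_len, d_model)
```

Because the weights are random, outputs will differ on every fresh initialization; only the shapes and the architecture match the pretrained variant with the same config.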