deep-learning · pytorch · huggingface-transformers · huggingface

How to make a trained Torch model Transformers-compatible?


I have trained and saved a PyTorch model with torch.save. Now I want to load it as a GPT2LMHeadModel. The from_pretrained method looks for a local directory and then for a model on the HuggingFace Hub, neither of which exists in my case. I simply have a serialized PyTorch model and an nn.Module class. How do I integrate them with the Transformers library? Training from scratch is not an option (it takes too long).


Solution

  • Root cause

    from_pretrained can only load from a directory laid out the way save_pretrained writes it (a config file plus the weights), or from a model ID on the HuggingFace Hub. A bare torch.save checkpoint matches neither.

    Solution

    I suggest you use the save_pretrained method. It writes the model's config and weights into a local directory in exactly the layout that .from_pretrained expects.

    import torch
    from transformers import GPT2LMHeadModel, GPT2Config
    
    # Load your serialized model. This assumes torch.save stored the whole
    # nn.Module; your model class must be importable for unpickling, and on
    # recent PyTorch you need weights_only=False to unpickle a full module.
    your_model = torch.load('path/to/your/model.pth', map_location='cpu', weights_only=False)
    state_dict = your_model.state_dict()
    
    # Build a fresh GPT2LMHeadModel with the standard gpt2 config
    config = GPT2Config.from_pretrained('gpt2')
    model = GPT2LMHeadModel(config)
    
    # This only succeeds if your parameter names match GPT2LMHeadModel's;
    # otherwise load_state_dict raises an error listing the mismatched keys
    model.load_state_dict(state_dict)
    
    # Save to a local directory in the Transformers layout (config + weights)
    model.save_pretrained('path/to/save/transformers_model')
    
    # Now you can load the Transformers model using from_pretrained
    loaded_model = GPT2LMHeadModel.from_pretrained('path/to/save/transformers_model')
    
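    One detail to check before running the snippet above: torch.save may have stored either the whole nn.Module or just its state_dict, and the code differs slightly between the two cases. A minimal sketch that handles both (using a hypothetical TinyModel stand-in rather than GPT-2 itself, so it runs without any checkpoint of yours):

    ```python
    import os
    import tempfile
    import torch
    import torch.nn as nn

    # Hypothetical stand-in for your model class; swap in your real nn.Module.
    class TinyModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.linear = nn.Linear(4, 4)

    def to_state_dict(path):
        """Return a state_dict whether the checkpoint holds a module or a dict."""
        # weights_only=False is needed on recent PyTorch to unpickle a full module
        obj = torch.load(path, map_location='cpu', weights_only=False)
        return obj.state_dict() if isinstance(obj, nn.Module) else obj

    # Simulate both checkpoint styles in a temporary directory
    tmp = tempfile.mkdtemp()
    torch.save(TinyModel(), os.path.join(tmp, 'full_module.pth'))
    torch.save(TinyModel().state_dict(), os.path.join(tmp, 'state_dict_only.pth'))

    # Both paths yield a state_dict with the same parameter keys
    sd_a = to_state_dict(os.path.join(tmp, 'full_module.pth'))
    sd_b = to_state_dict(os.path.join(tmp, 'state_dict_only.pth'))
    print(sorted(sd_a) == sorted(sd_b))  # → True
    ```

    If to_state_dict returns a plain dict, just drop the your_model.state_dict() line from the recipe above and pass the loaded dict to load_state_dict directly.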

    Extra:

    Similar discussion on GitHub: https://github.com/huggingface/transformers/issues/7849