Tags: python, machine-learning, computer-vision, huggingface-transformers

Is it possible to load a Hugging Face model that does not have a config.json file?


I am trying to load this semantic segmentation model from HF using the following code:

from transformers import pipeline

model = pipeline("image-segmentation", model="Carve/u2net-universal", device="cpu")

But I get the following error:

OSError: tamnvcc/isnet-general-use does not appear to have a file named config.json. Checkout 'https://huggingface.co/tamnvcc/isnet-general-use/main' for available files.

Is it even possible to load a model from Hugging Face when no config.json file is provided?

I also tried loading the model via:

from transformers import AutoImageProcessor, AutoModelForSemanticSegmentation

id2label = {0: "background", 1: "target"}
label2id = {"background": 0, "target": 1}
image_processor = AutoImageProcessor.from_pretrained("Carve/u2net-universal")
model = AutoModelForSemanticSegmentation.from_pretrained("Carve/u2net-universal", id2label=id2label, label2id=label2id)

But got the same error.


Solution

  • TL;DR

    You will need to make a lot of assumptions if the repository has no config.json and the model card has no documentation.

    After some guessing, this may work:

    from u2net import U2NET
    import torch
    
    model = U2NET()
    
    model.load_state_dict(torch.load('full_weights.pth', map_location=torch.device('cpu')))
    

    Long version

    Looking at the files available in the model card, we see these files:

    • .gitattributes
    • README.md
    • full_weights.pth

    A good guess would be that the .pth file is a PyTorch model binary. Given that, we can try:

    import shutil
    import requests
    
    import torch
    
    
    # Download the .pth file locally
    url = "https://huggingface.co/Carve/u2net-universal/resolve/main/full_weights.pth"
    response = requests.get(url, stream=True)
    response.raise_for_status()  # stop early if the download failed
    with open('full_weights.pth', 'wb') as out_file:
        shutil.copyfileobj(response.raw, out_file)
    
    # Load the file's contents onto the CPU
    model = torch.load('full_weights.pth', map_location=torch.device('cpu'))
    

    But what you end up with is NOT a usable model. It is only the model's parameters/weights (a checkpoint, also called a state dict), i.e.

    type(model)
    

    [out]:

    collections.OrderedDict
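
    The type check above can be turned into a small helper that guesses what a loaded .pth file contains. This is only a sketch: classify_checkpoint is a hypothetical name, the nested "state_dict" key is a common convention rather than something this repository documents, and the sample object is a stand-in for the real result of torch.load.

```python
from collections import OrderedDict
from collections.abc import Mapping


def classify_checkpoint(obj):
    """Guess what a loaded checkpoint object is, based only on its structure."""
    if isinstance(obj, Mapping):
        # Some training scripts nest the weights under a "state_dict" key.
        if "state_dict" in obj and isinstance(obj["state_dict"], Mapping):
            return "wrapped state dict"
        # A plain state dict maps string parameter names to tensors.
        if all(isinstance(key, str) for key in obj):
            return "state dict"
        return "unknown mapping"
    # Anything else may be a fully pickled model object.
    return "other object"


# Stand-in for the object returned by torch.load (values would be tensors).
fake_checkpoint = OrderedDict({"stage1.rebnconvin.conv_s1.weight": [0.0]})
print(classify_checkpoint(fake_checkpoint))
```

    If the result is a plain state dict, as it is here, you still need the Python class that defines the architecture before the weights are of any use.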
    

    The layer names (for example rebnconvin) match the module names used in the https://github.com/xuebinqin/U-2-Net code:

    model.keys()
    

    [out]:

    odict_keys(['stage1.rebnconvin.conv_s1.weight', 'stage1.rebnconvin.conv_s1.bias', 'stage1.rebnconvin.bn_s1.weight', 'stage1.rebnconvin.bn_s1.bias', 'stage1.rebnconvin.bn_s1.running_mean', 'stage1.rebnconvin.bn_s1.running_var', 'stage1.rebnconv1.conv_s1.weight', 'stage1.rebnconv1.conv_s1.bias', 'stage1.rebnconv1.bn_s1.weight', 'stage1.rebnconv1.bn_s1.bias', 'stage1.rebnconv1.bn_s1.running_mean', 'stage1.rebnconv1.bn_s1.running_var', ...])
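
    Grouping the key names by their first component gives a quick picture of the top-level modules stored in the checkpoint, which helps when comparing it against candidate architectures. A minimal sketch, using only a few of the key names shown above; summarize_top_level is a hypothetical helper, not part of PyTorch:

```python
from collections import Counter


def summarize_top_level(keys):
    """Count state-dict entries per top-level module (text before the first dot)."""
    return Counter(key.split(".", 1)[0] for key in keys)


# A few key names copied from the output above; the real checkpoint has many more.
sample_keys = [
    "stage1.rebnconvin.conv_s1.weight",
    "stage1.rebnconvin.conv_s1.bias",
    "stage1.rebnconvin.bn_s1.weight",
    "stage1.rebnconv1.conv_s1.weight",
]
summary = summarize_top_level(sample_keys)
print(summary)
```

    With the real checkpoint you would pass model.keys() instead of sample_keys.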
    

    ASSUMING THAT YOU CAN TRUST THE CODE from that GitHub repository, you can download the model definition with:

    ! wget https://raw.githubusercontent.com/xuebinqin/U-2-Net/master/model/u2net.py
    

    Judging from the layer names and the repository name, the architecture is probably the U2Net from https://arxiv.org/abs/2005.09007v3

    So you can try:

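    import torch
    from u2net import U2NET
    
    # Build the architecture with its default arguments (an assumption)
    model = U2NET()
    
    # Copy the downloaded weights into the model
    model.load_state_dict(torch.load('full_weights.pth', map_location=torch.device('cpu')))
    model.eval()  # switch to inference mode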
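
    Before calling load_state_dict, it can help to compare the checkpoint's key names with the ones the model class expects, since a mismatch is the usual sign that the guessed architecture is wrong. A sketch under assumptions: compare_keys is a hypothetical helper, and stripping a "module." prefix (which torch.nn.DataParallel adds to saved keys) is a precaution that may not be needed for this checkpoint. The key lists below are illustrative stand-ins for U2NET().state_dict().keys() and the keys of the loaded file.

```python
def compare_keys(model_keys, checkpoint_keys, prefix="module."):
    """Return (missing, unexpected) key lists after stripping an optional prefix."""
    cleaned = {
        key[len(prefix):] if key.startswith(prefix) else key
        for key in checkpoint_keys
    }
    expected = set(model_keys)
    missing = sorted(expected - cleaned)      # model wants these, file lacks them
    unexpected = sorted(cleaned - expected)   # file has these, model does not
    return missing, unexpected


# Illustrative stand-ins, not the real key lists.
model_keys = [
    "stage1.rebnconvin.conv_s1.weight",
    "stage1.rebnconvin.conv_s1.bias",
]
checkpoint_keys = [
    "module.stage1.rebnconvin.conv_s1.weight",
    "extra.weight",
]
missing, unexpected = compare_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

    Two empty lists suggest the architecture guess is consistent with the file; anything else means the class or its constructor arguments probably differ from what was used in training.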