python machine-learning computer-vision huggingface-transformers

Is it possible to load a Hugging Face model that does not have a config.json file?

I am trying to load this semantic segmentation model from the Hugging Face Hub using the following code:

from transformers import pipeline

model = pipeline("image-segmentation", model="Carve/u2net-universal", device="cpu")

But I get the following error:

OSError: tamnvcc/isnet-general-use does not appear to have a file named config.json. Checkout 'https://huggingface.co/tamnvcc/isnet-general-use/main' for available files.

Is it even possible to load a model from Hugging Face when no config.json file is provided?

I also tried loading the model via:

from transformers import AutoImageProcessor, AutoModelForSemanticSegmentation

id2label = {0: "background", 1: "target"}
label2id = {"background": 0, "target": 1}
image_processor = AutoImageProcessor.from_pretrained("Carve/u2net-universal")
model = AutoModelForSemanticSegmentation.from_pretrained("Carve/u2net-universal", id2label=id2label, label2id=label2id)

But got the same error.

Solution

TL;DR

You will need to make a lot of assumptions if the repository has no config.json and the model card has no documentation.

After some guessing, this is probably what works:

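# Requires u2net.py from https://github.com/xuebinqin/U-2-Net in the working directory
from u2net import U2NET
import torch

model = U2NET()

# full_weights.pth holds only the weights, so they are loaded into an instantiated model
model.load_state_dict(torch.load('full_weights.pth', map_location=torch.device('cpu')))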

In Long

Looking at the files available in the model repository, we see:

.gitattributes
README.md
full_weights.pth
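
As a rough sketch of why the pipeline call fails, the listing above can be checked against the files that the transformers Auto* classes typically look for. The EXPECTED_FILES tuple and both helper names are illustrative assumptions, not part of any library; list_repo_files is the huggingface_hub call that would fetch a live listing, and it is imported lazily so the rest runs offline.

```python
from typing import Iterable, List

# Files that transformers' Auto* classes usually look for in a repo.
# This tuple is an illustrative assumption, not an exhaustive specification.
EXPECTED_FILES = ("config.json", "preprocessor_config.json")


def missing_expected_files(repo_files: Iterable[str]) -> List[str]:
    """Return the expected config files that are absent from a repo listing."""
    present = set(repo_files)
    return [name for name in EXPECTED_FILES if name not in present]


def list_hub_files(repo_id: str) -> List[str]:
    """List the files in a Hub repo (needs network access and huggingface_hub)."""
    from huggingface_hub import list_repo_files  # lazy import: only needed online
    return list(list_repo_files(repo_id))


# Use the file listing shown above instead of a live network call.
files = [".gitattributes", "README.md", "full_weights.pth"]
print(missing_expected_files(files))
```

With network access, missing_expected_files(list_hub_files("Carve/u2net-universal")) would run the same check against the live repository.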

A good guess would be that the .pth file is a PyTorch model binary. Given that, we can try:

import shutil
import requests

import torch


# Download the .pth file locally
url = "https://huggingface.co/Carve/u2net-universal/resolve/main/full_weights.pth"
response = requests.get(url, stream=True)
response.raise_for_status()  # fail early instead of saving an error page to disk
with open('full_weights.pth', 'wb') as out_file:
    shutil.copyfileobj(response.raw, out_file)

# Load onto the CPU regardless of the device the weights were saved from
model = torch.load('full_weights.pth', map_location=torch.device('cpu'))

But what you end up with is NOT a usable model. It is just the model parameters/weights (a state dict, often called a checkpoint file), i.e.

type(model)

[out]:

collections.OrderedDict

Looking at the layer names, prefixes such as rebnconvin point to the U-2-Net code at https://github.com/xuebinqin/U-2-Net:

model.keys()

[out]:

odict_keys(['stage1.rebnconvin.conv_s1.weight', 'stage1.rebnconvin.conv_s1.bias', 'stage1.rebnconvin.bn_s1.weight', 'stage1.rebnconvin.bn_s1.bias', 'stage1.rebnconvin.bn_s1.running_mean', 'stage1.rebnconvin.bn_s1.running_var', 'stage1.rebnconv1.conv_s1.weight', 'stage1.rebnconv1.conv_s1.bias', 'stage1.rebnconv1.bn_s1.weight', 'stage1.rebnconv1.bn_s1.bias', 'stage1.rebnconv1.bn_s1.running_mean', 'stage1.rebnconv1.bn_s1.running_var', ...])
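
With hundreds of keys, eyeballing the raw list gets tedious. A minimal sketch of one way to summarise it is to count keys by their leading dot-separated components; count_by_prefix is a hypothetical helper written for this answer, and sample_keys is a handful of the keys printed above used as stand-in data.

```python
from collections import Counter
from typing import Dict, Iterable


def count_by_prefix(keys: Iterable[str], depth: int = 1) -> Dict[str, int]:
    """Count state-dict keys by their first `depth` dot-separated components."""
    counts = Counter(".".join(key.split(".")[:depth]) for key in keys)
    return dict(counts)


# A few of the keys printed above, used here as stand-in data.
sample_keys = [
    "stage1.rebnconvin.conv_s1.weight",
    "stage1.rebnconvin.conv_s1.bias",
    "stage1.rebnconvin.bn_s1.weight",
    "stage1.rebnconv1.conv_s1.weight",
    "stage1.rebnconv1.conv_s1.bias",
]

print(count_by_prefix(sample_keys))           # top-level blocks
print(count_by_prefix(sample_keys, depth=2))  # sub-blocks within each stage
```

On the real checkpoint, count_by_prefix(model.keys()) gives the top-level block names, which are usually enough to search for the matching architecture.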

ASSUMING THAT YOU CAN TRUST THE CODE from that GitHub repository, you can download the model definition with:

! wget https://raw.githubusercontent.com/xuebinqin/U-2-Net/master/model/u2net.py

Judging from the layer names and the repository name, the model looks like the U2Net described in https://arxiv.org/abs/2005.09007v3

So you can try:

from u2net import U2NET
import torch

model = U2NET()

model.load_state_dict(torch.load('full_weights.pth', map_location=torch.device('cpu')))
model.eval()  # switch BatchNorm layers to inference mode
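
Since the architecture is a guess, it helps to see how far off the guess is when it fails. By default load_state_dict is strict and raises a RuntimeError on mismatched keys; the sketch below does a similar key comparison by hand so that the differences can be inspected. diff_keys is a hypothetical helper, and the key names in the example are made up for illustration.

```python
from typing import Iterable, List, Tuple


def diff_keys(
    model_keys: Iterable[str], checkpoint_keys: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) keys, relative to the model."""
    model_set, ckpt_set = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_set - ckpt_set)     # model expects these, checkpoint lacks them
    unexpected = sorted(ckpt_set - model_set)  # checkpoint has these, model does not
    return missing, unexpected


# Made-up keys for illustration: the checkpoint lacks one bias and has one extra tensor.
model_keys = ["stage1.conv.weight", "stage1.conv.bias", "side1.weight"]
checkpoint_keys = ["stage1.conv.weight", "side1.weight", "extra.weight"]

missing, unexpected = diff_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

For the real case, pass U2NET().state_dict().keys() and the keys of the loaded .pth file. Two empty lists suggest the architecture guess is right, although tensor shapes can still disagree.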