python tensorflow keras tensorflow2.0 huggingface-transformers

Problem building tensorflow model from huggingface weights


I need to use the pretrained BERT model 'dbmdz/bert-base-italian-xxl-cased' from Hugging Face with TensorFlow.

The model page on the website states:

Currently only PyTorch-Transformers compatible weights are available. If you need access to TensorFlow checkpoints, please raise an issue!

I raised an issue and was promptly given a download link to an archive containing the following files:

$ ls bert-base-italian-xxl-cased/
config.json                    model.ckpt.index               vocab.txt
model.ckpt.data-00000-of-00001 model.ckpt.meta

I'm now trying to load the model and work with it, but everything I have tried has failed.

I tried following this suggestion from a Hugging Face discussion:

from transformers import BertConfig, TFBertModel

# Folder in which I have the files extracted from the archive
bert_folder = str(Config.MODELS_CONFIG.BERT_CHECKPOINT_DIR)
config = BertConfig.from_pretrained(bert_folder)  # this gets loaded correctly

From this point I tried several ways of loading the model, all unsuccessfully. For example:

model = TFBertModel.from_pretrained("../../models/pretrained/bert-base-italian-xxl-cased/model.ckpt.index", config=config)

model = TFBertModel.from_pretrained("../../models/pretrained/bert-base-italian-xxl-cased/model.ckpt.index", config=config, from_pt=True)

model = TFBertModel.from_pretrained("../../models/pretrained/bert-base-italian-xxl-cased", config=config, local_files_only=True)

Each of these results in this error:

404 Client Error: Not Found for url: https://huggingface.co/models/pretrained/bert-base-italian-xxl-cased/model.ckpt.index/resolve/main/tf_model.h5
...
...
OSError: Can't load weights for '../../models/pretrained/bert-base-italian-xxl-cased/model.ckpt.index'. Make sure that:

- '../../models/pretrained/bert-base-italian-xxl-cased/model.ckpt.index' is a correct model identifier listed on 'https://huggingface.co/models'

- or '../../models/pretrained/bert-base-italian-xxl-cased/model.ckpt.index' is the correct path to a directory containing a file named one of tf_model.h5, pytorch_model.bin.
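The message indicates that, for a local path, from_pretrained expects a directory containing tf_model.h5 or pytorch_model.bin, and the extracted archive contains neither. A quick check along these lines makes the mismatch visible before calling from_pretrained. This is only a sketch: the helper name find_loadable_weights is hypothetical, and the two file names are taken from the error message above, so other library versions may accept further formats.

```python
from pathlib import Path

# File names copied from the error message; assumed, not exhaustive.
EXPECTED_WEIGHT_FILES = ("tf_model.h5", "pytorch_model.bin")


def find_loadable_weights(model_dir):
    """Return the expected weight files that actually exist in model_dir."""
    folder = Path(model_dir)
    return [name for name in EXPECTED_WEIGHT_FILES if (folder / name).is_file()]
```

For the folder listed earlier, this would return an empty list, since it only holds the model.ckpt.* files, config.json and vocab.txt.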

So my question is: How can I load this pre-trained BERT model from those files and use it in tensorflow?


Solution

  • You can try the following snippet to load dbmdz/bert-base-italian-xxl-cased in tensorflow.

    from transformers import AutoTokenizer, TFBertModel
    model_name = "dbmdz/bert-base-italian-xxl-cased"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = TFBertModel.from_pretrained(model_name)
    

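    If you want to load from the given TensorFlow checkpoint, you could try the following. Note that from_tf is documented for the PyTorch model classes, so if TFBertModel does not accept it, the checkpoint may need to be loaded through a PyTorch class first: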

    model = TFBertModel.from_pretrained("../../models/pretrained/bert-base-italian-xxl-cased/model.ckpt.index", config=config, from_tf=True)
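In transformers, from_tf=True is an argument of the PyTorch model classes, so one possible route for the checkpoint files from the question is to go through PyTorch: load the TensorFlow 1 checkpoint into BertForPreTraining, save the encoder in PyTorch format, then reload that folder with TFBertModel and from_pt=True. The sketch below assumes that transformers, torch and tensorflow are all installed and that the installed transformers version still ships the BERT TensorFlow checkpoint loader. The function names and the output folder are hypothetical.

```python
from pathlib import Path


def find_index_file(ckpt_dir):
    """Return the path of the model.ckpt.index file inside ckpt_dir."""
    index_file = Path(ckpt_dir) / "model.ckpt.index"
    if not index_file.is_file():
        raise FileNotFoundError(f"no model.ckpt.index in {ckpt_dir}")
    return index_file


def convert_checkpoint_to_tf2(ckpt_dir, out_dir):
    """Convert a TF1 BERT checkpoint to a TFBertModel via PyTorch weights."""
    # Imported lazily so the path helper above works without the heavy dependencies.
    from transformers import BertConfig, BertForPreTraining, TFBertModel

    index_file = find_index_file(ckpt_dir)
    config = BertConfig.from_pretrained(str(ckpt_dir))  # reads config.json

    # Step 1: read the TF1 checkpoint into a PyTorch model.
    pt_model = BertForPreTraining.from_pretrained(
        str(index_file), from_tf=True, config=config
    )

    # Step 2: save only the encoder (a BertModel) together with its config.
    pt_model.bert.save_pretrained(str(out_dir))

    # Step 3: reload the saved PyTorch weights as a TensorFlow 2 model.
    return TFBertModel.from_pretrained(str(out_dir), from_pt=True)
```

A call would look like convert_checkpoint_to_tf2("../../models/pretrained/bert-base-italian-xxl-cased", "../../models/pretrained/bert-base-italian-xxl-cased-tf2"), where the second folder name is made up for illustration. BertForPreTraining is used for the first step because the checkpoint may also contain pretraining-head variables that a bare BertModel has no place for.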