Tags: python, nlp, huggingface-transformers

How to load DeBERTa-v3 properly


I am trying to fine-tune a DeBERTa model for a regression task. The problem is that when I load the model using this code:

from transformers import AutoConfig, AutoTokenizer, AutoModel
## Model Configurations
MODEL_NAME = 'microsoft/deberta-v3-base'

config = AutoConfig.from_pretrained(MODEL_NAME) ## Configuration loaded from AutoConfig
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME) ## Tokenizer loaded from AutoTokenizer

I get a KeyError:

Traceback (most recent call last)
<ipython-input-23-62561c3f4e7b> in <module>
      3 MODEL_NAME = 'microsoft/deberta-v3-base'
      4 
----> 5 config = AutoConfig.from_pretrained(MODEL_NAME) ## Configuration loaded from AutoConfig
      6 tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME) ## Tokenizer loaded from AutoTokenizer

/usr/lib/python3.8/dist-packages/transformers/models/auto/configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
    350 
    351         if "model_type" in config_dict:
--> 352             config_class = CONFIG_MAPPING[config_dict["model_type"]]
    353             return config_class.from_dict(config_dict, **kwargs)
    354         else:

KeyError: 'deberta-v2'

What could be the problem? I am using transformers version 4.31.0.


Solution

  • It seems the V3 config structure is the same as the V2 config:

    DebertaV2Config {
      "_name_or_path": "microsoft/deberta-v3-base",
      "attention_probs_dropout_prob": 0.1,
      "hidden_act": "gelu",
      "hidden_dropout_prob": 0.1,
    ...
    
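    This matches the traceback: the v3 checkpoint's config.json declares "model_type": "deberta-v2", so AutoConfig resolves it to DebertaV2Config. A quick check (assuming network access to the Hub):

    from transformers import AutoConfig
    config = AutoConfig.from_pretrained('microsoft/deberta-v3-base')
    print(type(config).__name__)  # DebertaV2Config
    print(config.model_type)      # deberta-v2
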

    I successfully ran the following code.

    from transformers import AutoModel, AutoConfig, AutoTokenizer
    MODEL_NAME = 'microsoft/deberta-v3-base'
    model = AutoModel.from_pretrained(MODEL_NAME)
    config = AutoConfig.from_pretrained(MODEL_NAME)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    

    Output:

    Downloading (…)lve/main/config.json: 100%
    579/579 [00:00<00:00, 8.75kB/s]
    Downloading pytorch_model.bin: 100%
    371M/371M [00:02<00:00, 152MB/s]
    Downloading (…)okenizer_config.json: 100%
    52.0/52.0 [00:00<00:00, 1.43kB/s]
    Downloading spm.model: 100%
    2.46M/2.46M [00:00<00:00, 42.9MB/s]
    Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
    /usr/local/lib/python3.10/dist-packages/transformers/convert_slow_tokenizer.py:473: UserWarning: The sentencepiece tokenizer that you are converting to a fast tokenizer uses the byte fallback option which is not implemented in the fast tokenizers. In practice this means that the fast version of the tokenizer can produce unknown tokens whereas the sentencepiece version would have converted these unknown tokens into a sequence of byte tokens matching the original piece of text.
      warnings.warn(
    Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
    
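    If the byte fallback warning above is a concern, the slow SentencePiece-based tokenizer can be requested instead. A sketch, noting that use_fast=False is a standard AutoTokenizer argument and requires the sentencepiece package:

    from transformers import AutoTokenizer
    # use_fast=False loads the original SentencePiece tokenizer, which handles
    # byte fallback (requires: pip install sentencepiece).
    tokenizer = AutoTokenizer.from_pretrained('microsoft/deberta-v3-base', use_fast=False)
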

    In Google Colab (or on any Linux OS), the model files are cached under:
    /root/.cache/huggingface/hub/models--microsoft--deberta-v3-base
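
    To confirm the cache location programmatically, the following sketch can help (assuming TRANSFORMERS_CACHE is still exported from transformers.utils, as it is in 4.31):

    import os
    from transformers.utils import TRANSFORMERS_CACHE
    print(TRANSFORMERS_CACHE)              # e.g. /root/.cache/huggingface/hub
    print(os.listdir(TRANSFORMERS_CACHE))  # should include models--microsoft--deberta-v3-base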


    The model can also be loaded through the V2 classes without an error message.
    In this case, the sentencepiece package needs to be installed (imported explicitly below).
    I am not sure whether the model will be fully usable this way.

    import sentencepiece  # required by the slow (SentencePiece-based) tokenizer
    from transformers import DebertaV2Model, DebertaV2Config, DebertaV2Tokenizer

    MODEL_NAME = 'microsoft/deberta-v3-base'
    model = DebertaV2Model.from_pretrained(MODEL_NAME)  # bare encoder, no task head
    config = DebertaV2Config.from_pretrained(MODEL_NAME)
    tokenizer = DebertaV2Tokenizer.from_pretrained(MODEL_NAME)
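
    Since the original goal was fine-tuning for a regression task, here is a minimal sketch of a suitable setup (my addition, not code from the question; num_labels=1 gives the sequence-classification head a single output, and problem_type="regression" selects an MSE loss when float labels are passed):

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    MODEL_NAME = 'microsoft/deberta-v3-base'
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    # num_labels=1 -> a single regression output; problem_type="regression"
    # makes the model compute an MSE loss when float labels are provided.
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=1, problem_type="regression"
    )

    batch = tokenizer("an example sentence", return_tensors="pt")
    out = model(**batch, labels=torch.tensor([0.5]))
    print(out.loss, out.logits)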