
How to set the label names when using the Huggingface TextClassificationPipeline?


I am using a Huggingface model fine-tuned on my company data with the TextClassificationPipeline to make class predictions. The labels that this pipeline predicts default to LABEL_0, LABEL_1 and so on. Is there a way to supply the label mappings to the TextClassificationPipeline object so that the output reflects them?

Env:

  • tensorflow==2.3.1
  • transformers==4.3.2

Sample Code:

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # or any {'0', '1', '2'}

from transformers import TextClassificationPipeline, TFAutoModelForSequenceClassification, AutoTokenizer

MODEL_DIR = r"path\to\my\fine-tuned\model"  # raw string so the backslashes are not treated as escape sequences

# Text classification pipeline
model = TFAutoModelForSequenceClassification.from_pretrained(MODEL_DIR)
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)

pipeline = TextClassificationPipeline(model=model,
                                      tokenizer=tokenizer,
                                      framework='tf',
                                      device=0)

result = pipeline("It was a good watch. But a little boring.")[0]

Output:

In [2]: result
Out[2]: {'label': 'LABEL_1', 'score': 0.8864616751670837}

Solution

  • The simplest way to add such a mapping is to edit the model's config.json to include an id2label field, as below:

    {
      "_name_or_path": "distilbert-base-uncased",
      "activation": "gelu",
      "architectures": [
        "DistilBertForMaskedLM"
      ],
      "id2label": [
        "negative",
        "positive"
      ],
      "attention_dropout": 0.1,
      .
      .
    }
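
    If you would rather not hand-edit the JSON, a sketch of a programmatic alternative (assuming the fine-tuned model directory is writable) is to load the config, set the mapping, and save it back so that config.json is rewritten with the new field:

    from transformers import AutoConfig

    config = AutoConfig.from_pretrained(MODEL_DIR)
    config.id2label = {0: 'negative', 1: 'positive'}   # example labels; use your own class names
    config.label2id = {'negative': 0, 'positive': 1}
    config.save_pretrained(MODEL_DIR)                  # rewrites config.json with the mapping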
    
    

    An in-code way to set this mapping is to pass the id2label param in the from_pretrained call, as below:

    model = TFAutoModelForSequenceClassification.from_pretrained(MODEL_DIR, id2label={0: 'negative', 1: 'positive'})
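
    With either approach, rebuilding the pipeline as in the sample code above should make the label key carry the mapped names; the result below is only illustrative of the expected shape, not an actual run:

    pipeline = TextClassificationPipeline(model=model,
                                          tokenizer=tokenizer,
                                          framework='tf',
                                          device=0)

    result = pipeline("It was a good watch. But a little boring.")[0]
    # result is expected to look like {'label': 'positive', 'score': ...}
    # instead of {'label': 'LABEL_1', 'score': ...}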
    

    Here is the GitHub issue I raised to get this added to the documentation of transformers.XForSequenceClassification.