
How to specify number of target classes for TFRobertaSequenceClassification?


I have a text classification task at hand and I want to use the pre-trained RoBERTa model from the transformers Python library.

As per the documentation of TFRobertaForSequenceClassification, training looks like this:

from transformers import RobertaTokenizer, TFRobertaForSequenceClassification

tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
model = TFRobertaForSequenceClassification.from_pretrained('roberta-base')

model.compile('adam', loss='sparse_categorical_crossentropy')
model.fit(x, y)

So where should I specify the number of target labels for sequence classification?


Solution

  • You can use the num_labels keyword argument, which is forwarded to the model's configuration and sizes the classification head accordingly:

    model = TFRobertaForSequenceClassification.from_pretrained('roberta-base', num_labels=5)
    

    ref: https://huggingface.co/transformers/main_classes/configuration.html
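To see where num_labels ends up, here is a minimal sketch (assuming transformers is installed; the label count 5 is an arbitrary illustration) that inspects a RobertaConfig directly, without downloading any weights:

```python
from transformers import RobertaConfig

# num_labels is a config attribute; from_pretrained('roberta-base',
# num_labels=5) forwards the kwarg to the config in the same way,
# and the model builds its classification head from this value.
config = RobertaConfig(num_labels=5)

print(config.num_labels)  # number of target classes the head will output
```

Passing num_labels at from_pretrained time is the usual route, but you can also build the config first and hand it to the model, which keeps all hyperparameters in one place.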