Tags: python, classification, huggingface-transformers, bert-language-model

Transformers: fine-tune an already fine-tuned model again with a different number of classes


I want to take a BERT-based model that has already been fine-tuned for 7-class classification and fine-tune it again on a 16-class dataset:

import tensorflow as tf
from transformers import TFBertForSequenceClassification

MODEL_NAME_OR_PATH = 'some pretrained model for 7 class classification on huggingface repo'

def build_model(model_name, learning_rate=3e-5):
    # Loads the checkpoint together with its existing 7-class output layer
    model = TFBertForSequenceClassification.from_pretrained(model_name)

    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    metric = tf.keras.metrics.SparseCategoricalAccuracy('accuracy')
    model.compile(optimizer=optimizer, loss=loss, metrics=[metric])

    return model

# LEARNING_RATE, EPOCHS, the datasets and the step counts are defined elsewhere
model = build_model(MODEL_NAME_OR_PATH, learning_rate=LEARNING_RATE)

r = model.fit(
    train_dataset,
    validation_data=valid_dataset,
    steps_per_epoch=train_steps,
    validation_steps=valid_steps,
    epochs=EPOCHS,
    verbose=1)

As expected, the model's final layer still expects the original 7 classes, so training produces the following error:

Node: 'sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits'
Received a label value of 9 which is outside the valid range of [0, 8).  Label values: 6 2 0 6 0 9 6 6 0 6 6 0 7 2 2 2
     [[{{node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]] [Op:__inference_train_function_43224]
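
The message says that one label index is at least as large as the number of logits the final layer produces. A minimal sketch of that check in plain Python, using the label values and the range [0, 8) printed in the error; the helper name `find_out_of_range_labels` is made up for illustration and is not part of TensorFlow:

```python
def find_out_of_range_labels(labels, num_classes):
    """Return the sorted distinct labels that fall outside [0, num_classes)."""
    return sorted({y for y in labels if not 0 <= y < num_classes})


# Label values copied from the error message above
batch = [6, 2, 0, 6, 0, 9, 6, 6, 0, 6, 6, 0, 7, 2, 2, 2]

print(find_out_of_range_labels(batch, 8))   # [9]  -> the label the loss rejects
print(find_out_of_range_labels(batch, 16))  # []   -> fits a 16-class output layer
```

Running a check like this on the labels before calling `model.fit` would surface the mismatch without waiting for the first training step.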

How should one edit the structure of the model?


Solution

  • For future reference: you need to replace the final layer with one sized for the new number of classes (nunits below, 16 in this case). In my case, since I was using TensorFlow:

    model.classifier = tf.keras.layers.Dense(nunits)
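
For context, here is a rough sketch of how that one-line fix might fit into the `build_model` function from the question. The names `NUM_CLASSES`, `build_model_with_new_head` and `load_with_resized_head` are assumptions made for illustration. Updating `config.num_labels` is an addition of this sketch, and the second function shows an alternative that is not part of the answer above: passing `num_labels` together with `ignore_mismatched_sizes=True` to `from_pretrained`. Neither has been run against the original checkpoint.

```python
NUM_CLASSES = 16  # size of the new label set, as stated in the question


def build_model_with_new_head(model_name, num_classes=NUM_CLASSES, learning_rate=3e-5):
    """Load the 7-class checkpoint, swap its output layer, then compile."""
    # Imported inside the function so this sketch can be imported without
    # TensorFlow installed; a real script would import at the top.
    import tensorflow as tf
    from transformers import TFBertForSequenceClassification

    model = TFBertForSequenceClassification.from_pretrained(model_name)

    # The answer's fix: a fresh, randomly initialised Dense layer with one
    # logit per new class replaces the old classification layer.
    model.classifier = tf.keras.layers.Dense(num_classes)
    # Keep the stored config in line with the new layer (an addition of this
    # sketch, not part of the original answer).
    model.config.num_labels = num_classes

    # Compile after the swap, with the same settings as in the question.
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    metric = tf.keras.metrics.SparseCategoricalAccuracy('accuracy')
    model.compile(optimizer=optimizer, loss=loss, metrics=[metric])
    return model


def load_with_resized_head(model_name, num_classes=NUM_CLASSES):
    """Alternative (not from the answer): let from_pretrained size the layer."""
    from transformers import TFBertForSequenceClassification

    return TFBertForSequenceClassification.from_pretrained(
        model_name,
        num_labels=num_classes,
        # Re-initialise weights whose shape differs instead of raising an error
        ignore_mismatched_sizes=True,
    )
```

Either way the new output layer starts from random weights, so only the BERT encoder carries over what was learned during the earlier 7-class fine-tuning.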