python-3.x nlp huggingface-transformers logits

How to interpret the logit scores from a Hugging Face binary classification model and convert them to probability scores

I downloaded the model microsoft/Multilingual-MiniLM-L12-H384 (https://huggingface.co/microsoft/Multilingual-MiniLM-L12-H384/tree/main) and am loading it with BertForSequenceClassification:

https://huggingface.co/docs/transformers/model_doc/bert#:~:text=sentence%20was%20random-,BertForSequenceClassification,-class%20transformers.BertForSequenceClassification

Transformers version: 4.11.3

I have written the below code:

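import numpy as np
import transformers as tr  # alias assumed from the tr.* calls below

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) pair supplied by the Trainer
    logits, labels = eval_pred
    # Predicted class = index of the largest logit
    predictions = np.argmax(logits, axis=-1)
    acc = np.sum(predictions == labels) / predictions.shape[0]
    return {"accuracy": acc}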

model = tr.BertForSequenceClassification.from_pretrained("/home/pc/minilm_model",num_labels=2)
model.to(device)

print("hello")

training_args = tr.TrainingArguments(
    output_dir='/home/pc/proj/results2',          # output directory
    num_train_epochs=10,              # total number of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=32,   # batch size for evaluation
    learning_rate=2e-5,
    warmup_steps=1000,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs',            # directory for storing logs
    logging_steps=1000,
    evaluation_strategy="epoch",
    save_strategy="no"
)



trainer = tr.Trainer(
    model=model,                         # the instantiated 🤗 Transformers model to be trained
    args=training_args,                  # training arguments, defined above
    train_dataset=train_data,         # training dataset
    eval_dataset=val_data,             # evaluation dataset
    compute_metrics=compute_metrics
)

The folder is empty after I train the model.

Is it okay to pass num_labels=2 for binary classification?

The model's last layer is a simple linear layer that outputs logits. How should I interpret them and get a probability score out of them? Is the logit score directly proportional to the probability?

model = tr.BertForSequenceClassification.from_pretrained("/home/pchhapolika/minilm_model",num_labels=2)

Solution

Is it okay to pass num_labels=2 for binary classification?

Yes.
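
With num_labels=2 the model emits two logits per example, and a softmax over two logits reduces to a sigmoid of their difference. A minimal NumPy sketch of that equivalence; the logit values are made up for illustration and do not come from the model in the question:

```python
import numpy as np

def softmax(logits, axis=-1):
    # Subtracting the row maximum avoids overflow and does not change the result.
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical logits for three examples, shape (batch, 2).
logits = np.array([[1.2, -0.3], [-2.0, 0.5], [0.0, 0.0]])

# Probability of class 1 computed two ways.
p_class1_softmax = softmax(logits)[:, 1]
p_class1_sigmoid = sigmoid(logits[:, 1] - logits[:, 0])

print(p_class1_softmax)
print(p_class1_sigmoid)
```

Both arrays should agree, so only the difference between the two logits matters for the predicted probability.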

The model's last layer is a simple linear layer that outputs logits. How should I interpret them and get a probability score out of them? Is the logit score directly proportional to the probability?

There is a direct relationship between them:

probability = softmax(logits, axis=-1)

or, conversely: logits = log(probability) + const, where const is the same for every class of a given example

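So logits are not directly proportional to probabilities, but the relationship is monotonic: with the other logits held fixed, a larger logit for a class always means a higher probability for that class.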
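
The relationships above can be checked numerically. This is a minimal sketch using NumPy only; `logits_to_probs` is a hypothetical helper name and the logits are made-up values. In practice the array would be the logits produced by the model, for example the first element of the `eval_pred` pair in `compute_metrics`.

```python
import numpy as np

def logits_to_probs(logits):
    # Numerically stable softmax over the last axis (hypothetical helper).
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Made-up logits: the class-0 logit doubles from the first row to the second.
logits = np.array([[2.0, 0.0], [4.0, 0.0]])
probs = logits_to_probs(logits)

# Adding the same constant to every logit of a row leaves the probabilities unchanged.
shifted_probs = logits_to_probs(logits + 10.0)

# logits - log(probs) is one constant per row, i.e. logits = log(probs) + const.
const = logits - np.log(probs)

print(probs)
print(const)
```

Doubling the class-0 logit raises its probability but does not double it, which is the "monotonic, not proportional" behaviour described above.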