I have downloaded the HuggingFace BERT model from the transformers repository found here and would like to train the model on custom NER labels using the run_ner.py script, as referenced here in the section "Named Entity Recognition".
I define model ("bert-base-german-cased"), data_dir ("Data/sentence_data.txt") and labels ("Data/labels.txt") as defaults in the code.
Now I'm running this from the command line:
python run_ner.py --output_dir="Models" --num_train_epochs=3 --logging_steps=100 --do_train --do_eval --do_predict
But all it does is tell me:
Some weights of the model checkpoint at bert-base-german-cased were not used when initializing BertForTokenClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForTokenClassification were not initialized from the model checkpoint at bert-base-german-cased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
After that it just stops: the script doesn't end, it simply sits there waiting.
Does anyone know what could be the problem here? Am I missing a parameter?
My sentence_data.txt in CoNLL format looks like this (small snippet):
Strafverfahren O
gegen O
; O
wegen O
Diebstahls O
hat O
das O
Amtsgericht Ort
Leipzig Ort
- O
Strafrichter O
And that's how I defined my labels in labels.txt:
"Date", "Delikt", "Strafe_Tatbestand", "Schadensbetrag", "Geständnis_ja", "Vorstrafe_ja", "Vorstrafe_nein", "Ort",
"Strafe_Gesamtfreiheitsstrafe_Dauer", "Strafe_Gesamtsatz_Dauer", "Strafe_Gesamtsatz_Betrag"
I found out the problem. It had to do with the CUDA driver not being compatible with the installed version of PyTorch.
For anyone with an Nvidia GPU encountering the same problem: in the Nvidia Control Panel, go to Help -> System Information -> Components; there is an entry called "NVCUDA.DLL" with a driver number in the name column. Choosing the corresponding CUDA version in the installation selector on pytorch.org should do the trick.
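To verify the fix quickly (just a small check of my own, nothing run_ner.py does for you), you can ask PyTorch which CUDA version it was built against and whether it can actually see the GPU:

import torch

# CUDA version the installed PyTorch build was compiled with (None for CPU-only builds).
print("PyTorch:", torch.__version__)
print("Built for CUDA:", torch.version.cuda)

# If the driver and the PyTorch build don't match, this is typically False
# and GPU code either falls back to CPU, fails, or appears to hang.
print("CUDA available:", torch.cuda.is_available())

if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))

After installing the matching PyTorch build, this reported the GPU correctly and training started as expected.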
Also, there is a good Readme in the transformers repository explaining all steps to train the BERT model with CLI commands here.