I want to evaluate my trained spaCy model with the built-in Scorer using this code:
def evaluate(ner_model, examples):
    scorer = Scorer()
    for input_, annot in examples:
        text = nlp.make_doc(input_)
        gold = Example.from_dict(text, annot)
        pred_value = ner_model(input_)
        scorer.score(gold)
    return scorer.scores
examples = [('Brief discussion about instument replcement and Product ...confirmation', {'entities': [(48, 55, 'PRODUCT')]}), ('Met with special chem lead. Did not yet move assays from immulite to produc. Follow up with PhD tomorrow.', {'entities': [(57, 68, 'PRODUCT'), (45, 51, 'DATE'), (97, 105, 'DATE')]}), ('Discuss new products for ...', {'entities': [(36, 51, 'PRODUCT')]})]
ner_model = spacy.load(r'D:\temp\model') # for spaCy's pretrained use 'en_core_web_sm'
results = evaluate(ner_model, examples)
When I run the function I'm receiving the following error message:
TypeError: [E978] The Tokenizer.score method takes a list of Example objects, but got: <class 'spacy.training.example.Example'>
I already tried feeding in the annotations like {"entities": annot} and some other variations of it. I searched Google, but every article seems to be about spaCy 2.x.
What am I doing wrong? How can I calculate recall, accuracy, and the F1 score with spaCy's Scorer?
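The Scorer.score method is still supported in spaCy 3.0 (https://spacy.io/api/scorer), and I finally got it working with the following code. As the error message says, score() needs a list of Example objects rather than a single Example, and each Example's predicted doc has to come from the model: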
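import spacy
from spacy.scorer import Scorer
from spacy.training import Example

nlp = spacy.load(path_to_model)
examples = []
scorer = Scorer()
for text, annotations in TEST_REVISION_DATA:
    doc = nlp.make_doc(text)
    example = Example.from_dict(doc, annotations)
    # Replace the blank predicted doc with the model's actual output
    example.predicted = nlp(str(example.predicted))
    examples.append(example)
# score() expects a list of Example objects and returns a dict of metrics
scores = scorer.score(examples)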
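For intuition about the entity numbers that scorer.score(examples) returns (precision, recall and F-score under the keys ents_p, ents_r and ents_f, rather than an accuracy), here is a minimal plain-Python sketch. It is not spaCy's implementation; it assumes an entity only counts as correct when start, end and label all match exactly, and the helper name prf and the spans are made up for illustration.

```python
def prf(gold_spans, pred_spans):
    """Precision, recall and F1 over exact (start, end, label) matches."""
    gold, pred = set(gold_spans), set(pred_spans)
    tp = len(gold & pred)  # predicted spans that match a gold span exactly
    fp = len(pred - gold)  # predicted spans with no gold counterpart
    fn = len(gold - pred)  # gold spans the model missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"p": precision, "r": recall, "f": f1}


# Hypothetical spans for a single text: one prediction is right, one is wrong,
# and one gold entity is missed.
gold = [(0, 5, "PRODUCT"), (10, 18, "DATE")]
pred = [(0, 5, "PRODUCT"), (20, 25, "DATE")]
result = prf(gold, pred)
print(result)
```

With one true positive, one false positive and one false negative, all three values come out at 0.5.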
I didn't find the command-line tool easy to apply (I struggled with loading the test data), and for my needs the code version is much more convenient. This way I can easily convert the results into visuals.
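As a rough sketch of that last step, the dict returned by the scorer can be flattened into one row per label, which is easy to hand to a plotting library or write to CSV. The scores dict below is a placeholder with made-up numbers that only mimics the shape of the entity keys (ents_p, ents_r, ents_f, ents_per_type); scores_to_rows is a hypothetical helper, not part of spaCy.

```python
import csv
import io

# Placeholder values shaped like the entity part of Scorer.score() output.
# These are NOT real results.
scores = {
    "ents_p": 0.5,
    "ents_r": 0.5,
    "ents_f": 0.5,
    "ents_per_type": {
        "PRODUCT": {"p": 1.0, "r": 0.5, "f": 0.6667},
        "DATE": {"p": 0.0, "r": 0.0, "f": 0.0},
    },
}


def scores_to_rows(scores):
    """Return one row for the overall scores plus one row per entity label."""
    rows = [{
        "label": "ALL",
        "p": scores["ents_p"],
        "r": scores["ents_r"],
        "f": scores["ents_f"],
    }]
    # Fall back to an empty dict if the per-type scores are missing or None
    per_type = scores.get("ents_per_type") or {}
    for label, prf in sorted(per_type.items()):
        rows.append({"label": label, **prf})
    return rows


rows = scores_to_rows(scores)

# Write the rows as CSV into an in-memory buffer (swap in a real file as needed)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["label", "p", "r", "f"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

The same rows list can be passed to something like pandas or matplotlib if those are available; CSV is used here only to keep the sketch dependency-free.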