I'm trying to add a new named entity label to spaCy, but I couldn't find good examples of Example objects for NER training, and I'm getting a ValueError. Here is my code:
import random
import spacy
from spacy.util import minibatch, compounding
from pathlib import Path
from spacy.training import Example

nlp=spacy.load('en_core_web_lg')
ner=nlp.get_pipe("ner")
TRAIN_DATA=[('ABC is a worldwide organization',{'entities':[0,2,'CRORG']}),
            ('we stand with ABC',{'entities':[24,26,'CRORG']}),
            ('we supports ABC',{'entities':[15,17,'CRORG']})]
ner.add_label('CRORG')
# Disable pipeline components that don't need to change
pipe_exceptions = ["ner"]
unaffected_pipes = [pipe for pipe in nlp.pipe_names if pipe not in pipe_exceptions]
with nlp.disable_pipes(*unaffected_pipes):
    for iteration in range(30):
        random.shuffle(TRAIN_DATA)
        for raw_text,entity_offsets in TRAIN_DATA:
            doc=nlp.make_doc(raw_text)
            nlp.update([Example.from_dict(doc,entity_offsets)])
The 'entities' value in TRAIN_DATA is supposed to be a list of tuples, one (start, end, label) tuple per entity. It has to be 2D, not just 1D.
So instead of:
TRAIN_DATA=[('ABC is a worldwide organization',{'entities':[0,2,'CRORG']}),
('we stand with ABC',{'entities':[24,26,'CRORG']}),
('we supports ABC',{'entities':[15,17,'CRORG']})]
Use:
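TRAIN_DATA=[('ABC is a worldwide organization',{'entities':[(0,3,'CRORG')]}),
            ('we stand with ABC',{'entities':[(14,17,'CRORG')]}),
            ('we supports ABC',{'entities':[(12,15,'CRORG')]})]
Note that the offsets are corrected here as well. They are character positions with an exclusive end, so 'ABC' at the start of a string is (0,3), and values such as 24,26 fall outside a 17-character string.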
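Counting character offsets by hand is error-prone, so one option is to compute them from the text. Below is a minimal sketch, assuming each entity appears as an exact substring of the text. The helper name find_entities is hypothetical and not part of spaCy; it only builds the (start, end, label) tuples in the list-of-tuples shape described above.

```python
def find_entities(text, phrase, label):
    """Build spaCy-style annotations for every occurrence of phrase in text.

    Hypothetical helper for illustration; it is not a spaCy function.
    """
    entities = []
    start = text.find(phrase)
    while start != -1:
        end = start + len(phrase)  # end offset is exclusive
        entities.append((start, end, label))
        start = text.find(phrase, end)  # look for further occurrences
    return {"entities": entities}


texts = [
    "ABC is a worldwide organization",
    "we stand with ABC",
    "we supports ABC",
]

# Each item is (text, {"entities": [(start, end, label), ...]})
TRAIN_DATA = [(text, find_entities(text, "ABC", "CRORG")) for text in texts]

for text, annotations in TRAIN_DATA:
    print(text, annotations)
```

This is plain substring matching, so it would also match 'ABC' inside a longer word such as 'ABCD'. If that matters for your data, check that each span lines up with token boundaries before training.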