I am currently working on an NER project, and I would like to improve its performance by trying the new spaCy model en_trf_bertbaseuncased_lg. However, training fails with this error:
KeyError: "[E001] No component 'trf_tok2vec' found in pipeline. Available names: ['ner']"
Does spaCy currently not support NER for this language model? Thanks! This is my training code:
# get names of other pipes to disable them during training
other_pipes = [pipe for pipe in nlp.pipe_names if pipe != 'ner']
with nlp.disable_pipes(*other_pipes):  # only train NER
    for itn in tqdm(range(n_iter)):
        random.shuffle(train_data_list)
        losses = {}
        # batch up the examples using spaCy's minibatch
        batches = minibatch(train_data_list, size=compounding(8., 64., 1.001))
        for batch in batches:
            texts, annotations = zip(*batch)
            nlp.update(texts, annotations, sgd=optimizer, drop=0.35,
                       losses=losses)
        tqdm.write('Iter: ' + str(itn + 1) + ' Losses: ' + str(losses['ner']))
        if itn == 30 or itn == 40:
            output_dir = Path(output_dir)
            if not output_dir.exists():
                output_dir.mkdir()
            nlp.to_disk(Path(output_dir))
The error is raised by this call:
nlp.update(texts, annotations, sgd=optimizer, drop=0.35,
           losses=losses)
According to the spaCy documentation for this model, it doesn't support named-entity recognition yet. It only ships with these pipeline components:
sentencizer
trf_wordpiecer
trf_tok2vec
You can list the pipeline components available in a given model like so:
>>> import spacy
>>> nlp = spacy.load("en_trf_bertbaseuncased_lg")
>>> nlp.pipe_names
['sentencizer', 'trf_wordpiecer', 'trf_tok2vec']
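Building on that check, here is a minimal sketch of a guard you could run before starting a training loop, so that a missing component is reported up front. The helper name `missing_components` is hypothetical (it is not part of spaCy), and the list of names is hard-coded from the output above so the snippet runs without the model installed; with a loaded pipeline you would pass `nlp.pipe_names` instead.

```python
def missing_components(pipe_names, required):
    """Return the names in `required` that are absent from `pipe_names`.

    Hypothetical helper for illustration; not a spaCy function.
    """
    available = set(pipe_names)
    return [name for name in required if name not in available]


# Component names reported for en_trf_bertbaseuncased_lg above.
# With a loaded model, use `nlp.pipe_names` here instead.
pipe_names = ["sentencizer", "trf_wordpiecer", "trf_tok2vec"]

missing = missing_components(pipe_names, ["ner"])
if missing:
    print("Model lacks components:", ", ".join(missing))
```

If the list comes back non-empty, the model does not ship with the component you want to train, which is the situation described above for 'ner'.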