
Can't evaluate custom NER in spaCy 3.0 using the CLI


I'm trying to load a custom pre-trained model with a custom pipeline component from disk as a pipeline in spaCy 3.0.

The factory code looks like this:

@CustomEng.factory("ner-crf")
def create_my_component(nlp, name):
    crf_extractor = CRFExtractor().from_disk("path-to-model")
    return CRFEntityExtractor(nlp, crf_extractor=crf_extractor)

Then I added 'ner-crf' to my Language class like this:

    nlp = spacy.blank('custom-eng')
    nlp.add_pipe('ner-crf')
    nlp.to_disk('../model')

One thing that may be relevant: when I save the nlp object with to_disk, the saved output contains nothing for ner-crf (the pipeline component I just added).

Then I run this CLI command to evaluate the NER pipeline:

python -m spacy evaluate ../model/ ../corpus/dev.spacy --output ../model/metrics.json --gpu-id 0 --code ../../../spacy_utils/custom-eng/__init__.py

But I get this error:

Traceback (most recent call last):
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/spacy/__main__.py", line 4, in <module>
    setup_cli()
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/spacy/cli/_util.py", line 69, in setup_cli
    command(prog_name=COMMAND)
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/typer/main.py", line 497, in wrapper
    return callback(**use_params)  # type: ignore
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/spacy/cli/evaluate.py", line 42, in evaluate_cli
    evaluate(
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/spacy/cli/evaluate.py", line 75, in evaluate
    nlp = util.load_model(model)
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/spacy/util.py", line 326, in load_model
    return load_model_from_path(Path(name), **kwargs)
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/spacy/util.py", line 392, in load_model_from_path
    return nlp.from_disk(model_path, exclude=exclude)
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/spacy/language.py", line 1883, in from_disk
    util.from_disk(path, deserializers, exclude)
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/spacy/util.py", line 1176, in from_disk
    reader(path / key)
  File "/home/marzi/anaconda3/envs/spacy-tutorial/lib/python3.8/site-packages/spacy/language.py", line 1877, in <lambda>
    deserializers[name] = lambda p, proc=proc: proc.from_disk(
TypeError: from_disk() got an unexpected keyword argument 'exclude'

The custom NER classes I used belong to the spacy-crfsuite library, which works fine in spaCy 2. There is no sample code for spaCy 3 yet, so I'm trying to make it work in spaCy 3.0 myself.


Solution

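  • From spaCy v3.0 onwards, pipeline components are expected to accept an exclude keyword argument in their from_disk and to_disk methods. The traceback shows spaCy calling proc.from_disk(..., exclude=...), which the custom component's from_disk does not accept. Add the exclude keyword to the method signature, give it a default value, and simply ignore it in the function body; this should resolve the error.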

    For completeness, here's the migration guide for the transition from v2 to v3, which may include some additional interesting pointers for you: https://spacy.io/usage/v3#migrating
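
    As a rough illustration of the fix described above, here is a minimal sketch of a component whose from_disk and to_disk accept exclude and ignore it. The class name ExcludeAwareComponent, the labels attribute, and the labels.json file are all made up for this example; they are not part of spaCy or spacy-crfsuite. The calls at the bottom only imitate how spaCy passes the keyword, and the ["vocab"] value is an assumption about what spaCy sends.

```python
import json
import tempfile
from pathlib import Path


class ExcludeAwareComponent:
    """Hypothetical stand-in for a custom component (not the spacy-crfsuite class)."""

    def __init__(self, labels=None):
        # Illustrative state only; a real CRF component would hold a trained model.
        self.labels = list(labels or [])

    def __call__(self, doc):
        # A real component would annotate the doc here.
        return doc

    def to_disk(self, path, *, exclude=tuple()):
        # `exclude` is accepted so the call does not fail; its value is ignored.
        path = Path(path)
        path.mkdir(parents=True, exist_ok=True)
        (path / "labels.json").write_text(json.dumps(self.labels), encoding="utf8")

    def from_disk(self, path, *, exclude=tuple()):
        # Same idea: accept `exclude`, give it a default, do not use it.
        path = Path(path)
        self.labels = json.loads((path / "labels.json").read_text(encoding="utf8"))
        return self


# Imitate a save/load round trip that passes the exclude keyword.
with tempfile.TemporaryDirectory() as tmp:
    component_dir = Path(tmp) / "ner-crf"
    ExcludeAwareComponent(labels=["PERSON", "ORG"]).to_disk(component_dir, exclude=["vocab"])
    loaded = ExcludeAwareComponent().from_disk(component_dir, exclude=["vocab"])
```

    If you would rather not edit the library's class directly, the same two method signatures could be added in a small subclass that forwards to the original methods without the exclude argument.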