I get the error below when I run the following line of code to load en_core_web_sm in an Azure Machine Learning instance.
I debugged the issue and found that installing scrubadub_spacy seems to be what triggers the error.
spacy.load("en_core_web_sm")
OSError Traceback (most recent call last)
<ipython-input-2-c6e652d70518> in <module>
1 # Load English tokenizer, tagger, parser and NER
----> 2 nlp = spacy.load("en_core_web_sm")
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/__init__.py in load(name, vocab, disable, exclude, config)
50 """
51 return util.load_model(
---> 52 name, vocab=vocab, disable=disable, exclude=exclude, config=config
53 )
54
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_model(name, vocab, disable, exclude, config)
418 return get_lang_class(name.replace("blank:", ""))()
419 if is_package(name): # installed as package
--> 420 return load_model_from_package(name, **kwargs) # type: ignore[arg-type]
421 if Path(name).exists(): # path to model data directory
422 return load_model_from_path(Path(name), **kwargs) # type: ignore[arg-type]
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_model_from_package(name, vocab, disable, exclude, config)
451 """
452 cls = importlib.import_module(name)
--> 453 return cls.load(vocab=vocab, disable=disable, exclude=exclude, config=config) # type: ignore[attr-defined]
454
455
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/en_core_web_sm/__init__.py in load(**overrides)
10
11 def load(**overrides):
---> 12 return load_model_from_init_py(__file__, **overrides)
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_model_from_init_py(init_file, vocab, disable, exclude, config)
619 disable=disable,
620 exclude=exclude,
--> 621 config=config,
622 )
623
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_model_from_path(model_path, meta, vocab, disable, exclude, config)
485 config_path = model_path / "config.cfg"
486 overrides = dict_to_dot(config)
--> 487 config = load_config(config_path, overrides=overrides)
488 nlp = load_model_from_config(config, vocab=vocab, disable=disable, exclude=exclude)
489 return nlp.from_disk(model_path, exclude=exclude, overrides=overrides)
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_config(path, overrides, interpolate)
644 else:
645 if not config_path or not config_path.exists() or not config_path.is_file():
--> 646 raise IOError(Errors.E053.format(path=config_path, name="config.cfg"))
647 return config.from_disk(
648 config_path, overrides=overrides, interpolate=interpolate
OSError: [E053] Could not read config.cfg from /anaconda/envs/azureml_py36/lib/python3.6/site-packages/en_core_web_sm/en_core_web_sm-2.3.1/config.cfg
I installed the packages using the three commands below, taken from spaCy:
pip install -U pip setuptools wheel
pip install -U spacy
python -m spacy download en_core_web_sm
How should I fix this issue? Thanks in advance.
Taking the path from your error message:
en_core_web_sm-2.3.1/config.cfg
You have a model for v2.3, but spaCy is looking for a config.cfg, which only exists in spaCy v3. It looks like you upgraded spaCy without realizing it.
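To make the mismatch concrete, here is a minimal sketch of the comparison. It assumes the usual spaCy convention that a model's major.minor version matches the spaCy release it was built for. The helper names major_minor and model_matches_spacy are made up for illustration, and "3.0.0" is a hypothetical stand-in for whatever spacy.__version__ reports on your machine.

```python
import re


def major_minor(version):
    """Return (major, minor) parsed from a version string such as "2.3.1"."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def model_matches_spacy(spacy_version, model_version):
    """True when the model targets the same spaCy major.minor release."""
    return major_minor(spacy_version) == major_minor(model_version)


# "2.3.1" comes from the path in the error message; "3.0.0" is a
# hypothetical stand-in for whatever spacy.__version__ reports.
print(model_matches_spacy("3.0.0", "2.3.1"))  # False
print(model_matches_spacy("2.3.5", "2.3.1"))  # True
```

In practice you can get the same answer without writing code by running python -m spacy validate, which lists installed models and flags the ones that are incompatible with the installed spaCy.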
There are two ways to fix this. One is to reinstall the model with spacy download, which will get a version that matches your current spaCy version. If you are just starting a project, that is probably the best option. Based on the release date of scrubadub, it seems to be intended for use with spaCy v3.
However, note that v2 and v3 are quite different. If you have an existing project built on spaCy v2, you might want to downgrade spaCy instead.
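The two options can be sketched as follows. This is only an illustration: fix_commands is a hypothetical helper, and the "spacy<3" pin is my assumption for what "downgrade" means here. It builds the commands rather than running them, and it uses sys.executable so that pip and spaCy run in the same environment as the notebook kernel (the azureml_py36 environment in the traceback).

```python
import shlex
import sys


def fix_commands(keep_spacy_v3=True, model="en_core_web_sm"):
    """Return the commands for one of the two fixes, without running them."""
    python = sys.executable  # interpreter of the current environment
    commands = []
    if not keep_spacy_v3:
        # Option 2: pin spaCy back to the v2 series first (assumed pin).
        commands.append([python, "-m", "pip", "install", "spacy<3"])
    # Both options end by downloading the model for the installed spaCy.
    commands.append([python, "-m", "spacy", "download", model])
    return commands


# Print the commands for option 1 (stay on spaCy v3) for inspection.
for command in fix_commands(keep_spacy_v3=True):
    print(" ".join(shlex.quote(part) for part in command))
```

To actually apply a fix, pass each command to subprocess.run(command, check=True) and then restart the kernel so the new packages are picked up. Keep in mind that the downgrade route may conflict with scrubadub_spacy if that package requires spaCy v3.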