Tags: python, python-3.6, spacy, azure-machine-learning-service, oserror

After installing the scrubadub_spacy package, spacy.load("en_core_web_sm") fails with OSError: [E053] Could not read config.cfg


I get the error below when I run the following line of code to load en_core_web_sm on an Azure Machine Learning instance.

I debugged the issue and found that the error only appears after I install scrubadub_spacy, so that installation seems to be the cause.

spacy.load("en_core_web_sm")
OSError                                   Traceback (most recent call last)
<ipython-input-2-c6e652d70518> in <module>
     1 # Load English tokenizer, tagger, parser and NER
----> 2 nlp = spacy.load("en_core_web_sm")

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/__init__.py in load(name, vocab, disable, exclude, config)
    50     """
    51     return util.load_model(
---> 52         name, vocab=vocab, disable=disable, exclude=exclude, config=config
    53     )
    54 

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_model(name, vocab, disable, exclude, config)
   418             return get_lang_class(name.replace("blank:", ""))()
   419         if is_package(name):  # installed as package
--> 420             return load_model_from_package(name, **kwargs)  # type: ignore[arg-type]
   421         if Path(name).exists():  # path to model data directory
   422             return load_model_from_path(Path(name), **kwargs)  # type: ignore[arg-type]

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_model_from_package(name, vocab, disable, exclude, config)
   451     """
   452     cls = importlib.import_module(name)
--> 453     return cls.load(vocab=vocab, disable=disable, exclude=exclude, config=config)  # type: ignore[attr-defined]
   454 
   455 

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/en_core_web_sm/__init__.py in load(**overrides)
    10 
    11 def load(**overrides):
---> 12     return load_model_from_init_py(__file__, **overrides)

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_model_from_init_py(init_file, vocab, disable, exclude, config)
   619         disable=disable,
   620         exclude=exclude,
--> 621         config=config,
   622     )
   623 

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_model_from_path(model_path, meta, vocab, disable, exclude, config)
   485     config_path = model_path / "config.cfg"
   486     overrides = dict_to_dot(config)
--> 487     config = load_config(config_path, overrides=overrides)
   488     nlp = load_model_from_config(config, vocab=vocab, disable=disable, exclude=exclude)
   489     return nlp.from_disk(model_path, exclude=exclude, overrides=overrides)

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/spacy/util.py in load_config(path, overrides, interpolate)
   644     else:
   645         if not config_path or not config_path.exists() or not config_path.is_file():
--> 646             raise IOError(Errors.E053.format(path=config_path, name="config.cfg"))
   647         return config.from_disk(
   648             config_path, overrides=overrides, interpolate=interpolate

OSError: [E053] Could not read config.cfg from /anaconda/envs/azureml_py36/lib/python3.6/site-packages/en_core_web_sm/en_core_web_sm-2.3.1/config.cfg

I installed the packages using the following three commands from spaCy's instructions:

pip install -U pip setuptools wheel
pip install -U spacy
python -m spacy download en_core_web_sm

How should I fix this issue? Thanks in advance.


Solution

  • Taking the path from your error message:

    en_core_web_sm-2.3.1/config.cfg
    

    The installed model was built for spaCy v2.3, but spaCy is looking for a config.cfg, which only exists in models for spaCy v3. It looks like spaCy was upgraded to v3 without you realizing it, probably when scrubadub_spacy was installed.

    There are two ways to fix this. One is to reinstall the model with python -m spacy download en_core_web_sm, which fetches a model version that matches your current spaCy version. If you are just starting a project, that is probably the best option. Based on the release date of scrubadub, it seems to be intended for use with spaCy v3.

    The other is to downgrade spaCy so that it matches the v2.3 model. Note that v2 and v3 are quite different, so if you have an existing project built on spaCy v2, downgrading may be the better choice.
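
The mismatch described above can be checked by comparing version numbers. The following is only a minimal sketch: it assumes the usual convention that the first two components of a model version mirror the spaCy version it was built for. The helper name model_matches_spacy is hypothetical, and the spaCy versions passed in are stand-ins, not values taken from the question.

```python
import re


def model_matches_spacy(model_dir_name, spacy_version):
    """Return True if the model's major.minor version equals spaCy's.

    Hypothetical helper for illustration; not part of spaCy.
    """
    # Model data directories are named "<package>-<a>.<b>.<c>",
    # e.g. "en_core_web_sm-2.3.1" in the error message above.
    match = re.search(r"-(\d+)\.(\d+)\.\d+", model_dir_name)
    if match is None:
        raise ValueError("no version found in %r" % model_dir_name)
    model_major_minor = (int(match.group(1)), int(match.group(2)))
    # Keep only the major and minor components of the spaCy version.
    spacy_major_minor = tuple(int(part) for part in spacy_version.split(".")[:2])
    return model_major_minor == spacy_major_minor


# "3.2.0" and "2.3.5" are assumed example versions; in a real session,
# pass spacy.__version__ instead.
print(model_matches_spacy("en_core_web_sm-2.3.1", "3.2.0"))  # False
print(model_matches_spacy("en_core_web_sm-2.3.1", "2.3.5"))  # True
```

spaCy also ships a command for this purpose: python -m spacy validate lists the installed models and reports whether each one is compatible with the installed spaCy version.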