Tags: python, python-3.x, nlp, spacy

Re-enabling the spaCy parser component gives an error


I am trying to speed up my application by removing spaCy components when they are not needed and re-enabling them later. I have come up with this code:

import spacy
nlp = spacy.load("en_core_web_lg", disable=('ner', 'textcat'))
nlp.pipe_names

It gives the following output:

['tagger', 'parser']

I have to perform a task; the code snippet is below:

text = """Extracts the selected  layers in the specified area of interest.... """

doc = nlp(text)

def get_pos(remove_parser=True):
    if remove_parser: 
        nlp.remove_pipe("parser")

    for kw in keywords:
        doc = nlp(kw[0])
        tag_list = [(token.text, token.tag_) for token in doc]

    if remove_parser:
        nlp.add_pipe(nlp.create_pipe('parser'))

    return tag_list

result = get_pos(remove_parser=True)
nlp.pipe_names

So I call the get_pos function with remove_parser=True. It removes the parser component and runs nlp(kw[0]) for every item in the keywords list. After the loop ends, I add the parser component back, which can be verified from the output of nlp.pipe_names. I get the output below:

['tagger', 'parser']   

But if I then call nlp("Hello World") after the get_pos call, it gives this error:

ValueError                                Traceback (most recent call last)
<ipython-input-29-320b76b1fe36> in <module>
----> 1 nlp("Hello World")

~\.conda\envs\keyword-extraction\lib\site-packages\spacy\language.py in __call__(self, text, disable, component_cfg)
    433             if not hasattr(proc, "__call__"):
    434                 raise ValueError(Errors.E003.format(component=type(proc), name=name))
--> 435             doc = proc(doc, **component_cfg.get(name, {}))
    436             if doc is None:
    437                 raise ValueError(Errors.E005.format(name=name))

nn_parser.pyx in spacy.syntax.nn_parser.Parser.__call__()

nn_parser.pyx in spacy.syntax.nn_parser.Parser.predict()

nn_parser.pyx in spacy.syntax.nn_parser.Parser.require_model()

ValueError: [E109] Model for component 'parser' not initialized. Did you forget to load a model, or forget to call begin_training()?

Solution

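  • You are adding a blank, untrained parser back to the pipeline rather than the one that was loaded with the model: nlp.create_pipe('parser') builds a new component with no trained weights, which is why E109 is raised. Instead, try disable_pipes(), which makes it easier to save the component and add it back later: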

    disabled = nlp.disable_pipes(["parser"])
    # do stuff
    disabled.restore()
    

    See: https://spacy.io/api/language#disable_pipes
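
Applied to the get_pos function from the question, the disable_pipes() approach might look like the sketch below. It makes a few assumptions that are not in the original post: nlp and keywords are passed in as arguments instead of being read from globals, keywords is taken to be a list of tuples whose first element is the text, and the tags are accumulated across all keywords rather than keeping only the last keyword's tags. The try/finally is there so the parser is restored even if tagging raises.

```python
def get_pos(nlp, keywords, remove_parser=True):
    """Collect (text, tag) pairs for each keyword, optionally without the parser.

    Assumptions (not from the original post): `nlp` is a loaded spaCy v2
    pipeline, and `keywords` is a list of tuples whose first item is the text.
    """
    # disable_pipes() holds on to the trained component, so restore() puts
    # back the same parser instead of a freshly created, untrained one.
    disabled = nlp.disable_pipes(["parser"]) if remove_parser else None
    try:
        tag_list = []
        for kw in keywords:
            doc = nlp(kw[0])
            # Assumption: accumulate tags for every keyword, not just the last.
            tag_list.extend((token.text, token.tag_) for token in doc)
    finally:
        # Restore the parser even if tagging raised an exception.
        if disabled is not None:
            disabled.restore()
    return tag_list
```

With the pipeline loaded as in the question, result = get_pos(nlp, keywords) followed by nlp("Hello World") should then run without E109, since the restored parser is the trained one. On spaCy v3, disable_pipes() is kept only for backwards compatibility and nlp.select_pipes(disable=["parser"]) is the equivalent call.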