Tags: python, python-3.x, nlp, spacy, named-entity-recognition

ValueError: [E024] Could not find an optimal move to supervise the parser


I am getting the following error while training a spaCy NER model on my custom training data.

ValueError: [E024] Could not find an optimal move to supervise the parser. Usually, this means the GoldParse was not correct. For example, are all labels added to the model?

Can anyone help me with this?


Solution

  • Trimming leading and trailing whitespace from the entity spans fixed this for me. Passing the training data through the function below before training works without any error.

    import re
    
    
    def trim_entity_spans(data: list) -> list:
        """Removes leading and trailing white spaces from entity spans.
    
        Args:
            data (list): The data to be cleaned, as a list of
                (text, {'entities': [(start, end, label), ...]}) pairs.
    
        Returns:
            list: The cleaned data.
        """
        invalid_span_tokens = re.compile(r'\s')
    
        cleaned_data = []
        for text, annotations in data:
            entities = annotations['entities']
            valid_entities = []
            for start, end, label in entities:
                valid_start = start
                valid_end = end
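                # Move the start right past any leading whitespace.
                while valid_start < len(text) and invalid_span_tokens.match(
                        text[valid_start]):
                    valid_start += 1
                # Move the end left past any trailing whitespace, but never
                # past the (already trimmed) start of the span.
                while valid_end > valid_start and invalid_span_tokens.match(
                        text[valid_end - 1]):
                    valid_end -= 1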
                valid_entities.append([valid_start, valid_end, label])
            cleaned_data.append([text, {'entities': valid_entities}])
    
        return cleaned_data
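
If you want to see which annotations are affected before cleaning them, a small check along these lines may help. This is only a sketch: the name `find_whitespace_spans` and the sample data are made up for illustration, and it assumes the same `(text, {'entities': [...]})` shape as the function above.

```python
def find_whitespace_spans(data):
    """Return (text_index, start, end, label, span_text) for every entity
    span that is empty or that starts or ends with whitespace."""
    problems = []
    for i, (text, annotations) in enumerate(data):
        for start, end, label in annotations['entities']:
            span = text[start:end]
            # A span is suspicious if stripping whitespace changes it,
            # or if it is empty to begin with.
            if not span or span != span.strip():
                problems.append((i, start, end, label, span))
    return problems


# Hypothetical sample: the first span (0, 6) wrongly includes the space
# after "Apple"; the second span (18, 27) is exactly "Cupertino".
sample = [
    ("Apple is based in Cupertino",
     {'entities': [(0, 6, 'ORG'), (18, 27, 'GPE')]}),
]

for problem in find_whitespace_spans(sample):
    print(problem)
```

Any span reported here is one whose offsets include surrounding whitespace, which is the case the trimming function is meant to repair.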