
How to create custom NER components in spaCy v3


I'm trying to add an EntityRuler, but I keep getting this error: [E002] Can't find factory for 'ruler' for language French (fr). I don't know how to create a custom component in v3; I have only found examples for the older version, and the documentation confused me.

pattern = [{"label": "ORG", "pattern": "Neoledge"}]
ruler.add_patterns(pattern)
nlp.add_pipe('ruler')
Edit:

@Language.component('rulerORG') 
def rulerORG(doc):     
    ORG = ["...",]     
    ruler= EntityRuler(nlp, overwrite_ents=True)     
    for O in ORG:        
        ruler.add_patterns([{"label": "ORG", "pattern": O}])     
        return doc    

nlp.add_pipe('rulerORG')

Solution

  • I guess you're trying to create an EntityRuler? If so, you should write your code like this:

    import spacy
    nlp = spacy.blank("en")
    pattern = [{"label": "ORG", "pattern": "Neoledge"}]
    ruler = nlp.add_pipe('entity_ruler', config={"overwrite_ents":True})
    ruler.add_patterns(pattern)
    

    The EntityRuler and NER pipelines are different - NER is statistical, the EntityRuler is rule-based.
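
    To illustrate the rule-based point, here is a minimal sketch that assumes a blank French pipeline (matching the language in the error message) with no trained NER model at all. The example sentence is made up for illustration; the ruler still finds the entity because it only matches patterns.

```python
import spacy

# A blank French pipeline: tokenizer only, no statistical components.
nlp = spacy.blank("fr")

# "entity_ruler" is the built-in factory name; add_pipe returns the component.
ruler = nlp.add_pipe("entity_ruler", config={"overwrite_ents": True})
ruler.add_patterns([{"label": "ORG", "pattern": "Neoledge"}])

# Illustrative sentence, not from the question.
doc = nlp("Je travaille chez Neoledge depuis deux ans.")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)  # [('Neoledge', 'ORG')]
```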

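    The way components are added to the pipeline changed between v2 and v3, and it looks like your code mixes the two. In v3, nlp.add_pipe takes the string name of a registered factory and returns the component, so you add patterns to the object it returns. The built-in factory is named 'entity_ruler', not 'ruler', which is why you get E002.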

    You can see an example of the approach I outlined here in this part of the docs.
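
    If you really do want your own component, as in the edit to the question, the v3 pattern is to register a function with @Language.component and then add it by its string name. The following is only a sketch under assumptions: the component name org_marker and the ORG_NAMES set are hypothetical, and it tags single tokens only. For plain pattern matching the built-in entity_ruler is usually simpler.

```python
import spacy
from spacy.language import Language
from spacy.tokens import Span

ORG_NAMES = {"Neoledge"}  # hypothetical list of organisation names


@Language.component("org_marker")  # hypothetical component name
def org_marker(doc):
    # Build a one-token ORG span for every token whose text is in ORG_NAMES.
    spans = [Span(doc, tok.i, tok.i + 1, label="ORG")
             for tok in doc if tok.text in ORG_NAMES]
    # Set those spans as entities; leave all other tokens' annotation as it is.
    doc.set_ents(spans, default="unmodified")
    return doc


nlp = spacy.blank("fr")
nlp.add_pipe("org_marker")  # added by its registered string name

doc = nlp("Je travaille chez Neoledge.")  # illustrative sentence
found = [(ent.text, ent.label_) for ent in doc.ents]
print(found)
```

    Note that the component function receives and returns the doc; the mistake in the edited question is that the EntityRuler is created inside the function but never applied to the doc.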