The pattern works with the Matcher, but not as an entity with the EntityRuler. Here is my code:
import spacy
from spacy.pipeline import EntityRuler
nlp = spacy.load("en_core_web_sm")
patterns = [
    {
        "label": "PHONE_NUMBER",
        "pattern": [
            {"ORTH": "("},
            {"SHAPE": "ddd"},
            {"ORTH": ")"},
            {"IS_SPACE": True, "OP": "?"},
            {"SHAPE": "ddd"},
            {"ORTH": "-"},
            {"SHAPE": "dddd"},
        ],
    }
]
entity_ruler = EntityRuler(nlp, patterns=patterns, overwrite_ents=True)
nlp.add_pipe("entity_ruler", before="ner")
doc = nlp("You can reach me at (111) 111-1111.")
for ent in doc.ents:
    print(ent.text, ent.label_)
This returns:
111 CARDINAL
111 CARDINAL
Advice/help needed and appreciated. Thank you.
The problem was that you had two separate components: one constructed with the EntityRuler class and one created by nlp.add_pipe. Only the component created by add_pipe is actually in the pipeline, and it wasn't aware of your patterns. Using just one method and then adding the patterns to that component did the trick.
import spacy
nlp = spacy.load("en_core_web_sm")
patterns = [
    {
        "label": "PHONE_NUMBER",
        "pattern": [
            {"ORTH": "("},
            {"SHAPE": "ddd"},
            {"ORTH": ")"},
            {"IS_SPACE": True, "OP": "?"},
            {"SHAPE": "ddd"},
            {"ORTH": "-"},
            {"SHAPE": "dddd"},
        ],
    }
]
ruler = nlp.add_pipe("entity_ruler", before="ner")
ruler.add_patterns(patterns)
doc = nlp("You can reach me at (111) 111-1111.")
for ent in doc.ents:
    print(ent.text, ent.label_)
This returns:
(111) 111-1111 PHONE_NUMBER
I read about the different ways to initialize the component here: https://spacy.io/api/entityruler
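If you want to see the mismatch for yourself, here is a rough sketch that compares the two objects. It assumes spaCy v3 (where EntityRuler can still be constructed directly) and uses spacy.blank("en") instead of en_core_web_sm so no model download is needed. That swap is my assumption, not part of the original code, and it means there is no "ner" component to place the ruler before. The variable names standalone_ruler and pipeline_ruler are just illustrative.

```python
import spacy
from spacy.pipeline import EntityRuler

# Assumption: a blank English pipeline stands in for en_core_web_sm here.
nlp = spacy.blank("en")

patterns = [
    {
        "label": "PHONE_NUMBER",
        "pattern": [
            {"ORTH": "("},
            {"SHAPE": "ddd"},
            {"ORTH": ")"},
            {"IS_SPACE": True, "OP": "?"},
            {"SHAPE": "ddd"},
            {"ORTH": "-"},
            {"SHAPE": "dddd"},
        ],
    }
]

# Standalone object: it holds the patterns but is never added to the pipeline.
standalone_ruler = EntityRuler(nlp, patterns=patterns, overwrite_ents=True)

# Pipeline component: a second, separate ruler built by the "entity_ruler" factory.
pipeline_ruler = nlp.add_pipe("entity_ruler")

# Count the patterns each object knows about before the fix.
standalone_count = len(standalone_ruler.patterns)
pipeline_count_before = len(pipeline_ruler.patterns)
same_object = standalone_ruler is pipeline_ruler
print(standalone_count, pipeline_count_before, same_object)

# The fix: give the patterns to the component that is actually in the pipeline.
pipeline_ruler.add_patterns(patterns)
pipeline_count_after = len(pipeline_ruler.patterns)

doc = nlp("You can reach me at (111) 111-1111.")
ents = [(ent.text, ent.label_) for ent in doc.ents]
print(pipeline_count_after, ents)
```

The first print should show that the two rulers are different objects and that only the standalone one has a pattern, which is why the original code never produced a PHONE_NUMBER entity.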