I need help with the rule-based Matcher in spaCy. I have this code:
import spacy
from spacy.matcher import Matcher
nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
# Add match ID "HelloWorld" with no callback and two patterns
pattern1 = [{"LOWER": "hello"}, {"IS_PUNCT": True}, {"LOWER": "world"}]
# LOWER is compared against the lowercased token text, so the value must be lowercase
pattern2 = [{"LOWER": "good"}, {"IS_PUNCT": True}, {"LOWER": "night"}]
matcher.add("HelloWorld", [pattern1, pattern2])
doc = nlp("Hello, world! Hello world!")
matches = matcher(doc)
for match_id, start, end in matches:
    string_id = nlp.vocab.strings[match_id]  # Get string representation
    span = doc[start:end]  # The matched span
    print(match_id, string_id, start, end, span.text)
Everything works well: I get the match_id, string_id, and so on. But I'm wondering whether it's possible to get the pattern that produced the matched span.
In my example, the pattern corresponding to the match is:
[{"LOWER": "hello"}, {"IS_PUNCT": True}, {"LOWER": "world"}]
Thank you very much
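If multiple patterns are added under the same label, you can't find out which pattern matched after the fact: each match only gives you the label's match_id plus the start and end token offsets.
There are a couple of things you can do. The simplest is to use a different label for each pattern, so the label itself identifies the pattern. Another option is to use the EntityRuler, whose patterns can carry an "id" alongside the label.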
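Both options can be sketched roughly as follows. This is a minimal illustration, not the only way to do it: it assumes spaCy v3, uses a blank English pipeline instead of en_core_web_sm (the LOWER and IS_PUNCT attributes don't need a trained model), and the names `patterns`, `GoodNight` and `GREETING` are made up for the example.

```python
import spacy
from spacy.matcher import Matcher

# Assumption: a blank pipeline (tokenizer only) is enough for these attributes.
nlp = spacy.blank("en")

# Hypothetical label names: one label per pattern, kept in a dict for lookup.
patterns = {
    "HelloWorld": [{"LOWER": "hello"}, {"IS_PUNCT": True}, {"LOWER": "world"}],
    "GoodNight": [{"LOWER": "good"}, {"IS_PUNCT": True}, {"LOWER": "night"}],
}

# Option 1: register each pattern under its own Matcher label.
matcher = Matcher(nlp.vocab)
for label, pattern in patterns.items():
    matcher.add(label, [pattern])

doc = nlp("Hello, world! Good, night!")
matched_patterns = {}
for match_id, start, end in matcher(doc):
    label = nlp.vocab.strings[match_id]  # label string for this match
    span = doc[start:end]
    matched_patterns[span.text] = patterns[label]  # look the pattern up by label
    print(label, span.text, patterns[label])

# Option 2: EntityRuler patterns with an "id"; all share one entity label.
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns(
    [
        {"label": "GREETING", "pattern": pattern, "id": pattern_id}
        for pattern_id, pattern in patterns.items()
    ]
)

doc = nlp("Hello, world! Good, night!")
# In spaCy v3 the pattern "id" is available as ent_id_ on each entity span.
ent_ids = {ent.text: ent.ent_id_ for ent in doc.ents}
for ent in doc.ents:
    print(ent.label_, ent.ent_id_, ent.text)
```

With the first option the label doubles as the pattern identifier, so you need to keep your own mapping from label to pattern. With the second, the entity label can stay shared while the "id" tells the patterns apart, at the cost of producing entities rather than raw matches.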