python nlp spacy matcher named-entity-recognition

spaCy DependencyMatcher pattern not returning matches

I am trying to create a pattern, add it to spaCy's DependencyMatcher, and get matches from it.

I created a pattern for the sentence: "From Monday to Friday"

The full pattern:

pattern = [
    {
        "RIGHT_ID": "node0",
        "RIGHT_ATTRS": {'DEP': 'ROOT', 'POS': 'ADP', 'TAG': 'IN'}
    },
    {
        "LEFT_ID": "node0",
        "REL_OP": ">",
        "RIGHT_ID": "node1",
        "RIGHT_ATTRS": {'DEP': 'pobj', 'POS': 'PROPN', 'TAG': 'NNP'},
    },
    {
        "LEFT_ID": "node1",
        "REL_OP": "$--",
        "RIGHT_ID": "node2",
        "RIGHT_ATTRS": {'DEP': 'prep', 'POS': 'ADP', 'TAG': 'IN'},
    },
    {
        "LEFT_ID": "node2",
        "REL_OP": ">",
        "RIGHT_ID": "node3",
        "RIGHT_ATTRS": {'DEP': 'pobj', 'POS': 'PROPN', 'TAG': 'NNP'},
    },
    
]

The simpler pattern is:

pattern = [
    {
        "RIGHT_ID": "node0",
        "RIGHT_ATTRS": {"POS": "ADP"}
    },
    {
        "LEFT_ID": "node0",
        "REL_OP": ">",
        "RIGHT_ID": "node1",
        "RIGHT_ATTRS": {"POS": "PROPN"},
    },
    {
        "LEFT_ID": "node1",
        "REL_OP": "$--",
        "RIGHT_ID": "node2",
        "RIGHT_ATTRS": {"POS": "ADP"},
    },
    {
        "LEFT_ID": "node2",
        "REL_OP": ">",
        "RIGHT_ID": "node3",
        "RIGHT_ATTRS": {'POS': 'PROPN'},
    },
    
]

My question is: why does neither the full pattern nor the simpler one return any matches?

import spacy
from spacy.matcher import DependencyMatcher


nlp = spacy.load("en_core_web_sm")
matcher = DependencyMatcher(nlp.vocab)


text="From monday to friday"
doc = nlp(text)
matcher.add("pattern1", [pattern])

matches = matcher(doc)

# Each token_id corresponds to one pattern dict
match_id, token_ids = matches[0]

spaCy versions:

spaCy v3.0.6

NAME SPACY VERSION

en_core_web_sm >=3.0.0,<3.1.0 3.0.0 ✔

Solution

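Your REL_OP for node2 is backwards. `$--` looks for a sibling to the left of node1, but "to" comes after "Monday" in the sentence, so node2 is a right sibling of node1. It should be `$++`.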
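
To make the direction concrete, here is a small toy model of the two sibling operators in plain Python. It is only a sketch, not how spaCy implements them: the `heads` list is an assumed parse reconstructed from the DEP values in the question's full pattern, and the `siblings` helper is a made-up name for illustration.

```python
# Toy model of the sibling operators; this is not spaCy's implementation.
words = ["From", "Monday", "to", "Friday"]

# heads[i] is the index of token i's head; None marks the root.
# Assumed parse, taken from the DEP values in the question's full pattern:
# "Monday" and "to" both attach to "From", and "Friday" attaches to "to".
heads = [None, 0, 0, 2]


def siblings(a, op):
    """Return indices B such that `A op B` holds for anchor token A."""
    if heads[a] is None:  # the root has no siblings
        return []
    same_parent = [b for b in range(len(words)) if b != a and heads[b] == heads[a]]
    if op == "$++":  # B is a sibling to the right of A
        return [b for b in same_parent if b > a]
    if op == "$--":  # B is a sibling to the left of A
        return [b for b in same_parent if b < a]
    raise ValueError(f"unsupported operator: {op}")


monday = words.index("Monday")
print([words[b] for b in siblings(monday, "$--")])  # []
print([words[b] for b in siblings(monday, "$++")])  # ['to']
```

With "Monday" as the anchor (LEFT_ID), `$--` finds nothing because no sibling precedes it, while `$++` finds "to".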

For a complete example, this code works for me:

import spacy

from spacy.matcher import DependencyMatcher

nlp = spacy.load("en_core_web_sm")
matcher = DependencyMatcher(nlp.vocab)

text="From Monday to Friday"
doc = nlp(text)

pattern = [
    {
        "RIGHT_ID": "node0",
        "RIGHT_ATTRS": {'POS': 'ADP', 'TAG': 'IN'}
    },
    {
        "LEFT_ID": "node0",
        "REL_OP": ">",
        "RIGHT_ID": "node1",
        "RIGHT_ATTRS": {'POS': 'PROPN'},
    },
    {
        "LEFT_ID": "node1",
        "REL_OP": "$++",
        "RIGHT_ID": "node2",
        "RIGHT_ATTRS": {'POS': 'ADP'},
    },
    {
        "LEFT_ID": "node2",
        "REL_OP": ">",
        "RIGHT_ID": "node3",
        "RIGHT_ATTRS": {'POS': 'PROPN'},
    },
    
]

matcher.add("pattern1", [pattern])

matches = matcher(doc)
print(matches)

print("-----")
# this part is just for reference
for word in doc:
    print(word.pos_, word.tag_, word.dep_, word, sep="\t")

A couple of points about this:

Your second pattern is better: you shouldn't need to specify both tag and pos for English, because the tag determines the pos.
In the v3 small model, "monday" and "friday" are not tagged as proper nouns unless capitalized (it looks like your displaCy output is from the public demo, which uses v2).
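
One more thing about reading the result: the matcher returns a list of (match_id, token_ids) tuples, and that list is empty when nothing matches, so indexing `matches[0]` directly raises an IndexError. Below is a minimal sketch of guarding against that and labelling each matched token with its pattern node. `label_match` is a hypothetical helper, not part of spaCy, and the `matches` value is a stand-in with an assumed match_id and token order rather than real matcher output.

```python
# Hypothetical helper (not part of spaCy): label each matched token with the
# RIGHT_ID of the pattern node it satisfied. Relies on token_ids[i]
# corresponding to pattern[i].
def label_match(pattern, token_ids, words):
    return {node["RIGHT_ID"]: words[i] for node, i in zip(pattern, token_ids)}


# Only the RIGHT_ID keys matter for labelling, so the other keys are omitted.
pattern = [
    {"RIGHT_ID": "node0"},
    {"RIGHT_ID": "node1"},
    {"RIGHT_ID": "node2"},
    {"RIGHT_ID": "node3"},
]
words = ["From", "Monday", "to", "Friday"]

# Stand-in for the matcher's return value: a list of (match_id, token_ids).
# The match_id 0 and the token order are assumed for illustration.
matches = [(0, [0, 1, 2, 3])]

# An empty list means no match, so check before indexing matches[0].
if matches:
    match_id, token_ids = matches[0]
    labels = label_match(pattern, token_ids, words)
    print(labels)
else:
    labels = {}
    print("no matches")
```

With a real doc you would pass `[t.text for t in doc]` as `words`.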