I want to match a repeating pattern using spaCy's rule-based Matcher. This is the kind of text I want to match:
My account number is: 2893-26492-634-0924-63. Some more text here.
Basically, I am trying to match the equivalent of the following regex: \d+(-\d+)*
from spacy.matcher import Matcher

matcher = Matcher(nlp.vocab)
matcher.add('NUMBER_MERGE', None, [ {'IS_DIGIT': True}, {'IS_PUNCT': True}, {'IS_DIGIT': True}, {'IS_SPACE':True}])
This matches "342-234 Text", but fails for "342-234-958 Text".
I could not find any documentation on applying a quantifier to a group of token patterns rather than to a single token. Any help would be appreciated.
You can use the regex directly in a token pattern via the REGEX operator, which is applied to each token's text:
matcher.add('NUMBER_MERGE', None, [{"TEXT": {"REGEX": r"\d+(-\d+)*"}}])
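To check the regex on its own, independent of how spaCy tokenizes the text, here is a minimal sketch using only Python's standard `re` module on the example sentence from the question. The helper name `find_number_runs` is made up for illustration and is not part of spaCy.

```python
import re

# Same pattern as in the question, written as a raw string.
# The group is non-capturing: with a capturing group, re.findall would
# return only the group's contents instead of the whole match.
ACCOUNT_NUMBER = re.compile(r"\d+(?:-\d+)*")


def find_number_runs(text):
    """Return (start, end, matched_text) for each run of hyphen-joined digits.

    Hypothetical helper for illustration; start/end are character offsets.
    """
    return [(m.start(), m.end(), m.group()) for m in ACCOUNT_NUMBER.finditer(text)]


text = "My account number is: 2893-26492-634-0924-63. Some more text here."
runs = find_number_runs(text)
print(runs)
```

The offsets are character positions in the raw string. If spans are needed on a spaCy `Doc`, one option (an assumption, not something stated in the answer above) is to pass them to `Doc.char_span`, which returns `None` when the offsets do not line up with token boundaries.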