I'm trying to build a custom list of "named entities" using the entity_ruler, following the API documentation.
However, I'm facing a problem: can I build a named entity that references another one that is also defined in the entity_ruler?
As an example, let's say I want to define the entity Agreement as some fixed expressions, and the entity AgreementDate as an Agreement followed by another expression.
Does the following snippet set up spaCy correctly? The output is not what I was expecting.
import spacy

patterns = [
    {'label': 'Agreement', 'pattern': [{'LOWER': 'license agreement'}]},
    {'label': 'Agreement', 'pattern': [{'LOWER': 'agreement'}]},
    {'label': 'Agreement', 'pattern': [{'LOWER': 'commencement'}]},
    {'label': 'Agreement', 'pattern': [{'LOWER': 'parties'}]},
    {'label': 'AgreementDate', 'pattern': [{'ENT_TYPE': 'Agreement'}, {'LOWER': 'date'}]},
]
nlp = spacy.load('en_core_web_sm')
entity_ruler = nlp.add_pipe('entity_ruler', config={
    'validate': True,
    'overwrite_ents': True
})
entity_ruler.initialize(lambda: [], nlp=nlp, patterns=patterns)
for ent in nlp('''Commencement Date
license agreement date''').ents:
    print(f'{ent.text:40} {ent.label_:40}')
Commencement Agreement
agreement Agreement
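An entity ruler's patterns only match against annotations that were already set before that component started running, so the ENT_TYPE check in the last pattern cannot see the Agreement entities that the same ruler is about to add. You can still do this if you move the final pattern into a second entity ruler, added after the first one under a custom component name.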
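A minimal sketch of the two-ruler setup, assuming spaCy v3. Several details here are my own choices rather than part of the question: it uses spacy.blank('en') instead of en_core_web_sm so it runs without a model download, the component name agreement_date_ruler is made up, and the patterns are added with add_patterns. Two patterns are also adjusted. LOWER compares a single token, so 'license agreement' is split into two token specs, and 'OP': '+' is added so the date entity can cover a multi-token Agreement.

```python
import spacy

# Assumption: a blank English pipeline is enough to demonstrate the rulers;
# with en_core_web_sm the statistical NER would also run before them.
nlp = spacy.blank('en')

agreement_patterns = [
    # One dict per token: 'license agreement' is two tokens.
    {'label': 'Agreement', 'pattern': [{'LOWER': 'license'}, {'LOWER': 'agreement'}]},
    {'label': 'Agreement', 'pattern': [{'LOWER': 'agreement'}]},
    {'label': 'Agreement', 'pattern': [{'LOWER': 'commencement'}]},
    {'label': 'Agreement', 'pattern': [{'LOWER': 'parties'}]},
]
date_patterns = [
    # 'OP': '+' (my addition) lets the match span every token of the Agreement.
    {'label': 'AgreementDate',
     'pattern': [{'ENT_TYPE': 'Agreement', 'OP': '+'}, {'LOWER': 'date'}]},
]

config = {'validate': True, 'overwrite_ents': True}

# First ruler: sets the Agreement entities.
agreement_ruler = nlp.add_pipe('entity_ruler', config=config)
agreement_ruler.add_patterns(agreement_patterns)

# Second ruler: runs afterwards, so ENT_TYPE can see the Agreement labels.
# It needs its own name because 'entity_ruler' is already taken.
date_ruler = nlp.add_pipe('entity_ruler', name='agreement_date_ruler', config=config)
date_ruler.add_patterns(date_patterns)

doc = nlp('''Commencement Date
license agreement date''')
results = [(ent.text, ent.label_) for ent in doc.ents]
for text, label in results:
    print(f'{text:40} {label:40}')
```

Because overwrite_ents is enabled on the second ruler, an AgreementDate match replaces the Agreement entity it overlaps. If you want to keep both, one option is to write the second set of matches to doc.spans with a span_ruler instead.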