I have the following pandas DataFrame, df1. The Sentence column contains strings.
Sentence
Hello Xime
Work Yime
New Zime
I have trained an NLP model with spaCy, called nlp_mod. After training the model, I created a second column as follows:
df1["Col1"]=[nlp(i).ents for i in df1["Sentence"]]
Sentence Col1
Hello Xime (X,)
Work Yime (X,)
New Zime (X,)
The code above extracts the entities, but I am unable to get the entity labels when I try:
df1["lab"]=[ent.label_ for ents in df1["Col1"]]
I am getting the following error
'tuple' object has no attribute 'ent'
Could someone point out where my error is?
nlp(i).ents
returns a tuple of identified entities, so each cell of Col1 holds a tuple rather than a single entity. You have to loop through the entities inside each tuple to retrieve their properties:
[[(ent.label_) for ent in ents] for ents in df["Col1"]]
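To see why the inner loop is needed, here is a minimal sketch that runs without spaCy. FakeEnt is a hypothetical stand-in for an entity span (it is not part of spaCy), and col1 is made-up data shaped like the Col1 column, where every cell is a tuple of zero or more entities.

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy entity span; only the attributes used here.
FakeEnt = namedtuple("FakeEnt", ["text", "label_"])

# Each cell mimics the shape of nlp(sentence).ents: a tuple of entities.
col1 = [
    (),
    (FakeEnt("iPhone", "ORG"), FakeEnt("Apple", "ORG")),
    (FakeEnt("New Delhi", "GPE"), FakeEnt("India", "GPE")),
]

# Outer loop walks the rows, inner loop walks the entities within one row.
labels = [[ent.label_ for ent in ents] for ents in col1]
print(labels)

# Asking the tuple itself for label_ fails, because only its elements have it.
try:
    col1[1].label_
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

A row with no entities simply yields an empty list, so the resulting column keeps one list per sentence.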
Working Example:
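import pandas as pd
import spacy
df = pd.DataFrame({"Sentence": ["Hello Xime", "iPhone is an Apple Phone", "New Delhi is the capital of India"]})
nlp = spacy.load("en_core_web_sm")
# .ents gives a tuple of entity spans for each sentence
df["Col1"] = [nlp(i).ents for i in df["Sentence"]]
# the inner loop collects the label of every entity in a row
df["lab"] = [[ent.label_ for ent in ents] for ents in df["Col1"]]
print(df)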
Output:
   Sentence                           Col1                     lab
0  Hello Xime                         ()                       []
1  iPhone is an Apple Phone           ((iPhone), (Apple))      [ORG, ORG]
2  New Delhi is the capital of India  ((New, Delhi), (India))  [GPE, GPE]