
spaCy model for NLP in Python not yielding the entity label


I have the following pandas DataFrame, df1. The Sentence column contains strings.

       Sentence                   
       Hello Xime    
       Work  Yime    
       New   Zime   

I have trained an NLP model with spaCy, called nlp_mod. After training the model, I created a second column as follows:

       df1["Col1"] = [nlp(i).ents for i in df1["Sentence"]]

       Sentence      Col1
       Hello Xime    (X,)
       Work  Yime    (X,)
       New   Zime    (X,)

The code above extracts the entities, but I cannot get the entity labels when I try:

  df1["lab"]=[ent.label_ for ents   in df1["Col1"]]

I am getting the following error

 'tuple' object has no attribute 'ent'

Could someone point out where my error is?


Solution

  • nlp(i).ents returns a tuple of the entities found in the text, so each row of Col1 holds a tuple rather than a single entity. You have to loop over the entities inside each tuple to read their attributes, such as label_:

    [[(ent.label_) for ent in ents] for ents in df["Col1"]]
    

    Working Example:

    import pandas as pd
    import spacy

    df = pd.DataFrame({ 'Sentence': ["Hello Xime", "iPhone is an Apple Phone", "New Delhi is the capital of India"] })
    nlp = spacy.load("en_core_web_sm")
    # One tuple of entities per sentence
    df["Col1"] = [nlp(i).ents for i in df["Sentence"]]
    # One list of labels per sentence, one label per entity
    df["lab"] = [[ent.label_ for ent in ents] for ents in df["Col1"]]
    print(df)
    

    Output:

       Sentence                           Col1                     lab
    0  Hello Xime                         ()                       []
    1  iPhone is an Apple Phone           ((iPhone), (Apple))      [ORG, ORG]
    2  New Delhi is the capital of India  ((New, Delhi), (India))  [GPE, GPE]
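
    To see why the nested loop is needed without loading a model, the sketch below mimics the shape of the Col1 column with plain Python objects. FakeEnt is a hypothetical stand-in for a spaCy entity (it only carries text and label_) and is not part of spaCy; the rows are copied from the output above.

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy entity: only the two attributes used here.
FakeEnt = namedtuple("FakeEnt", ["text", "label_"])

# Each row holds a tuple of entities, like the Col1 column above.
col1 = [
    (),
    (FakeEnt("iPhone", "ORG"), FakeEnt("Apple", "ORG")),
    (FakeEnt("New Delhi", "GPE"), FakeEnt("India", "GPE")),
]

# Asking the row tuple itself for a label fails: the tuple has no label_ attribute.
try:
    col1[1].label_
    row_error = None
except AttributeError as exc:
    row_error = str(exc)

# Looping over the entities inside each row works.
labels = [[ent.label_ for ent in ents] for ents in col1]

# Keeping text and label together is often easier to inspect.
pairs = [[(ent.text, ent.label_) for ent in ents] for ents in col1]

print(row_error)
print(labels)
print(pairs)
```

    A row with no entities simply produces an empty list, so the result always has one entry per sentence and can be assigned back as a DataFrame column.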