python nlp text-extraction

Extract Nouns From Dataframe and Store them into another Row


I'm practicing NLP and have run into a problem. I have a dataset containing rows of sentences. POS tagging each row was easy. Now I want to extract the nouns from each row and store them in another column of the same row.

nouns = []
tags = data['Pos Tags']
for i in tags:
    for (word, tag) in i:
        if tag == 'NN':
            nouns.append(word)

Here is my code so far. After this, I don't know how to store the nouns in another column, each in its respective row.

[Here is the content of the dataset]


Solution

  • I suggest chaining the apply calls rather than writing separate functions, for the sake of readability.

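    import pandas as pd
    from nltk import pos_tag  # needs NLTK's POS tagger model, fetched via nltk.download

    df = pd.DataFrame({
        'Text': ['This is a simple test', 'This is another test sentence']
    })
    # Tokenize on whitespace, then POS-tag each token list
    df['POS Tagged Text'] = df['Text'].apply(lambda item: item.strip().split()).apply(pos_tag)
    # Keep only the words tagged 'NN' (singular common noun)
    df['Just Nouns Text'] = df['POS Tagged Text'].apply(lambda item: [w for w, t in item if t == 'NN'])
    print(df['Just Nouns Text'])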
    

    output:

    0              [test]
    1    [test, sentence]
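
  • If the frame already has a POS-tagged column, as in the question, the same idea can be applied to it directly. The comparison above keeps only the tag 'NN', so plural and proper nouns (NNS, NNP, NNPS in the Penn Treebank tag set) are dropped. The sketch below is one possible variant that matches every tag starting with 'NN'. The sample tags are written by hand to stand in for the asker's data, and the helper name extract_nouns and the column name 'Nouns' are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the asker's frame: 'Pos Tags' already holds
# lists of (word, tag) tuples, so NLTK is not needed for this step.
data = pd.DataFrame({
    'Pos Tags': [
        [('This', 'DT'), ('is', 'VBZ'), ('a', 'DT'), ('simple', 'JJ'), ('test', 'NN')],
        [('Dogs', 'NNS'), ('chase', 'VBP'), ('cats', 'NNS')],
    ]
})

def extract_nouns(tagged):
    """Return the words whose tag starts with 'NN' (NN, NNS, NNP, NNPS)."""
    return [word for word, tag in tagged if tag.startswith('NN')]

# apply runs once per row, so each row gets its own list of nouns
data['Nouns'] = data['Pos Tags'].apply(extract_nouns)
print(data['Nouns'])
```

    Because apply returns one result per row, the nouns stay aligned with the sentence they came from, which a single flat list built in a loop cannot do.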