I'm practicing NLP and have run into a problem. I have a dataset whose rows contain sentences. POS-tagging each row was easy. Now I want to extract the nouns from each row and store them in another column, in the corresponding row.
nouns = []
tags = data['Pos Tags']
for i in tags:
    for (word, tag) in i:
        if tag == 'NN':
            nouns.append(word)
That is my code so far. After this, I don't know how to store these nouns in another column, in their respective rows.
[Image: content of the dataset]
For readability, I suggest chaining the apply calls instead of writing separate functions.
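import pandas as pd
from nltk import pos_tag  # requires the NLTK tagger data to be downloaded

df = pd.DataFrame({
    'Text': ['This is a simple test', 'This is another test sentence']
})
# Split each sentence into tokens, then POS-tag the token list
df['POS Tagged Text'] = df['Text'].apply(lambda item: item.strip().split()).apply(pos_tag)
# Keep only the words tagged as singular common nouns (NN)
df['Just Nouns Text'] = df['POS Tagged Text'].apply(lambda item: [w for w, t in item if t == 'NN'])
print(df['Just Nouns Text'])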
Output:
0              [test]
1    [test, sentence]
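One caveat: comparing the tag to 'NN' only keeps singular common nouns. In the Penn Treebank tagset, which NLTK's default pos_tag uses for English, plural nouns are tagged NNS and proper nouns NNP or NNPS, so they would be dropped. Below is a minimal sketch of a broader filter. The helper name extract_nouns and the hand-written tagged rows are made up for illustration (they are not tagger output), and it assumes each row is a list of (word, tag) tuples like the 'Pos Tags' column in the question.

```python
# Penn Treebank noun tags: NN (singular), NNS (plural),
# NNP (proper singular), NNPS (proper plural).
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}


def extract_nouns(tagged_row):
    """Return the words in one POS-tagged row whose tag is a noun tag."""
    return [word for word, tag in tagged_row if tag in NOUN_TAGS]


# Hand-written tagged rows standing in for data['Pos Tags']
# (illustrative data, not produced by running a tagger).
rows = [
    [("This", "DT"), ("is", "VBZ"), ("a", "DT"), ("simple", "JJ"), ("test", "NN")],
    [("Dogs", "NNS"), ("chase", "VBP"), ("cats", "NNS"), ("in", "IN"), ("London", "NNP")],
]

# One list of nouns per row, in the same order as the input rows.
nouns_per_row = [extract_nouns(row) for row in rows]
print(nouns_per_row)  # [['test'], ['Dogs', 'cats', 'London']]
```

With the DataFrame from the question, the same helper should slot into apply, for example data['Nouns'] = data['Pos Tags'].apply(extract_nouns), where 'Nouns' is just a placeholder column name. Because apply returns one list per row, each list lands in the row it came from, which is what the flat nouns list in the question's loop could not do.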