I am trying to assign the printed output of a for-loop to a variable, parsed_generics.
This is the code and its printed output:
import spacy
nlp = spacy.load("en")
doc = nlp(generics)
for chunk in doc.noun_chunks:
    print(chunk.text, chunk.root.text, chunk.root.dep_, chunk.root.head.text)
Aerobics Aerobics nsubj is
a form form attr is
physical exercise exercise pobj of
rhythmic aerobic exercise exercise dobj combines
stretching and strength training routines routines pobj with
the goal goal pobj with
all elements elements dobj improving
...
To assign that to a variable, this is what I have written:
nlp = spacy.load("en")
doc = nlp(generics)
for chunk in doc.noun_chunks:
    parsed_generics = (chunk.text, chunk.root.text, chunk.root.dep_, chunk.root.head.text)
But when I call parsed_generics, this is what I get:
('predators', 'predators', 'pobj', 'of')
I guess what I am expecting is a list of tuples:
[('Aerobics', 'Aerobics', 'nsubj', 'is'), ('a form', 'form', 'attr', 'is'), ('physical exercise', 'exercise', 'pobj', 'of'), ...]
I guess I have to set up an empty list above my for-loop, iterate over doc, and append to the empty list, but append takes only 1 argument and I have 4 (chunk.text, chunk.root.text, chunk.root.dep_, chunk.root.head.text).
I eventually want to store this in a DataFrame.
Any advice or suggestions would be very much appreciated. Thank you in advance.
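You need to use append. You are overwriting parsed_generics on every iteration, so what you're seeing is only the tuple from the last iteration.
Append each iteration's tuple to a list, then use the list afterwards. Wrapping the four values in parentheses makes them a single tuple, so append still receives only one argument.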
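import spacy

result = []  # collects one tuple per noun chunk
nlp = spacy.load("en")
doc = nlp(generics)
for chunk in doc.noun_chunks:
    # The inner parentheses build one tuple, which is the single argument to append
    result.append((chunk.text, chunk.root.text, chunk.root.dep_, chunk.root.head.text))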
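Since you eventually want a DataFrame, a list of tuples like this can be passed straight to pandas. Here is a minimal sketch, assuming pandas is installed. The rows are copied from the expected output in your question so the snippet runs without spaCy, and the column names are only illustrative guesses, so rename them as you see fit.

```python
import pandas as pd

# Sample rows copied from the expected output in the question; in practice
# `result` would be the list built by the loop above.
result = [
    ("Aerobics", "Aerobics", "nsubj", "is"),
    ("a form", "form", "attr", "is"),
    ("physical exercise", "exercise", "pobj", "of"),
]

# Column names are assumptions made for illustration, not spaCy terminology.
columns = ["chunk", "root", "dep", "head"]

# Each tuple becomes one row; each tuple position becomes one column.
parsed_generics = pd.DataFrame(result, columns=columns)
print(parsed_generics)
```

Each tuple ends up as one row, so the DataFrame has as many rows as there are noun chunks.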