I have a dataframe with 1 column and 3 rows:
Text
0 Provided by Hindustan Times Wuhan Institute of...
1 Kattappa continues to narrate how he ended up ...
2 National Commercial Bank (NCB), Saudi Arabia’s...
I'm trying to summarize each of the 3 rows and store the results in a new column, like this:
   Text                                               Summarize
0  Provided by Hindustan Times Wuhan Institute of...  It's related to virus
1  Kattappa continues to narrate how he ended up ...  It's a movie story
2  National Commercial Bank (NCB), Saudi Arabia’s...  Article related to finance
I tried the following code:
for index, row in df.iterrows():
    chunks = generate_chunks(row['Text'])
    res = summarizer(chunks, max_length=1000, min_length=20)
    text = ' '.join([summ['summary_text'] for summ in res])
print(text)
But the output is
Article related to finance
Can anyone help me with this?
You overwrite the value of text at each iteration: it is set to "It's related to virus", then replaced by "It's a movie story" (the previous value is lost), and finally replaced by "Article related to finance", which is the only value left when the loop ends.
Instead of using a single string, use a list of strings and append to it at each iteration, like this:
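summaries = []  # one summary per row, in row order
for index, row in df.iterrows():
    chunks = generate_chunks(row['Text'])
    res = summarizer(chunks, max_length=1000, min_length=20)
    text = ' '.join([summ['summary_text'] for summ in res])
    summaries.append(text)  # keep this row's summary instead of overwriting it
print(summaries)
df['Summarize'] = summaries  # store the summaries as the new column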
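If it helps, here is a minimal, self-contained sketch of the same idea with the per-row work moved into a function. The generate_chunks and fake_summarizer functions below are hypothetical stand-ins (I don't know what yours look like), and the input strings are placeholders, so treat this as an illustration of collecting one result per row rather than as your actual pipeline.

```python
from typing import Callable, Dict, Iterable, List


def generate_chunks(text: str, max_words: int = 50) -> List[str]:
    """Hypothetical stand-in: split text into chunks of at most max_words words."""
    words = text.split()
    return [' '.join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def fake_summarizer(chunks: List[str], max_length: int = 1000,
                    min_length: int = 20) -> List[Dict[str, str]]:
    """Hypothetical stand-in for the real summarizer.

    It only keeps the first five words of each chunk and ignores the length
    arguments, but it returns the same shape the loop expects: a list of
    dicts with a 'summary_text' key.
    """
    return [{'summary_text': ' '.join(chunk.split()[:5])} for chunk in chunks]


def summarize_text(text: str, summarizer: Callable = fake_summarizer) -> str:
    """Summarize one text: chunk it, summarize each chunk, join the results."""
    chunks = generate_chunks(text)
    res = summarizer(chunks, max_length=1000, min_length=20)
    return ' '.join(summ['summary_text'] for summ in res)


def summarize_all(texts: Iterable[str], summarizer: Callable = fake_summarizer) -> List[str]:
    """Return one summary per input text, in the same order."""
    return [summarize_text(t, summarizer) for t in texts]


# Placeholder inputs standing in for the three rows of the 'Text' column.
texts = [
    "alpha beta gamma delta epsilon zeta",
    "one two three",
    "red orange yellow green blue indigo violet",
]
summaries = summarize_all(texts)
print(summaries)
```

With a function like summarize_text in place, the new column can also be built without an explicit loop, for example df['Summarize'] = df['Text'].apply(summarize_text), since apply calls the function once per row and keeps every result.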