df has two columns containing text. I want to turn each of them into a separate corpus.
df
id | Description 1               | Description 2     |
-----------------------------------------------------
1  | that book is good           | better than book2 |
2  | book 2 is not better than 1 | not good          |
.  | .                           | .                 |
.  | .                           | .                 |
.  | .                           | .                 |
Treat Description 1 as the document and Description 2 as the query.
Expected Output
Corpus 1: that book is good book 2 is not better than 1..................
Corpus 2: better than book2 not good.....................
You need to join all the rows in each column using the join function and then append the result to a list. The output is a list with one corpus string per column.
corpus = []
# Join the text columns only; astype(str) guards against non-string cells.
for col in ["Description 1", "Description 2"]:
    corpus.append(' '.join(df[col].astype(str)))
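For reference, here is a self-contained sketch of the same joining approach. It assumes pandas is installed, that the columns are named exactly "Description 1" and "Description 2" as in the question, and that id is an ordinary column to be left out. The sample frame holds only the two rows shown in the question, and the names text_columns, corpus_1 and corpus_2 are illustrative.

```python
import pandas as pd

# Sample data: only the two rows visible in the question (the rest is elided there).
df = pd.DataFrame({
    "id": [1, 2],
    "Description 1": ["that book is good", "book 2 is not better than 1"],
    "Description 2": ["better than book2", "not good"],
})

# Assumption: name the text columns explicitly so that "id" is not joined.
text_columns = ["Description 1", "Description 2"]

# Build one corpus string per column.
# fillna("") keeps missing cells from becoming the literal text "nan".
corpus = [" ".join(df[col].fillna("").astype(str)) for col in text_columns]

corpus_1, corpus_2 = corpus
print("Corpus 1:", corpus_1)
print("Corpus 2:", corpus_2)
```

Naming the columns avoids the TypeError that str.join raises when it reaches a non-string column such as id. If every column in the frame is text, looping over df.columns instead should work the same way.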