Tags: python, export-to-csv, text-mining, lda, topic-modeling

Merge several .txt files with multiple lines into one CSV file (1 line = 1 document) for topic modeling


I have 30 text files so far, each of which spans multiple lines. I want to apply an LDA model based on this tutorial. So the input should look like this:

text of document1
text of document2
text of document3 
.....
text of document30

But the whole text of a specific document has to be on one line.

I tried the approach from this post, but it keeps failing with: csv_output.writerow(row[1] for row in csv_text) IndexError: list index out of range. Any thoughts? I named the documents the same way as in the post and adjusted the range, of course.

Basically, I don't care whether we solve this with Python or not. I'm at the end of my nerves, so I really appreciate any help.


Solution

  • I'm not exactly sure what you are trying to accomplish, but to strip the newlines from each text file and collect the results in one big text file (one line per input file), something along these lines should work in a shell:

    for f in *.txt; do [ "$f" = output.txt ] && continue; NEW=$(tr '\n' ' ' < "$f"); printf '%s\n' "$NEW" >> output.txt; done
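Since the question is tagged python, here is a rough Python sketch of the same idea that writes a CSV with one row per document. The function names `merge_txt_to_csv` and `natural_key`, the UTF-8 encoding, and the output file name are my assumptions, not something from the question or the linked post. As for the IndexError: `row[1]` asks for the second column of a parsed row, so it fails on any line that `csv.reader` splits into fewer than two fields (an empty line or a line without a comma, for instance). That is probably what happens with plain text files, though I can't be sure without seeing the exact code.

```python
import csv
import re
from pathlib import Path


def natural_key(path):
    """Sort key so that document2.txt comes before document10.txt."""
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", path.name)]


def merge_txt_to_csv(src_dir, out_csv):
    """Write one CSV row per .txt file in src_dir; return the number of rows."""
    files = sorted(Path(src_dir).glob("*.txt"), key=natural_key)
    # newline="" lets the csv module control line endings itself.
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        for path in files:
            text = path.read_text(encoding="utf-8")  # assumed encoding
            # split() with no arguments breaks on any whitespace run,
            # so newlines, tabs and repeated spaces collapse to one space.
            writer.writerow([" ".join(text.split())])
    return len(files)
```

Calling `merge_txt_to_csv(".", "documents.csv")` from the folder that holds the text files should produce the single-column CSV. The csv module puts quotes around any document that contains a comma or a quote character, so if the tutorial reads the file as plain lines instead of parsing it as CSV, writing the joined text with an ordinary `fh.write(line + "\n")` may be the better fit.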