python text-mining tf-idf textblob

Exporting relevant words TF-IDF TextBlob python


I followed this tutorial to find the most relevant words in my documents. My code:

for i, blob in enumerate(bloblist):
    print(i + 1)
    scores = {word: tfidf(word, blob, bloblist) for word in blob.words}
    sorted_words = sorted(scores.items(), key=lambda x: x[1], reverse=True)
    for word, score in sorted_words[:10]:
        print("\t{}, score {}".format(word, round(score, 5)))

1
 k555ld-xx1014h, score 0.19706
 fuera, score 0.03111
 dentro, score 0.01258
 i5, score 0.0051
 1tb, score 0.00438
 sorprende, score 0.00358
 8gb, score 0.0031
 asus, score 0.00228
 ordenador, score 0.00171
 duro, score 0.00157 
2
 frentes, score 0.07007
 write, score 0.05733
 acceleration, score 0.05255
 aprovechando, score 0.05255
 . . . 

Here's my problem: I would like to export a data frame with the following information: the index and the top 10 words (separated by commas), something that I can save as a pandas DataFrame. Example:

TOPWORDS = pd.DataFrame(topwords.items(), columns=['ID', 'TAGS'])

Thank you all in advance.


Solution

  • Solved!

    Here's my solution, perhaps not the best but it works.

    tags = {}
    for i, blob in enumerate(bloblist):
        scores = {word: tfidf(word, blob, bloblist) for word in blob.words}
        sorted_words = sorted(scores.items(), key=lambda x: x[1], reverse=True)
        a = ""
        for word, score in sorted_words[:10]:
            a = a + ' ' + word
        tags[i + 1] = a
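
The dictionary above stops one step short of the table asked for in the question, and it joins the words with spaces rather than commas. Here is a minimal sketch of how the same idea could produce (ID, TAGS) rows with comma-separated words. It makes several assumptions: the `tf`, `idf` and `tfidf` helpers below are my guess at what the tutorial defines (they are not shown in the post), plain token lists stand in for `blob.words`, and `top_word_rows` is a hypothetical name.

```python
import math


def tf(word, words):
    # Term frequency: share of this document's tokens equal to `word`.
    return words.count(word) / len(words)


def n_containing(word, docs):
    # Number of documents in which `word` appears at least once.
    return sum(1 for words in docs if word in words)


def idf(word, docs):
    # Smoothed inverse document frequency (assumed formula, not from the post).
    return math.log(len(docs) / (1 + n_containing(word, docs)))


def tfidf(word, words, docs):
    return tf(word, words) * idf(word, docs)


def top_word_rows(docs, n=10):
    """Return one (ID, TAGS) tuple per document, IDs starting at 1."""
    rows = []
    for i, words in enumerate(docs, start=1):
        # dict.fromkeys drops duplicate tokens while keeping first-seen order.
        scores = {word: tfidf(word, words, docs) for word in dict.fromkeys(words)}
        sorted_words = sorted(scores.items(), key=lambda x: x[1], reverse=True)
        rows.append((i, ", ".join(word for word, score in sorted_words[:n])))
    return rows


# Toy token lists standing in for [blob.words for blob in bloblist].
docs = [
    "asus asus asus i5 i5 disco".split(),
    "write write acceleration disco".split(),
    "portatil portatil ligero disco".split(),
]
rows = top_word_rows(docs, n=2)
for row in rows:
    print(row)

# With pandas installed, the rows convert directly into the requested table:
#   import pandas as pd
#   TOPWORDS = pd.DataFrame(rows, columns=["ID", "TAGS"])
```

Building a list of tuples rather than a dict keeps the column order explicit and avoids passing `dict.items()` to the DataFrame constructor. With real TextBlob objects, passing `[blob.words for blob in bloblist]` as `docs` should behave the same way, since `blob.words` is list-like, but I have only checked the sketch against plain lists.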