python pandas dataframe defaultdict

Python word count (defaultdict): column name for the words not showing


import pandas as pd
from collections import defaultdict
word_name = []
y = 0

text_list = ['france', 'spain', 'spain beaches', 'france beaches', 'spain best beaches']

word_freq = defaultdict(int)

for text in text_list:
    for word in text.split():
        word_freq[word] += 1
        word_name.append(word)


df = pd.DataFrame.from_dict(word_freq, orient='index') \
.sort_values(0, ascending=False) \
.rename(columns={0: 'Word_freq'}) \
.rename(columns={0: 'Word'})

I tried multiple ways to convert this into a DataFrame, but it does not show a column name for the words. How can I set it?


Solution

  • Do you know of the Counter class from the collections library? You can simplify your code quite a bit by using it in place of defaultdict.

    import pandas as pd
    from collections import Counter

    text_list = ['france', 'spain', 'spain beaches', 'france beaches', 'spain best beaches']

    # Split every phrase into words and count each word
    counter_dict = Counter(word for text in text_list for word in text.split())
    # Counter({'spain': 3, 'beaches': 3, 'france': 2, 'best': 1})
    

    Then construct your DataFrame with from_dict, and name the index with rename_axis:

    df = pd.DataFrame.from_dict(
        counter_dict,
        orient="index",
        columns=["WordFreq"],
    ).rename_axis('Word')
    
             WordFreq
    Word             
    france          2
    spain           3
    beaches         3
    best            1
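
If you would rather keep the defaultdict from the question, the same idea should carry over. With orient='index' the words become the index rather than a column, and the second .rename(columns={0: 'Word'}) has no effect because no column named 0 exists by that point. A minimal sketch, assuming the same text_list as above; df_flat is a name made up here for illustration:

```python
import pandas as pd
from collections import defaultdict

text_list = ['france', 'spain', 'spain beaches', 'france beaches', 'spain best beaches']

# Count words exactly as in the question
word_freq = defaultdict(int)
for text in text_list:
    for word in text.split():
        word_freq[word] += 1

# The dict keys land in the index, so label the index instead of renaming a column
df = (
    pd.DataFrame.from_dict(word_freq, orient='index', columns=['Word_freq'])
    .rename_axis('Word')
    .sort_values('Word_freq', ascending=False)
)

# Optional: move the index into an ordinary column named 'Word'
df_flat = df.reset_index()
```

Use df if the words should stay as a labelled index, or df_flat if you want 'Word' as a regular column next to 'Word_freq'.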