import pandas as pd
from collections import defaultdict
word_name = []
y = 0
text_list = ['france', 'spain', 'spain beaches', 'france beaches', 'spain best beaches']
word_freq = defaultdict(int)
for text in text_list:
    for word in text.split():
        word_freq[word] += 1
        word_name.append(word)
df = pd.DataFrame.from_dict(word_freq, orient='index') \
    .sort_values(0, ascending=False) \
    .rename(columns={0: 'Word_freq'}) \
    .rename(columns={0: 'Word'})
I have tried several ways to convert this into a DataFrame, but the result never shows a column name for the words. How can I set one?
Do you know the Counter class from the collections library? You can simplify your code quite a bit by using it in place of defaultdict.
from collections import Counter
import pandas as pd
text_list = ['france', 'spain', 'spain beaches', 'france beaches', 'spain best beaches']
# Split every phrase into words and count each word.
counter_dict = Counter(split_word for text in text_list for split_word in text.split())
# Counter({'spain': 3, 'beaches': 3, 'france': 2, 'best': 1})
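Then construct your DataFrame with the from_dict constructor. The words end up in the index, so name the index with rename_axis rather than renaming a column.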
df = pd.DataFrame.from_dict(
    counter_dict,
    orient="index",
    columns=["WordFreq"],
).rename_axis('Word')
         WordFreq
Word
france          2
spain           3
beaches         3
best            1
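If you want Word as an ordinary column rather than a named index, and sorted by frequency as in your original attempt, something along these lines should work. This is only a sketch: the variable names counts, df and df2 are mine, and the ignore_index argument assumes pandas 1.0 or later.

```python
from collections import Counter

import pandas as pd

text_list = ['france', 'spain', 'spain beaches', 'france beaches', 'spain best beaches']
counts = Counter(word for text in text_list for word in text.split())

# Option 1: build a Series, name its index, then move the index into a column.
df = (
    pd.Series(counts, name='Word_freq')
    .rename_axis('Word')
    .reset_index()
    .sort_values('Word_freq', ascending=False, ignore_index=True)
)

# Option 2: most_common() already returns (word, count) pairs sorted by count,
# so the pairs can be passed straight to the DataFrame constructor.
df2 = pd.DataFrame(counts.most_common(), columns=['Word', 'Word_freq'])

print(df)
print(df2)
```

Both versions give a frame with two labelled columns, Word and Word_freq. In your original chain the second rename(columns={0: 'Word'}) has no effect, because column 0 was already renamed to Word_freq by the first one and the words live in the index, not in a column.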