I am trying to create a histogram of the most frequent words, but I only get errors when I run the code. I managed to find the 10 most common words, but I can't visualize them in a histogram.
description_list = df['description'].values.tolist()
from collections import Counter
Counter(" ".join(description_list).split()).most_common(10)
#histogram
plt.bar(x, y)
plt.title("10 most frequent tokens in description")
plt.ylabel("Frequency")
plt.xlabel("Words")
plt.show
It looks like this missed a few things:
- `Counter(...).most_common(10)` was not assigned to `x` or `y`
- `x`, `y` appear to be unbound
- `plt.show` was not invoked, so it either does nothing or prints something like `<function show at 0x...>`
Here's a reproducible example that fixes these:
from collections import Counter
import matplotlib.pyplot as plt
import pandas as pd
data = {
"description": [
"This is the first example",
"This is the second example",
"This is similar to the first two",
"This exists add more words"
]
}
df = pd.DataFrame(data)
description_list = df['description'].values.tolist()
# Assign the result of the Counter's `most_common` call to a variable:
word_frequency = Counter(" ".join(description_list).split()).most_common(10)
# `most_common` returns a list of (word, count) tuples
words = [word for word, _ in word_frequency]
counts = [count for _, count in word_frequency]
plt.bar(words, counts)
plt.title("10 most frequent tokens in description")
plt.ylabel("Frequency")
plt.xlabel("Words")
plt.show()
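With expected output: a bar chart with the ten words on the x-axis and their counts on the y-axis, where the tallest bar is "This" with a count of 4.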
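If it helps to check the numbers before plotting, the same word/count pairs can be printed as text. This is only a sketch using the sample sentences from the example above (a plain list stands in for the DataFrame column), and it splits the pairs with `zip(*...)` as an alternative to the two list comprehensions:

```python
from collections import Counter

# Same sample sentences as in the example above; a plain list is used
# here instead of the DataFrame column to keep the check dependency-free.
description_list = [
    "This is the first example",
    "This is the second example",
    "This is similar to the first two",
    "This exists add more words",
]

word_frequency = Counter(" ".join(description_list).split()).most_common(10)

# zip(*pairs) transposes [(word, count), ...] into two tuples:
# one of words and one of counts.
words, counts = zip(*word_frequency)

# Print each word next to its count as a quick sanity check.
for word, count in word_frequency:
    print(f"{word:<8} {count}")
```

Note that `zip(*word_frequency)` raises a `ValueError` on unpacking if the list is empty, so the list comprehensions are the safer choice when the column might contain no words. The resulting `words` and `counts` tuples can be passed to `plt.bar(words, counts)` just like the lists.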