I am looking to plot a word cloud from a column in my pandas DataFrame.
Here is my code:
all_words=''.join( [tweet for tweet in tweet_table['tokens'] ] )
word_Cloud=WordCloud(width=500, height=300, random_state=21, max_font_size=119).generate(all_words)
plt.imshow(word_Cloud, interpolation='bilinear')
The column tweet_table['tokens'] that I am looking to plot looks like this:
0 [da, trumpanzee, follower, blm, balance, wp, g...
1 [counting, blacklivesmatter, received, trainin...
2 [okay, like, little, kids, pretty, smart, know...
3 [thank, oscopelabs, got, mounted, loud, amp, p...
4 [bpi, proud, supported, hoops, 4l, f, e, see, ...
...
44713 [tomorrow, buy, charity, compilation, undergro...
44714 [needs, erected, state, capitol, think, darkfa...
44715 [clay, county, sheriffs, motto, screw, amp, in...
44716 [films, eleven, films, bravo, bad, ass, video,...
44717 [everybody, give, listen, blm]
My code above gives me the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-227-4066d6d1a153> in <module>
2 # REMOVE STOP WORDS
3
----> 4 all_words=''.join( [tweet for tweet in tweet_table['tokens'] ] )
TypeError: sequence item 0: expected str instance, list found
How can I fix this error, please? The column tweet_table['tokens'] is tokenized and already cleaned of stopwords.
Many thanks.
PS: When I use similar code for the column tweet_table['clean_text'], it works fine.
The column tweet_table['clean_text'] looks like this:
0 You have a da trumpanzee follower in ...
1 Over 279 and counting If BlackLivesMatte...
2 Okay but like little kids are pretty smart and...
3 Thank you oscopelabs got it mounted loud amp...
4 BPI is proud to have supported Hoops4L Y F E ...
...
44713 TOMORROW you can buy the charity compilation...
44714 That needs to be erected at the State Capi...
44715 Clay County Sheriffs Motto To Screw amp ...
44716 Films Eleven Films bravo Bad ass vid...
44717 everybody should give this a listen ...
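The difference between the two columns can be reproduced without pandas. The following is only a minimal sketch with made-up sample rows: the names clean_text_rows and token_rows are hypothetical stand-ins for the two columns. It shows that str.join accepts only strings, so rows of text join fine while rows of lists raise the same TypeError as in the traceback.

```python
# Hypothetical stand-ins for the two columns; iterating a pandas Series
# yields its values one by one, much like iterating these lists.
clean_text_rows = ["everybody should give this a listen", "thank you"]
token_rows = [["everybody", "give", "listen", "blm"], ["thank"]]

# Every item is a str, so join succeeds.
joined_text = ''.join(clean_text_rows)

# Every item is a list, so join raises TypeError.
try:
    ''.join(token_rows)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```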
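I just got it fixed. Each row of tweet_table['tokens'] is a list of words, so the rows have to be flattened into individual words and joined with a space:
allwords = ' '.join(word for tokens in tweet_table['tokens'] for word in tokens)
word_Cloud = WordCloud(width=500, height=300, random_state=21,
                       max_font_size=119).generate(allwords)
plt.imshow(word_Cloud, interpolation='bilinear')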
where tweet_table['tokens'] is free from any stopwords. Otherwise, we create a list of extra stopwords and pass it to WordCloud, as in the code below:
from wordcloud import WordCloud, STOPWORDS
import matplotlib.pyplot as plt
# add custom stopwords to the defaults shipped with wordcloud
stopwords_newlist = ["https", "co"] + list(STOPWORDS)
# flatten the token lists into one space-separated string
allwords = ' '.join(word for tokens in tweet_table['tokens'] for word in tokens)
word_Cloud = WordCloud(width=500, height=300, random_state=21, stopwords=stopwords_newlist,
                       max_font_size=119).generate(allwords)
plt.imshow(word_Cloud, interpolation='bilinear')
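For a column of token lists, another option is to count the tokens directly instead of building one big string. The following is a minimal sketch under stated assumptions, not a definitive implementation: token_rows and extra_stopwords are hypothetical sample data standing in for tweet_table['tokens'] and the stopword list above. The final plotting step is left as a comment and assumes the wordcloud package's generate_from_frequencies method, which takes a mapping of word to count. As far as I know, the stopwords argument is ignored on that path, so the sketch filters stopwords itself.

```python
from collections import Counter
from itertools import chain

# Hypothetical sample rows standing in for tweet_table['tokens'].
token_rows = [
    ["everybody", "give", "listen", "blm"],
    ["films", "eleven", "films", "bravo", "https", "co"],
]
# Hypothetical extra stopwords, mirroring the "https"/"co" example above.
extra_stopwords = {"https", "co"}

# Flatten the list of lists into a single list of tokens,
# dropping stopwords along the way.
tokens = [word for word in chain.from_iterable(token_rows)
          if word not in extra_stopwords]

# Count how often each token occurs.
frequencies = Counter(tokens)

# A space-separated string of the same tokens also works with generate().
allwords = ' '.join(tokens)

print(frequencies.most_common(3))

# With the wordcloud package installed, the counts could be plotted with:
# WordCloud(width=500, height=300).generate_from_frequencies(frequencies)
```

Building the text from the tokens themselves keeps every row. Calling str() on a whole pandas Series returns its printed summary instead, which is abbreviated for long columns.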