I tried to exclude them with a "'" but that failed. I'm not sure where they are coming from, as they are not in the document. Thanks for any help.
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
url = 'https://raw.githubusercontent.com/Imme21/WordCloud/main/StockData3.csv'
df = pd.read_csv(url, error_bad_lines=False)
df.dropna(inplace = True)
text = df['Stock'].values
wordcloud = WordCloud(background_color = 'white',
                      stopwords = ['Date','Stock', 'Tickers',
                                   'Open','Close', 'High',
                                   'Low', 'IV', 'under',
                                   'over', 'price', 'change',
                                   '%', 'null']).generate(str(text))
plt.imshow(wordcloud)
plt.axis("off")
plt.show()
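The problem is how you build the string from the values in the dataframe column, specifically text = df['Stock'].values followed by .generate(str(text)). Calling str() on the array returns its printed representation, quote characters included, so those quotes end up in the text passed to WordCloud even though they are not in the CSV.
Using pandas.Series.str.cat instead produces a plain string of the column values joined by a separator, which gives the desired outcome: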
...
>>> text = df['Stock'].str.cat(sep=' ')
...
>>> wordcloud = WordCloud(background_color = 'white',
                          stopwords = ['Date','Stock', 'Tickers',
                                       'Open','Close', 'High',
                                       'Low', 'IV', 'under',
                                       'over', 'price', 'change',
                                       '%', 'null']).generate(text)
...
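To see the difference without downloading the CSV, here is a minimal sketch that compares the two ways of building the string. The three ticker values are made up for illustration and are not taken from StockData3.csv; only the behaviour of str() versus str.cat is the point.

```python
import pandas as pd

# Hypothetical stand-in for the 'Stock' column (not the real data).
stocks = pd.Series(["AAPL", "MSFT", "TSLA"], name="Stock")

# str() on the underlying array returns its printed representation,
# which wraps each element in quote characters.
as_repr = str(stocks.values)

# Series.str.cat joins the raw values with the separator and adds nothing else.
as_joined = stocks.str.cat(sep=" ")

print(as_repr)    # contains quote characters around each ticker
print(as_joined)  # plain, space-separated tickers
```

Passing the second string to generate() means WordCloud only ever sees the words themselves.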