I have extracted a pandas dataframe where each row can belong to one of 24 clusters.
date cluster tweet_id id
0 2021-05-09 15:08:48 15 1391409828233351168 0
1 2021-05-09 07:29:08 7 1391294148200837122 1
2 2021-05-09 07:29:05 7 1391294136830005248 2
3 2021-05-09 07:28:02 7 1391293869799743489 3
4 2021-05-09 07:27:10 7 1391293650836017155 4
.. ... ... ... ...
195 2021-05-07 04:08:05 4 1390518778191089666 195
196 2021-05-07 04:07:57 4 1390518742715600898 196
197 2021-05-07 04:07:10 4 1390518546321575936 197
198 2021-05-07 04:06:58 4 1390518497097261058 198
199 2021-05-07 04:06:16 4 1390518318617006083 199
How can I group the data by the cluster each row belongs to, and how can I draw a scatter plot with the cluster (1, 2, 3, ..., 24) on the x-axis and id on the y-axis? I tried the following code, but it doesn't work:
y = df['id']
x = df['cluster']
df.plot.scatter(x=x, y=y)
plt.show()
It would be great if someone could help me out.
Thank you
Try this:
df.plot.scatter(x='cluster', y='id')
plt.show()
The x and y arguments are the names of the columns you want to plot. You were passing two pandas Series instead.
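For the grouping part of the question, here is a minimal sketch. It assumes the DataFrame is called df as in the question, and it rebuilds only the cluster and id columns from five of the rows shown there (the other columns are left out for brevity). The names counts and ids_per_cluster are just illustrative, and fixing the x ticks to 1..24 is my reading of the axis you described.

```python
import pandas as pd

# Five sample rows taken from the question; only the two columns used here.
df = pd.DataFrame({
    "cluster": [15, 7, 7, 4, 4],
    "id": [0, 1, 2, 195, 196],
})

# Group by cluster: how many rows fall into each cluster ...
counts = df.groupby("cluster").size()

# ... and which ids belong to each cluster.
ids_per_cluster = df.groupby("cluster")["id"].apply(list)

# Scatter plot: pass the column names, not the Series themselves.
ax = df.plot.scatter(x="cluster", y="id")

# Assumption: show every cluster label from 1 to 24 on the x-axis,
# including clusters that have no rows in this sample.
ax.set_xticks(list(range(1, 25)))
ax.set_xlim(0, 25)
```

Calling plt.show() afterwards (with matplotlib.pyplot imported as plt) displays the figure. The groupby results are ordinary pandas Series indexed by cluster, so counts.loc[7] gives the number of rows in cluster 7.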