Tags: python, pandas, matplotlib, pandas-groupby, frequency-distribution

Grouping by week and ID, averaging, grouping by week again and plotting


I have a pandas DataFrame of tweets called "labelled_data", which includes the columns 'tweep_username', 'tweetcreated_at' (a timestamp) and 'label'.

I want to group the rows by 'tweep_username' and by 'tweetcreated_at' (weekly) and take the mean of 'label' in each group. I then want to group these means by 'tweetcreated_at' (weekly) alone and plot a continuous frequency distribution of them.

In other words, I want a separate frequency distribution for each week, built from the per-user means of 'label' obtained in the first step.

I have tried this code:

labelled_data['tweetcreated_at'] = pd.to_datetime(labelled_data['tweetcreated_at'], errors='coerce')
s=labelled_data.groupby(['tweep_username',pd.Grouper(key='tweetcreated_at', freq='W')])['label'].mean()..set_index('tweetcreated_at').resample('W')

plt.hist(s)
plt.show()

and received the following error:

'Series' object has no attribute 'toordinal'

this photo shows the data


Solution

  • I solved it this way:

    labelled_data.groupby(['tweep_username',pd.Grouper(key='tweetcreated_at', freq='W')])['label'].mean().reset_index().groupby('tweetcreated_at')['label'].plot(kind='density', legend=True)
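For anyone who wants to try this without the original data, here is a minimal sketch of the same two steps on a small made-up frame. The sample values and the helper names `weekly_user_means` and `plot_weekly_densities` are assumptions for illustration, not part of the original post. The plotting helper loops over the weeks explicitly so that each curve gets its own legend entry. Note that `kind='density'` needs SciPy installed.

```python
import pandas as pd

# Synthetic stand-in for labelled_data (values are made up for illustration).
labelled_data = pd.DataFrame({
    "tweep_username": ["a", "a", "b", "b", "a", "b", "c", "c"],
    "tweetcreated_at": [
        "2021-01-04", "2021-01-05", "2021-01-06", "2021-01-07",
        "2021-01-12", "2021-01-13", "2021-01-14", "2021-01-15",
    ],
    "label": [1, 0, 1, 1, 0, 1, 0, 0],
})
labelled_data["tweetcreated_at"] = pd.to_datetime(
    labelled_data["tweetcreated_at"], errors="coerce"
)


def weekly_user_means(df):
    """Step 1: mean label per (user, week), returned as a flat DataFrame."""
    return (
        df.groupby(["tweep_username", pd.Grouper(key="tweetcreated_at", freq="W")])["label"]
        .mean()
        .reset_index()
    )


def plot_weekly_densities(means, ax=None):
    """Step 2: one density curve per week, drawn on shared axes (needs SciPy)."""
    for week, group in means.groupby("tweetcreated_at"):
        ax = group["label"].plot(
            kind="density", ax=ax, label=str(week.date()), legend=True
        )
    return ax


means = weekly_user_means(labelled_data)
print(means)
```

Calling `plot_weekly_densities(means)` followed by `plt.show()` (with `import matplotlib.pyplot as plt`) should then draw the curves. With `freq='W'` each group is labelled by the Sunday that ends its week, so the legend shows week-ending dates.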