I have a dataset:
game_id year
100 2020
100 2020
100 2020
100 2020
227 2022
227 2022
228 2023
228 2023
228 2023
...
300 2023
300 2023
301 2023
301 2023
301 2023
And I'd like to generate one histogram per year of the distribution of unique game_id values (so df['game_id'].value_counts()) using pandas 2.0.3.
I can do this manually using e.g. years = df.groupby('year') and then working with each year using years.get_group(2023).value_counts().hist(), but I feel like there should be a simple one-liner to pass the data to hist() in the correct shape to get a small multiples plot.
Assuming you want a histogram of the counts:
pd.crosstab(df['game_id'], df['year']).plot.hist(alpha=0.5)
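To see what the crosstab approach actually passes to the histogram, here is a minimal sketch on a small made-up sample (the rows are illustrative and only mimic the shape of the data in the question). One thing worth noticing: pd.crosstab fills in 0 for every game_id that does not occur in a given year, so those zeros are counted in each year's histogram too.

```python
import pandas as pd

# Illustrative sample only; not the asker's full dataset
df = pd.DataFrame({
    "game_id": [100] * 4 + [227] * 2 + [228] * 3 + [300] * 2 + [301] * 3,
    "year":    [2020] * 4 + [2022] * 2 + [2023] * 8,
})

# Rows are game_id, columns are year, cells are row counts.
# A game that never appears in a year gets 0 in that column.
table = pd.crosstab(df["game_id"], df["year"])
print(table)
```

If the zeros are unwanted, one option is to mask them before plotting, e.g. table.where(table > 0), since missing values are skipped by the histogram.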
For separate graphs, you can use seaborn.displot:
import seaborn as sns
sns.displot(data=df.value_counts().reset_index(name='count'),
            x='count', col='year', kind='hist')
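If you would rather stay within pandas, a similar small multiples layout can probably be had from DataFrame.hist with its column and by parameters. This is a sketch under assumptions: the sample data is made up, and plot_counts_by_year is a hypothetical helper name, not something from the question or from pandas.

```python
import pandas as pd

# Illustrative sample only; not the asker's full dataset
df = pd.DataFrame({
    "game_id": [100] * 4 + [227] * 2 + [228] * 3 + [300] * 2 + [301] * 3,
    "year":    [2020] * 4 + [2022] * 2 + [2023] * 8,
})

# Number of rows per (year, game_id) pair; only pairs that occur are kept,
# so no zero counts are introduced
counts = df.groupby(["year", "game_id"]).size().reset_index(name="count")

def plot_counts_by_year(counts):
    """Hypothetical helper: one histogram panel of 'count' per year.

    Requires matplotlib, which pandas uses for plotting.
    """
    return counts.hist(column="count", by="year", sharex=True)
```

Calling plot_counts_by_year(counts) should draw one panel per year. Unlike the crosstab version, each panel only contains games that actually occur in that year.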