pandas, bar-chart, histogram

Bar Chart of Total Counts for Unique Values in Pandas Dataframe


I want to total the counts for each category. I built an example that uses colors as the categories. The code below produces the result I want, but I suspect some package already has a built-in function that does the same thing. My approach loops over every row, so it is slow and won't scale. I don't mind optimizing it, but I would prefer an existing function.

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np


# Sum all values for each unique category.
def combine_like_categories(df):
    col = df['color']
    new_colors = col.unique()
    new_values = np.zeros_like(new_colors)
    new_df = pd.DataFrame(np.array([new_colors, new_values]).T)
    headers=['color', 'value']
    new_df.columns = headers

    for _, row in df.iterrows():
        new_df_index = new_df.loc[new_df['color']==row['color']].index[0]
        new_df.iloc[new_df_index, 1] += row['value']
    return new_df


data = [
    ['red', 3],
    ['blue', 2],
    ['green', 5],
    ['orange', 3],
    ['blue', 1],
    ['red', 7]
]

headers=['color', 'value']
df = pd.DataFrame(data)
df.columns = headers
hist_df = combine_like_categories(df)

plt.figure()
plt.bar(hist_df['color'], hist_df['value'])
plt.title('Counts of Each Color')
plt.xlabel('Color')
plt.ylabel('Counts')
plt.show()



Solution

  • Using groupby:

    ax = df.groupby('color')['value'].sum().plot.bar(rot=0)
    ax.set_title('Counts of Each Color')
    ax.set_ylabel('Counts')
    

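To check the aggregation on its own, without plotting, the grouped totals can be inspected directly. This is a minimal sketch that reuses the sample data from the question; the names `totals` and `totals_in_order` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
data = [
    ['red', 3],
    ['blue', 2],
    ['green', 5],
    ['orange', 3],
    ['blue', 1],
    ['red', 7],
]
df = pd.DataFrame(data, columns=['color', 'value'])

# Sum 'value' within each color. as_index=False keeps 'color' as a
# regular column, which mirrors the layout built by the question's loop.
totals = df.groupby('color', as_index=False)['value'].sum()
print(totals)  # blue 3, green 5, orange 3, red 10 (group keys sorted)

# sort=False keeps the colors in order of first appearance instead.
totals_in_order = df.groupby('color', as_index=False, sort=False)['value'].sum()
print(totals_in_order)  # red 10, blue 3, green 5, orange 3
```

By default `groupby` sorts the group keys, so the bars come out in alphabetical order. Passing `sort=False` should reproduce the order used by the original loop.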