Search code examples
pythonpandasaggregate

Counting Specific Values by Month


I have some data I want to count by month. The column I want count has three different possible values, each representing a different car sold. Here is an example of my dataframe:

Date                        Type_Car_Sold
2015-01-01 00:00:00                2
2015-01-01 00:00:00                1
2015-01-01 00:00:00                1
2015-01-01 00:00:00                3
...                               ...

I want to make it so I have a dataframe that counts each specific car type sold by month separately, so looking like this:

Month       Car_Type_1        Car_Type_2        Car_Type_3         Total_Cars_Sold
1               15                12                17                     44
2                9                18                20                     47
...             ...              ...                ...                    ...

How exactly would I go about doing this? I've tried doing:

cars_sold = car_data['Type_Car_Sold'].groupby(car_data.Date.dt.month).agg('count')

but that just sums up all the cars sold in the month, rather than breaking it down by the total amount of each type sold. Any thoughts?


Solution

  • Maybe not the cleanest solution, but this should get you pretty close

    import pandas as pd
    from datetime import datetime
    
    df = pd.DataFrame({
        "Date": [datetime(2022,1,1), datetime(2022,1,1), datetime(2022,2,1), datetime(2022,2,1)],
        "Type": [1, 2, 1, 1],
    })
    
    df['Date'] = df["Date"].dt.to_period('M')
    df['Value'] = 1
    
    print(pd.pivot_table(df, values='Value', index=['Date'], columns=['Type'], aggfunc='count'))
    
    Type       1    2
    Date             
    2022-01  1.0  1.0
    2022-02  2.0  NaN