Search code examples
pythonpandaspandas-groupbypython-datetime

How can I group by category and sum total sales by month - pandas


I have the following DataFrame:

test = {'Date': ['2021-01-01', '2021-01-15', '2021-01-02', '2021-01-14', '2021-01-05', '2021-01-07'],
        'Category': ['Fixed', 'Fixed', 'Mindful Spending', 'Mindful Spending', 'Subscription', 'Subscription'],
       'Amount': ['5', '5', '5', '5', '5', '5']}
example = pd.DataFrame(test)
example

enter image description here

My Date column data type is datetime64[ns], transformed using pd.to_datetime. I want to groupby Category and sum total amount spent by Date month. The result that I want is something like this:

test = {'Date': ['2021-01', '2021-01', '2021-01'], 
        'Category': ['Fixed', 'Mindful Spending', 'Subscription'],
       'Amount': ['10', '10', '10']}
result = pd.DataFrame(test)
result

enter image description here

How can I do this?

Thank you for all the help!


Solution

  • You can use pd.Grouper to define months to group by. Note that for this purpose the Date column needs to be your index.

    test = {'Date': ['2021-01-01', '2021-01-15', '2021-01-02', '2021-01-14', '2021-01-05', '2021-01-07'],
            'Category': ['Fixed', 'Fixed', 'Mindful Spending', 'Mindful Spending', 'Subscription', 'Subscription'],
           'Amount': [5, 5, 5, 5, 5, 5]}
    example = pd.DataFrame(test)
    example.Date = pd.DatetimeIndex(example.Date)
    
    example.set_index("Date").groupby([pd.Grouper(freq="M"), "Category"]).sum().reset_index()
    
    
    Out[2]: 
            Date          Category  Amount
    0 2021-01-31             Fixed      10
    1 2021-01-31  Mindful Spending      10
    2 2021-01-31      Subscription      10