Search code examples
pythonraggregate

Aggregating two values in a panel data with the same date and id but different type


I have been trying to work on this dataset that includes quantities for two types of sales (0,1) for different counties across different dates. Some dates, however, include both type 1 and type 0 sales. How can I merge type 1 and 0 sales for the same date and same id? The dataset has over 40k rows and I have no idea where to start. I was thinking about creating an if loop but I have no idea how to write it. It can be in python or R.

Essentially, I have a table that looks like this:

Date City Type Quantity
2020-01-01 Rio 1 10
2020-01-01 Rio 0 16
2020-03-01 Rio 0 23
2020-03-01 Rio 1 27
2020-05-01 Rio 1 29
2020-08-01 Rio 0 36
2020-01-01 Sao Paulo 0 50
2020-01-01 Sao Paulo 1 62
2020-03-01 Sao Paulo 0 30
2020-04-01 Sao Paulo 1 32
2020-05-01 Sao Paulo 0 65
2020-05-01 Sao Paulo 1 155

I want to combine, for example, Rio's quantities for both type 1 and 0 on 2020-01-01, as well as 2020-03-01, and the same thing for Sao Paulo and all subsequent counties. I want to aggregate types 1 and 0 quantities but still preserve the date and city columns.


Solution

  • Try something like this:

    import pandas as pd
    
    df = pd.read_csv('your_file_name.csv')
    df.pivot_table(values='Sales', index=['Date', 'City'], aggfunc='sum')