Search code examples
pythonpandassumpivotgrouping

Sum pivot value by condition python


I have a pandas data frame like:

Id a b c d 
x  1 1 1 01/01/2021
x  1 1 0 01/05/2021
y  1 1 1 02/01/2021
y  1 1 1 02/01/2021

Id a b c d 
x  2 1 1 01/01/2021
y  1 1 1 02/01/2021

I want to sum the values for the column a b and c for the same Id if and only if the date in column d is different. In fact, in these case in the 2 occurrences x has different dates so I can sum the values, instead d has the same values so I just not take the sum but the first occurrence.


Solution

  • dat[~dat[["d", "Id"]].duplicated()].groupby("Id", as_index=False).agg(
        {nm: "first" if nm in ["d", "Id"] else "sum" for nm in dat.columns}
    )
        Id  a   b   c   d
    0   x   2   2   1   01/01/2021
    1   y   1   1   1   02/01/2021