Add a default value while grouping in pandas


I am working with a pandas DataFrame whose columns are Date, Events and Count.

Date        Events     Count
1/1/2021    Event1      2
1/1/2021    Event2      1
2/1/2021    Event1      2
1/2/2021    Event1      1
2/1/2021    Event2      1
3/2/2021    Event3      2
3/2/2021    Event3      2

I want to group the events so that an event repeated within the same month appears only once, with the Count column set to a default value of 1. Dates are in day/month/year format.

Date        Events  Count
1/1/2021    Event1    1
1/1/2021    Event2    1
1/2/2021    Event1    1
3/2/2021    Event3    1

Solution

  • Use DataFrame.drop_duplicates on a helper month column together with Events, then set the Count column to 1 with assign:

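    # Parse day-first dates, so 1/2/2021 is read as 1 February 2021
    df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)

    # Helper column holding the year-month period of each date
    df['months'] = df['Date'].dt.to_period('M')

    # Keep the first row per (month, event) pair and set Count to 1
    df1 = df.drop_duplicates(['months', 'Events']).assign(Count=1)
    print(df1)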
            Date  Events  Count   months
    0 2021-01-01  Event1      1  2021-01
    1 2021-01-01  Event2      1  2021-01
    3 2021-02-01  Event1      1  2021-02
    5 2021-02-03  Event3      1  2021-02
    
    # Drop the helper column once it is no longer needed
    df1 = df1.drop('months', axis=1)
    print(df1)
            Date  Events  Count
    0 2021-01-01  Event1      1
    1 2021-01-01  Event2      1
    3 2021-02-01  Event1      1
    5 2021-02-03  Event3      1
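
Since the question asks about groupby, here is a minimal self-contained sketch that rebuilds the sample data and reaches the same rows by grouping on month and event instead of dropping duplicates. The names `month` and `result` are illustrative and not part of the original answer, and it assumes that keeping the first date in each group is what is wanted.

```python
import pandas as pd

# Sample data from the question (dates are day/month/year)
df = pd.DataFrame({
    "Date": ["1/1/2021", "1/1/2021", "2/1/2021", "1/2/2021",
             "2/1/2021", "3/2/2021", "3/2/2021"],
    "Events": ["Event1", "Event2", "Event1", "Event1",
               "Event2", "Event3", "Event3"],
    "Count": [2, 1, 2, 1, 1, 2, 2],
})
df["Date"] = pd.to_datetime(df["Date"], dayfirst=True)

# Year-month period used only as a grouping key (hypothetical helper name)
month = df["Date"].dt.to_period("M").rename("month")

result = (
    df.groupby([month, "Events"], sort=False)["Date"]
      .first()                   # earliest-listed date per (month, event)
      .reset_index()
      .drop(columns="month")     # the helper key is not needed in the output
      .assign(Count=1)           # default value for every remaining row
      [["Date", "Events", "Count"]]
)
print(result)
```

Unlike drop_duplicates, this version resets the index to 0..3 rather than keeping the original row labels; passing sort=False keeps the groups in order of first appearance.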