Whats the python equivalent for sumifs to solve the given problem

this is the dataframe

import pandas as pd
df = pd.DataFrame({'IDENTIFIER': ['A_xcxcxc', 'BA_bcbcbc', 'A_xcxcxc', 'A_xcxcxc', 'BA_bcbcbc', 'C_rgrg', 'BA_bcbcbc', 'D_wewerw', 'A_xcxcxc', 'A_xcxcxc'],
                   'income': [-30362100.0, 200000.0, -21248077.5, 150000.0, -33843389.2, 200000.0, -40229279.75, 250000.0, -22111384.6, 200000.0],
'Date' : ['03/03/2031', '22/01/2060', '04/03/2025', '22/07/2032', '08/03/2028', '22/11/2065', '05/04/2024', '22/03/2032', '15/10/2025', '22/07/2065']
})

I want to aggregate income for each of the identifier, but only if it falls before 1/1/2030. just to clarify if I do so in excel using sumifs I get this

I am assuming it can be done using groupby fuction, but not sure how to add the conditionality related to the date.

Solution

If you want the non-matching keys:

>>> (df.groupby('IDENTIFIER')
     .apply(lambda x:x.loc[
                      pd.to_datetime(x.Date).lt('2030-01-01'), 
                      'income'
                  ].sum(min_count=1))
     .fillna('-'))

IDENTIFIER
A_xcxcxc    -43359462.10
BA_bcbcbc   -74072668.95
C_rgrg                 -
D_wewerw               -

Without using apply:

>>> ( df['income']
        .where(pd.to_datetime(df.Date).lt('2030-01-01'))
        .groupby(df['IDENTIFIER']).sum(min_count=1).fillna('-') )

IDENTIFIER
A_xcxcxc    -43359462.10
BA_bcbcbc   -74072668.95
C_rgrg                 -
D_wewerw               -
Name: income, dtype: object

NOTE: If you want np.nan instead of -, remove the fillna('-') at the end.

Otherwise, if you only want the matching groups:

>>> df.groupby(df.loc[
         pd.to_datetime(df.Date).lt('2030-01-01'), 
         'IDENTIFIER'
    ])['income'].sum()

IDENTIFIER
A_xcxcxc    -43359462.10
BA_bcbcbc   -74072668.95
Name: income, dtype: float64