
Sum pandas DataFrame rows across columns whose names end with the same suffix


I have this DataFrame:

import pandas as pd
d = {'1_col': [1, 2], '2_col': [3, 4], 'var1': [5,6]}
df = pd.DataFrame(data=d)
print(df)

which looks like this:

   1_col  2_col  var1
0      1      3     5
1      2      4     6

I need to sum, row by row, all the columns whose names end in _col, so that the resulting DataFrame looks like this (the column called sum is the sum of the columns 1_col and 2_col):

   1_col  2_col  var1  sum
0      1      3     5    4
1      2      4     6    6

Is there a way in pandas to sum all the columns whose name ends with "_col" rather than doing it manually?


Solution

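  • Use df.filter to select the columns, then df.sum along axis 1 to add them up row by row. Note that like='_col' keeps every column whose name contains "_col" anywhere, not only at the end; for this DataFrame the two are equivalent:

    In [1209]: df['sum'] = df.filter(like='_col').sum(axis=1)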
    
    In [1210]: df
    Out[1210]: 
       1_col  2_col  var1  sum
    0      1      3     5    4
    1      2      4     6    6
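
  • If some column names might contain "_col" in the middle, a stricter match is probably safer. The sketch below shows two ways to require the suffix: df.filter with an anchored regex, and a boolean mask built with df.columns.str.endswith. The extra column 3_col_old is hypothetical and is not part of the question's data; it is there only to show where like= and the suffix match differ.

```python
import pandas as pd

# Data from the question, plus a hypothetical column '3_col_old'
# whose name contains "_col" but does not end with it.
df = pd.DataFrame({
    '1_col': [1, 2],
    '2_col': [3, 4],
    'var1': [5, 6],
    '3_col_old': [10, 20],  # assumption: added for illustration only
})

# Option 1: anchor the pattern so only names ending in "_col" match.
sum_regex = df.filter(regex=r'_col$').sum(axis=1)

# Option 2: build a boolean mask over the column labels.
mask = df.columns.str.endswith('_col')
sum_mask = df.loc[:, mask].sum(axis=1)

# For comparison: like= does a substring match, so it also includes '3_col_old'.
sum_like = df.filter(like='_col').sum(axis=1)

df['sum'] = sum_regex
print(df)
```

    Both suffix-based options add only 1_col and 2_col, while the like= version also includes 3_col_old in the total.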