How to sum all values from rows by string condition?
Original DF:
schema | numb |
---|---|
tbc_hh | 1751732 |
tbc_hh | 1084173 |
tbc_rr | 1751732 |
tbc_rr | 1084173 |
tbc_zz | 1751732 |
tbc_zz | 1084173 |
DF expect to get:
schema | numb |
---|---|
tbc_hh | 1751732 |
tbc_hh | 1084173 |
tbc_hh_total | 2835905 |
tbc_rr | 1751733 |
tbc_rr | 1084174 |
tbc_rr_sum | 2835907 |
tbc_zz | 1751734 |
tbc_zz | 1084175 |
tbc_zz_sum | 2835909 |
You can try
out = (df.groupby('schema', as_index=False)
.apply(lambda g: pd.concat([g,
pd.DataFrame([[g.name+'_total', g['numb'].sum()]],
columns=g.columns)]))
.reset_index(drop=True))
print(out)
schema numb
0 tbc_hh 1751732
1 tbc_hh 1084173
2 tbc_hh_total 2835905
3 tbc_rr 1751732
4 tbc_rr 1084173
5 tbc_rr_total 2835905
6 tbc_zz 1751732
7 tbc_zz 1084173
8 tbc_zz_total 2835905