I have a dataframe and would like to add sums of specific rows into this dataframe. So for example I have
df = pd.DataFrame({'prod':['a','a','a','b','b','b','c','c','c'], 'attribute':['x','y','z','x','y','z','x','y','z'],
'number1':[1,2,2,3,4,3,5,1,1], 'number2':[10,2,3,3,1,2,3,1,1], 'number3':[1,4,3,5,7,1,3,0,1]})
How can I add for each prod a, b and c the sum of number 1/2/3 of the attributes y and z as a new row? So it looks like this
prod attribute number1 number2 number3
0 a x 1 10 1
1 a y 2 2 4
2 a z 2 3 3
3 a sum_yz 4 5 7
4 b x 3 3 5
5 b y 4 1 7
6 b z 3 2 1
7 b sum_yz 7 3 8
8 c x 5 3 3
9 c y 1 1 0
10 c z 1 1 1
11 c sum_yz 2 2 1
You could make a separate DataFrane and append it back to the original DataFrame, something like this (this code is untested):
# Filter to the desired attributes
sum_yz = df[df['attribute'].isin(['y', 'z'])]
# Set the new 'attribute' value
sum_yz['attribute'] = 'sum_yz'
# Group by and sum
sum_yz = sum_yz.groupby(['prod', 'attribute']).sum().reset_index()
# Add it the end of the data frame
df = pd.concat([df, sum_yz])