I have a dataframe as follows:
Group Val1 Val2 Val3 Val4
A -94 96 16 -92
B 30 59 -10 44
C 50 -18 -30 24
D 61 49 -15 -95
I need to find the sum of all positive values in each group except the highest value to get the following:
Group Sum
A 16
B 74
C 24
D 49
for group A, I ignored 96 and only 16 was used to get sum of 16.
for group B, I ignored 59 and 30, 44 was used to get sum of 74
If there is only 1 positive value, we keep them we don't drop any values.
I am not sure how to do the above.
Try this:
df = df.set_index('Group')
df.where(df.ne(df.max(axis=1), axis=0) & (df > 0)).sum(1)
Output:
Group
A 16.0
B 74.0
C 24.0
D 49.0
dtype: float64
Details:
Find max on each row, look for values in that row that are not equal to that max AND values greater than zero then sum with axis=1.
Or..
df.mask(df<0).sum(1) - df.max(1)
Output:
Group
A 16.0
B 74.0
C 24.0
D 49.0
dtype: float64