In my dataframe I want to sum certain rows in a column and output them in a new column 'UE_more_days'
is
ATEXT BEGUZ_UE UE_more_days
0 11.00 0.0
1 CT 23.00 Nan
2 RT 33.00 46.0
3 15.00 0.0
3 15.00 0.0
4 12.75 0.0
5 19.75 0.0
6 14.75 0.0
7 CT 23.00 29.5
8 CT 24.00 46.0
9 CT 24.00 48.0
10 RT 33.00 48.0
11 15.00 0.0
12
etc
should be
ATEXT BEGUZ_UE UE_more_days
0
1 CT 23.00
2 RT 33.00 56.0
3 15.00
4 12.75
5 19.75
6 14.75
7 CT 23.00
8 CT 24.00
9 CT 24.00
10 RT 33.00 104.0
11 15.00
12
etc
should be 2
ATEXT BEGUZ_UE subtract add UE_more_days is_m_days
0 11.00 *0.00* *3.92*
1 CT *23.00* 0.00 0.00
2 RT *33.00* 0.00 0.00 56.0
3 *15.00* 0.20 0.00 *74.92*
4 12.75
5 19.75
6 14.75 *2.00* *0.00*
7 CT *23.00*
8 CT *24.00*
9 CT *24.00*
10 TT *33.00* 104.0
11 *15.00* 0.00 3.57 *117.00*
12
etc
my last try
bedd2 = [(df['ATEXT'] != ''),]
result2 = [(df.iloc[0:]['BEGUZ_UE'].astype(float).reset_index(drop=True) +
df.iloc[1:]['BEGUZ_UE'].astype(float)).round(decimals=2).shift(1)]
df['min_UE_mehr_Tage'] = np.select(bedd2, result2)
How can I sum rows from a column based on a condition and output them in a new column?
Replace with Nan the values in BEGUZ_UE where ATEXT is equal to "", and after that create groups based on the same condition.
m = df['ATEXT'].eq("")
cond = (~m) & m.shift(-1)
df['UE_more_days'] = (df['BEGUZ_UE'].mask(m)
.groupby(m.cumsum()).cumsum()
.where(cond)
)
tmv = (df[['subtract', 'add']]
.shift()
.groupby(m.cumsum())
.transform('max')
.eval('add-subtract')
)
df['m_days'] = (df.groupby(m[::-1].cumsum())['BEGUZ_UE']
.transform('sum')
.add(tmv)
.where(cond)
.shift()
)
End result:
ATEXT BEGUZ_UE subtract add UE_more_days m_days
0 11.00 0.0 3.92 NaN NaN
1 CT 23.00 0.0 0.00 NaN NaN
2 RT 33.00 0.0 0.00 56.0 NaN
3 15.00 0.2 0.00 NaN 74.92
4 12.75 NaN NaN NaN NaN
5 19.75 NaN NaN NaN NaN
6 14.75 2.0 0.00 NaN NaN
7 CT 23.00 NaN NaN NaN NaN
8 CT 24.00 NaN NaN NaN NaN
9 CT 24.00 NaN NaN NaN NaN
10 TT 33.00 NaN NaN 104.0 NaN
11 15.00 0.0 3.57 NaN 117.00
Dataset I used:
data = {'ATEXT': ['', 'CT', 'RT', '', '', '', '', 'CT', 'CT', 'CT', 'TT', ''],
'BEGUZ_UE': [11.0, 23.0, 33.0, 15.0, 12.75, 19.75, 14.75, 23.0,
24.0, 24.0, 33.0, 15.0],
'subtract': [0.0, 0.0, 0.0, 0.2, np.nan, np.nan, 2.0, np.nan,
np.nan, np.nan, np.nan, 0.0],
'add': [3.92, 0.0, 0.0, 0.0, np.nan, np.nan, 0.0, np.nan, np.nan,
np.nan, np.nan, 3.57],
'UE_more_days': [np.nan, np.nan, 56.0, np.nan, np.nan, np.nan, np.nan,
np.nan, np.nan, np.nan, 104.0, np.nan]}