Update: I mixed up the axis. Thats why my question seems weired. This is the new input data:
give following data:
import pandas as pd
data = {'Org': ['Tom', 'Kelly', 'Rick', 'Dave','Sara','Liz'],
'A': ['NaN', 1, 1, 1, 'NaN', 'NaN'],
'B': [1, 1, 1, 1, 'NaN', 1],
'C': [1, 1, 1, 1, 1, 1],
'D': ['NaN', 'NaN', 1, 'NaN', 1, 'NaN'],
'E': [1, 1, 1, 1, 'NaN', 1],
'F': ['NaN', 1, 1, 1, 'NaN', 1]}
df = pd.DataFrame(data)
I want to sum the columns except the first two and then replace the values not NaN with the sum the column results:
the result should like this:
data = {'Org': ['Tom', 'Kelly', 'Rick', 'Dave','Sara','Liz'],
'A': ['NaN', 1, 1, 1, 'NaN', 'NaN'],
'B': [5, 5, 5, 5, 'NaN', 5],
'C': [6, 6, 6, 6, 6, 6],
'D': ['NaN', 'NaN', 2, 'NaN', 2, 'NaN'],
'E': [5, 5, 5, 5, 'Nan', 5],
'F': ['NaN',4, 4, 4, 'NaN', 4]}
df = pd.DataFrame(data)
I tried:
column_sums = df.iloc[:, 2:].sum()
for column in iloc[:, 2:].columns:
df[column] = column_sums[column]
But that replaces me all values.
Is there a smooth solution for that possible?
Thanks
You can consider broadcasting a multiplication in numpy.
import pandas as pd
import numpy as np
data = {'Org': ['Tom', 'Kelly', 'Rick', 'Dave','Sara','Liz'],
'A': ['NaN', 1, 1, 1, 'NaN', 'NaN'],
'B': [1, 1, 1, 1, 'NaN', 1],
'C': [1, 1, 1, 1, 1, 1],
'D': ['NaN', 'NaN', 1, 'NaN', 1, 'NaN'],
'E': [1, 1, 1, 1, 'NaN', 1],
'F': ['NaN', 1, 1, 1, 'NaN', 1]}
df = pd.DataFrame(data)
df = df.replace('NaN', np.nan)
Here we want NaN values to be NaN that's why we replace 0 with np.nan
m = df[df.columns[2:]].notnull()\
.astype(float).replace(0, np.nan)
print(m)
B C D E F
0 1.0 1.0 NaN 1.0 NaN
1 1.0 1.0 NaN 1.0 1.0
2 1.0 1.0 1.0 1.0 1.0
3 1.0 1.0 NaN 1.0 1.0
4 NaN 1.0 1.0 NaN NaN
5 1.0 1.0 NaN 1.0 1.0
s = df[df.columns[2:]].sum()
print(s)
B 5.0
C 6.0
D 2.0
E 5.0
F 4.0
dtype: float64
df[df.columns[2:]] = m * s
print(df)
Org A B C D E F
0 Tom NaN 5.0 6.0 NaN 5.0 NaN
1 Kelly 1.0 5.0 6.0 NaN 5.0 4.0
2 Rick 1.0 5.0 6.0 2.0 5.0 4.0
3 Dave 1.0 5.0 6.0 NaN 5.0 4.0
4 Sara NaN NaN 6.0 2.0 NaN NaN
5 Liz NaN 5.0 6.0 NaN 5.0 4.0