I have a table like this
src | dst | amount
------------------
a | c | 100
b | c | 50
a | b | 40
a | nan | 10
nan | nan | 100
and i want a table like this, where I group by src and sum the amount sent, and if the src is nan is want to replace with "unknown".
entity | sent
-------------
a | 150
b | 50
uknown | 100
If nan
is missing value replace it before aggregation:
df = (df.fillna({'src': 'unknown'})
.groupby('src')['amount']
.sum()
.rename_axis('entity')
.reset_index(name='sent'))
print (df)
entity sent
0 a 150
1 b 50
2 unknown 100
Or if 'nan'
is string use:
df = (df.replace({'src': {'nan': 'unknown'}})
.groupby('src')['amount']
.sum()
.rename_axis('entity')
.reset_index(name='sent'))
print (df)
entity sent
0 a 150
1 b 50
2 unknown 100
Another ideas:
df.groupby(df['src'].fillna('unknown').rename('entity'))['sent'].sum().reset_index()
df.groupby(df['src'].replace('nan','unknown').rename('entity'))['sent'].sum().reset_index()