I am aware similar questions have been asked before (e.g. How to merge two rows in a dataframe pandas), but I am still struggling to do the following on a pandas DataFrame with many rows:
team_token day1 day2 day3 day4
0 abc 1 NaN NaN NaN
1 abc NaN 1 NaN NaN
2 abc NaN NaN NaN NaN
3 abc NaN NaN NaN 1
I want to combine the rows with the same team_token so that the end result looks like:
team_token day1 day2 day3 day4
0 abc 1 1 NaN 1
Thank you in advance.
Use combine_first on the two Series given by the two rows of your dataframe:
import pandas as pd
df = pd.DataFrame({'team_token':['abc', 'abc'], 'day1': [1, None], 'day2' : [None, 1]})
df.loc[0].combine_first(df.loc[1])
gives:
team_token day1 day2
0 abc 1 1
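Since the question has more than two rows, the same idea can be folded over all of them. This is a minimal sketch, assuming the four-row frame from the question; the names rows and combined are only illustrative, and functools.reduce is my choice here, not something the question uses:

```python
from functools import reduce

import numpy as np
import pandas as pd

# Assumed reconstruction of the four-row frame from the question.
df = pd.DataFrame({
    'team_token': ['abc'] * 4,
    'day1': [1, np.nan, np.nan, np.nan],
    'day2': [np.nan, 1, np.nan, np.nan],
    'day3': [np.nan] * 4,
    'day4': [np.nan, np.nan, np.nan, 1],
})

# Each row becomes a Series indexed by column name.
rows = [row for _, row in df.iterrows()]

# Fold combine_first over the rows: at each step, NaN entries in the
# accumulated Series are filled from the next row.
combined = reduce(lambda acc, row: acc.combine_first(row), rows)
print(combined)
```

The result is a single Series, not a DataFrame, and this loops over rows in Python, so it is likely to be slow on large frames.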
#EDIT: A better solution that also works with your updated question is simply:
df.max()
assuming that your day columns contain only NaN or ones.
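As a quick check on the four-row example from the question, here is a sketch of what df.max() does. The frame is my reconstruction of the question's data, and the to_frame().T step is only one possible way to get back a one-row DataFrame:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the four-row frame from the question.
df = pd.DataFrame({
    'team_token': ['abc'] * 4,
    'day1': [1, np.nan, np.nan, np.nan],
    'day2': [np.nan, 1, np.nan, np.nan],
    'day3': [np.nan] * 4,
    'day4': [np.nan, np.nan, np.nan, 1],
})

# max() skips NaN by default, so each day column collapses to 1
# if any row has a 1, and stays NaN if every row is NaN.
result = df.max()

# Optional: turn the resulting Series back into a one-row DataFrame.
one_row = result.to_frame().T
print(one_row)
```

Note that this relies on every row sharing the same team_token; with several tokens the maximum would be taken across all of them.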
In case you have different team_tokens:
day1 day2 day3 team_token
0 1.0 NaN NaN abc
1 NaN NaN NaN abc
2 1.0 1.0 NaN abc
3 NaN NaN NaN abc
4 1.0 NaN 1.0 ddd
5 1.0 NaN 1.0 ddd
6 NaN NaN 1.0 ddd
you can use groupby and still take the max of each group:
df.groupby('team_token').max().reset_index()
# team_token day1 day2 day3
#0 abc 1.0 1.0 NaN
#1 ddd 1.0 NaN 1.0
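To make the grouped version reproducible, here is a sketch that builds the seven-row frame shown above and runs the same groupby. It also tries GroupBy.first(), which takes the first non-null value per column in each group; that is an alternative I am suggesting, not part of the original answer, and it may suit day columns that hold values other than ones:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the seven-row example frame above.
df = pd.DataFrame({
    'day1': [1, np.nan, 1, np.nan, 1, 1, np.nan],
    'day2': [np.nan, np.nan, 1, np.nan, np.nan, np.nan, np.nan],
    'day3': [np.nan, np.nan, np.nan, np.nan, 1, 1, 1],
    'team_token': ['abc'] * 4 + ['ddd'] * 3,
})

# One row per team_token; NaN is skipped when taking the maximum.
merged_max = df.groupby('team_token').max().reset_index()

# Alternative: first non-null value per column within each group.
merged_first = df.groupby('team_token').first().reset_index()

print(merged_max)
```

On this data both calls give the same frame. They differ only when a group holds several distinct non-null values in one column: max() keeps the largest, first() keeps the earliest.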