I have a piece of code that should get the first value of the team column within each ID group and store it in a dictionary (first_team). What I am seeing instead is that it only picks up the first non-null value, so rows where team is NaN are skipped.
Here is a sample dataset:
ID date name team first_team
101 05/2012 James NaN NY
101 07/2012 James NY NY
102 06/2013 Adams NC NC
102 05/2014 Adams AL NC
The code I have is:
first_dict = df.groupby('ID').agg({'team':'first'}).to_dict()['team']
df['first_team'] = df['ID'].apply(lambda x: first_dict[x])
Desired output:
ID date name team first_team
101 05/2012 James NaN NaN
101 07/2012 James NY NaN
102 06/2013 Adams NC NC
102 05/2014 Adams AL NC
If you want to keep the first entry of each ID even when it is NaN, you can use drop_duplicates:
first_dict = df.drop_duplicates('ID')[['ID','team']].set_index('ID')['team']
df['first_team'] = df['ID'].map(first_dict)
Output:
ID date name team first_team
0 101 05/2012 James NaN NaN
1 101 07/2012 James NY NaN
2 102 06/2013 Adams NC NC
3 102 05/2014 Adams AL NC
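To see the difference on the sample data, here is a minimal self-contained sketch. The DataFrame construction and the names skips_nan and first_row are illustrative and not part of the original post.

```python
import numpy as np
import pandas as pd

# Rebuild the sample dataset from the question (illustrative construction).
df = pd.DataFrame({
    "ID": [101, 101, 102, 102],
    "date": ["05/2012", "07/2012", "06/2013", "05/2014"],
    "name": ["James", "James", "Adams", "Adams"],
    "team": [np.nan, "NY", "NC", "AL"],
})

# GroupBy 'first' skips missing values, so ID 101 resolves to 'NY'.
skips_nan = df.groupby("ID")["team"].first()

# drop_duplicates keeps the first row per ID as-is, NaN included.
first_row = df.drop_duplicates("ID").set_index("ID")["team"]

# Map each ID to the team found on its first row.
df["first_team"] = df["ID"].map(first_row)
print(df)
```

The lookup here is a Series indexed by ID rather than a dict; map accepts either, so calling to_dict() is optional.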
Note: for future reference, your original code (which skips NaN) can be written more simply with transform:
df['first_team'] = df.groupby('ID')['team'].transform('first')
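Keep in mind that transform('first') skips NaN just like agg 'first' does, so it reproduces the original behaviour rather than the desired output. As a hedged alternative, one way to keep the transform style while taking the first row positionally is a lambda with iloc[0]. This is a sketch on the sample data, not the only option, and the column names via_first and via_iloc are made up for the comparison.

```python
import numpy as np
import pandas as pd

# Sample data from the question (illustrative construction).
df = pd.DataFrame({
    "ID": [101, 101, 102, 102],
    "team": [np.nan, "NY", "NC", "AL"],
})

# 'first' ignores NaN: ID 101 gets 'NY'.
df["via_first"] = df.groupby("ID")["team"].transform("first")

# iloc[0] takes the first row of each group positionally, NaN included.
df["via_iloc"] = df.groupby("ID")["team"].transform(lambda s: s.iloc[0])
print(df)
```

The lambda version is slower on large frames than the built-in 'first', so the drop_duplicates plus map approach may be preferable there.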