python python-3.x pandas dataframe group-by

Getting first row by ID in Python

I have a piece of code that should get the first value of the team column for each ID, store it in a dictionary, and use that to fill first_team. What I am seeing instead is that it only picks up the first non-NaN value, skipping rows where team is NaN.

Here is a sample dataset

 ID     date           name       team       first_team
 101   05/2012         James      NaN            NY
 101   07/2012         James      NY             NY
 102   06/2013         Adams      NC             NC
 102   05/2014         Adams      AL             NC

The code I have is:

first_dict = df.groupby('ID').agg({'team':'first'}).to_dict()['team']
df['first_team'] = df['ID'].apply(lambda x: first_dict[x])

Desired output:

  ID      date        name      team         first_team 
  101     05/2012     James      NaN           NaN 
  101     07/2012     James      NY            NaN 
  102     06/2013     Adams      NC            NC 
  102     05/2014     Adams      AL            NC

Solution

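groupby's 'first' returns the first non-null value in each group, which is why the NaN rows are skipped. If you want to keep the first entry even when it is NaN, you can do it with drop_duplicates: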

first_dict = df.drop_duplicates('ID')[['ID','team']].set_index('ID')['team']
df['first_team'] = df['ID'].map(first_dict)

Output:

    ID     date   name team first_team
0  101  05/2012  James  NaN        NaN
1  101  07/2012  James   NY        NaN
2  102  06/2013  Adams   NC         NC
3  102  05/2014  Adams   AL         NC
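
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample in the question, and the names skips_nan and keeps_nan are only illustrative; it compares the two lookups side by side:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (first_team omitted, it is recomputed below)
df = pd.DataFrame({
    "ID": [101, 101, 102, 102],
    "date": ["05/2012", "07/2012", "06/2013", "05/2014"],
    "name": ["James", "James", "Adams", "Adams"],
    "team": [np.nan, "NY", "NC", "AL"],
})

# 'first' returns the first non-null value per group, so ID 101 maps to "NY"
skips_nan = df.groupby("ID")["team"].first()

# drop_duplicates keeps the first row per ID regardless of its content,
# so ID 101 maps to NaN
keeps_nan = df.drop_duplicates("ID").set_index("ID")["team"]

df["first_team"] = df["ID"].map(keeps_nan)
print(df)
```

Mapping with a Series works the same way as mapping with a dict here, so the to_dict() step is not needed.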

Note: for future reference, your original code (which skips NaN) can be written more simply with transform:

df['first_team'] = df.groupby('ID')['team'].transform('first')
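
If you prefer to stay with transform but still want NaN preserved, one possible variant (a sketch, not the only way) is to take the value at position 0 of each group with a lambda instead of using 'first'. The column names first_non_null and first_row below are made up for the comparison:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": [101, 101, 102, 102],
    "team": [np.nan, "NY", "NC", "AL"],
})

# 'first' skips NaN: both rows of ID 101 get "NY"
df["first_non_null"] = df.groupby("ID")["team"].transform("first")

# Positional lookup keeps whatever is in the first row: ID 101 gets NaN
df["first_row"] = df.groupby("ID")["team"].transform(lambda s: s.iloc[0])
print(df)
```

The lambda version is likely slower on large frames than the drop_duplicates approach, since it calls Python code once per group.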