python pandas aggregate aggregate-functions

groupby with nth() and agg

I have a dataframe and would like to groupby ìndex1 and get new columns for the first, 2nd, 3rd, fourth and last values of column A (if they exist). So I would like to use agg so I can aggregate all at once. But somehow that does not work with nth. What am I doing wrong?

df = pd.DataFrame({'index1':[1,2,2,2,3,3], 'A':[1,1,2,3,1,4]})
df.groupby(['index1']).agg({'A': ['first',
                                 lambda x: x.nth(1),
                                 lambda x: x.nth(2),
                                 lambda x: x.nth(3),
                                 'last']
}).reset_index()

so given this dataframe

  index1  A
0       1     1
1       2     1
2       2     2
3       2     3
4       3     1
5       3     4

I would like to get is this

   index1  first second thrid fourth  last
0       1      1                         1
1       2      1      2     3            3
2       3      1      4                  4

Solution

You can combine pivot and groupby.last:

g = df.groupby('index1')

(df.assign(col=g.cumcount().add(1))
   .pivot(index='index1', columns='col', values='A')
   .assign(last=g.last())
   .reset_index()
)

output:

col  index1    1    2    3  last
0         1  1.0  NaN  NaN     1
1         2  1.0  2.0  3.0     3
2         3  1.0  4.0  NaN     4

For named columns:

names = {1: 'first', 2: 'second', 3: 'third', 4: 'fourth'}

g = df.groupby('index1')

out = (df
   .assign(col=g.cumcount().add(1))
   .pivot(index='index1', columns='col', values='A')
   .reindex(columns=list(names)).rename(columns=names)
   .assign(last=g.last())
   .reset_index()
)

output:

col  index1  first  second  third  fourth  last
0         1      1     NaN    NaN     NaN     1
1         2      1     2.0    3.0     NaN     3
2         3      1     4.0    NaN     NaN     4