Search code examples
pythonpandaspivot-table

How do I count unique values in the index of a pandas pivot_table?


Given a dataframe like this:

df = pd.DataFrame({
    'player': ['James', 'Henry', 'Abe', 'James', 'Henry', 'Brian', 'Claude', 'James'],     
    'team': ['A', 'B', 'A', 'B', 'A', 'A', 'A', 'C'],
    'game': [101, 102, 103, 101, 102, 104, 105, 101],
    'score': [10, 11, 12, 10, 11, 10, 12, 10]
})

I can create a pivot_table that summarizes by player and team, with game as the column and summing the score:

pd.pivot_table(df, index = ['player', 'team'], columns = 'game', values = 'score', aggfunc = 'sum', fill_value = '')

This returns a table like this:

pivot_table output

What I want to do, though, is to count the number of unique teams each player played on, so that Abe, Brian and Claude all have 1, Henry has 2, and James has 3.

I've tried nunique and count_values but they seem only to work with numbers, not strings. I've seen questions that address the problem of counting unique values in the 'values' part of the pivot table, but not in the index part.

Edit -- here is the desired output, note that the game scores are still summed:

enter image description here


Solution

  • Update my answer

    Code

    out = (df
           .assign(unique_teams=df.groupby('player')['team'].transform('nunique'))
           .pivot_table('score', index = ['player', 'unique_teams'], 
                        columns = 'game', aggfunc = 'sum', fill_value = '')
           .reset_index().rename_axis('', axis=1)
    )
    

    out

       player  unique_teams   101   102   103   104   105
    0     Abe             1              12.0            
    1   Brian             1                    10.0      
    2  Claude             1                          12.0
    3   Henry             2        22.0                  
    4   James             3  30.0