Given a dataframe like this:
df = pd.DataFrame({
'player': ['James', 'Henry', 'Abe', 'James', 'Henry', 'Brian', 'Claude', 'James'],
'team': ['A', 'B', 'A', 'B', 'A', 'A', 'A', 'C'],
'game': [101, 102, 103, 101, 102, 104, 105, 101],
'score': [10, 11, 12, 10, 11, 10, 12, 10]
})
I can create a pivot_table that summarizes by player and team, with game as the column and summing the score:
pd.pivot_table(df, index = ['player', 'team'], columns = 'game', values = 'score', aggfunc = 'sum', fill_value = '')
This returns a table like this:
What I want to do, though, is to count the number of unique teams each player played on, so that Abe, Brian and Claude all have 1, Henry has 2, and James has 3.
I've tried nunique and count_values but they seem only to work with numbers, not strings. I've seen questions that address the problem of counting unique values in the 'values' part of the pivot table, but not in the index part.
Edit -- here is the desired output, note that the game scores are still summed:
Update my answer
Code
out = (df
.assign(unique_teams=df.groupby('player')['team'].transform('nunique'))
.pivot_table('score', index = ['player', 'unique_teams'],
columns = 'game', aggfunc = 'sum', fill_value = '')
.reset_index().rename_axis('', axis=1)
)
out
player unique_teams 101 102 103 104 105
0 Abe 1 12.0
1 Brian 1 10.0
2 Claude 1 12.0
3 Henry 2 22.0
4 James 3 30.0