Counting number of unique values in groups

I have data where for multiple years, observations i are categorized in cat. An observation i can be in multiple categories in any year, but is unique across years. I am trying to count unique values for i by year, by cat, and by year and cat.

I'm learning Python (v3.12) & Pandas (v2.2.1). I can make this work, but only by creating separate tables for the counts, and merging them back in with the main data. See the example below. I suspect there is a better way to do this. Is there, and, if so, how?

import pandas as pd
df = pd.DataFrame(
        {'year': [2020,2020,2020,2021,2021,2022,2023,2023,2023,2023],
        'cat': [1,1,2,2,3,3,1,2,3,4],
        'i': ['a','a','b','c','d','e','f','f','g','g']
        })
df

df_cat = df.groupby('cat')['i'].nunique()
df_year = df.groupby('year')['i'].nunique()
df_catyear = df.groupby(['cat', 'year'])['i'].nunique()

df_merged = df.merge(df_cat, how='left', on='cat').rename(columns={'i_x': 'i', 'i_y': 'n_by_cat'})
df_merged = df_merged.merge(df_year, how='left', on='year').rename(columns={'i_x': 'i', 'i_y': 'n_by_year'})
df_merged = df_merged.merge(df_catyear, how='left', on=['cat', 'year']).rename(columns={'i_x': 'i', 'i_y': 'n_by_catyear'})

Solution

You could use a simple loop and groupby.transform:

groups = ['cat', 'year', ['cat', 'year']]

for g in groups:
    df[f"n_by_{''.join(g)}"] = df.groupby(g)['i'].transform('nunique')

Output:

   year  cat  i  n_by_cat  n_by_year  n_by_catyear
0  2020    1  a         2          2             1
1  2020    1  a         2          2             1
2  2020    2  b         3          2             1
3  2021    2  c         3          2             1
4  2021    3  d         3          2             1
5  2022    3  e         3          1             1
6  2023    1  f         2          2             1
7  2023    2  f         3          2             1
8  2023    3  g         3          2             1
9  2023    4  g         1          2             1