I have a dataframe like:
date region code name
0 1 a 1 x
1 2 a 1 y
2 1 b 1 y
3 2 b 1 w
4 1 c 1 y
5 2 c 1 y
6 1 a 2 z
7 2 a 2 z
Each code should have a unique name, however it can change over time. I want to select the last name for each region and code combination. It is important to notice that some codes only appear in certain regions.
date region code name
0 1 a 1 y (changed)
1 2 a 1 y
2 1 b 1 w (changed)
3 2 b 1 w
4 1 c 1 y
5 2 c 1 y
6 1 a 2 z
7 2 a 2 z
I tried doing it using some for loops with no success.
Use GroupBy.transform
with GroupBy.last
:
#if necessary sorting
#df = df.sort_values(['region','code','date'])
df['name'] = df.groupby(['region','code'])['name'].transform('last')
print (df)
date region code name
0 1 a 1 y
1 2 a 1 y
2 1 b 1 w
3 2 b 1 w
4 1 c 1 y
5 2 c 1 y
6 1 a 2 z
7 2 a 2 z