Search code examples
pythonpandasgroup-bytime-seriesmulti-index

Replace specific values using a combination of other ones in a pandas time-series


I have a dataframe like:

   date region code name
0  1      a      1    x
1  2      a      1    y
2  1      b      1    y
3  2      b      1    w
4  1      c      1    y
5  2      c      1    y
6  1      a      2    z
7  2      a      2    z

Each code should have a unique name, however it can change over time. I want to select the last name for each region and code combination. It is important to notice that some codes only appear in certain regions.

   date region code name
0  1      a      1    y   (changed)
1  2      a      1    y
2  1      b      1    w   (changed)
3  2      b      1    w
4  1      c      1    y 
5  2      c      1    y   
6  1      a      2    z
7  2      a      2    z

I tried doing it using some for loops with no success.


Solution

  • Use GroupBy.transform with GroupBy.last:

    #if necessary sorting
    #df = df.sort_values(['region','code','date'])
    
    df['name'] = df.groupby(['region','code'])['name'].transform('last')
    print (df)
       date region  code name
    0     1      a     1    y
    1     2      a     1    y
    2     1      b     1    w
    3     2      b     1    w
    4     1      c     1    y
    5     2      c     1    y
    6     1      a     2    z
    7     2      a     2    z