I have the data frame as follows:
df = pd.DataFrame({
'ID': [12, 12, 15, 15, 16, 17, 17],
'Name': ['A', 'A', 'B', 'B', 'C', 'D', 'D'],
'Date':['2019-12-20' ,'2018-12-20' ,'2017-12-20' , '2016-12-20', '2015-12-20', '2014-12-20', '2013-12-20'],
'Color':['Black', 'Blue', 'Red' , 'Yellow' , 'White' , 'Sky' , 'Green']
})
or data table:
ID Name Date Color
0 12 A 2019-12-20 Black
1 12 A 2018-12-20 Blue
2 15 B 2017-12-20 Red
3 15 B 2016-12-20 Yellow
4 16 C 2015-12-20 White
5 17 D 2014-12-20 Sky
6 17 D 2013-12-20 Green
My desired result would be as below table. How could I get that?
ID Name Date Color Date_ Color_
0 12 A 2019-12-20 Black 2018-12-20 Blue
1 15 B 2017-12-20 Red 2016-12-20 Yellow
2 16 C 2015-12-20 White 2015-12-20 White
3 17 D 2014-12-20 Sky 2013-12-20 Green
I need your help, thanks in advance!
Try with this approach:
result = (
df.merge(df, on=['ID', 'Name'])
.drop_duplicates(['ID', 'Name', 'Date_x', 'Color_x'], keep='last')
.drop_duplicates(['ID', 'Name', 'Date_y', 'Color_y'])
.rename(columns={'Date_x': 'Date',
'Color_x': 'Color',
'Date_y': 'Date_',
'Color_y': 'Color_'})
)
This is the result:
ID Name Date Color Date_ Color_
1 12 A 2019-12-20 Black 2018-12-20 Blue
5 15 B 2017-12-20 Red 2016-12-20 Yellow
8 16 C 2015-12-20 White 2015-12-20 White
10 17 D 2014-12-20 Sky 2013-12-20 Green