I have a data frame like this:
Category Date_1 Score_1 Date_2 Score_2
A 13/11/2019 5 13/11/2019 10
A 13/11/2019 5 14/11/2019 55
A 13/11/2019 5 15/11/2019 45
A 13/11/2019 5 16/11/2019 80
A 14/11/2019 3 13/11/2019 10
A 14/11/2019 3 14/11/2019 55
A 14/11/2019 3 15/11/2019 45
A 14/11/2019 3 16/11/2019 80
A 15/11/2019 7 13/11/2019 10
A 15/11/2019 7 14/11/2019 55
A 15/11/2019 7 15/11/2019 45
A 15/11/2019 7 16/11/2019 80
B 13/11/2019 4 13/11/2019 18
B 13/11/2019 4 14/11/2019 65
B 13/11/2019 4 15/11/2019 75
B 13/11/2019 4 16/11/2019 89
B 14/11/2019 9 13/11/2019 18
B 14/11/2019 9 14/11/2019 65
B 14/11/2019 9 15/11/2019 75
B 14/11/2019 9 16/11/2019 89
B 15/11/2019 8 13/11/2019 18
B 15/11/2019 8 14/11/2019 65
B 15/11/2019 8 15/11/2019 75
B 15/11/2019 8 16/11/2019 89
I want to keep only the rows where both dates are the same. I tried this:
df.drop_duplicates(subset=['Date_1', 'Date_2'])
But it does not work. How can I drop the extra rows?
Use boolean indexing, comparing the two columns:
df1 = df[df['Date_1'] == df['Date_2']]
Or DataFrame.query:
df1 = df.query("Date_1 == Date_2")
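To check that both approaches give the same result, here is a minimal sketch on five rows taken from the question's data. It assumes both date columns are stored as strings in the same format; if one column is a datetime and the other a string, they would need converting to a common dtype before comparing.

```python
import pandas as pd

# Five rows copied from the question's data (dates kept as strings)
df = pd.DataFrame({
    "Category": ["A", "A", "A", "B", "B"],
    "Date_1": ["13/11/2019", "13/11/2019", "14/11/2019", "13/11/2019", "15/11/2019"],
    "Score_1": [5, 5, 3, 4, 8],
    "Date_2": ["13/11/2019", "14/11/2019", "14/11/2019", "16/11/2019", "15/11/2019"],
    "Score_2": [10, 55, 55, 89, 75],
})

# Boolean mask: True where the two dates in a row are equal
mask = df["Date_1"] == df["Date_2"]
df1 = df[mask]

# Same filter expressed with DataFrame.query
df2 = df.query("Date_1 == Date_2")

print(df1)
```

The drop_duplicates call from the question does something different: it removes rows whose (Date_1, Date_2) pair already appeared in an earlier row, and it never compares the two columns with each other, so rows with different dates are kept.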