I have df as below. If Column Buy Exit Date
is having Date (YYYY/MM/DD) matching with Column Buy Date
having Date (YYYY/MM/DD) in next row, then drop the Buy Date
row from its matching condition.
import pandas as pd
data = {
'Buy Date': ['2022/11/07 15:00:00', '2022/11/11 12:00:00', '2022/11/24 15:00:00', '2022/12/01 12:00:00', '2022/12/14 09:00:00'],
'Buy Exit Date': ['2022/11/11 09:00:00', '2022/11/24 12:00:00', '2022/12/01 09:00:00', '2022/12/06 09:00:00', '2022/12/16 09:00:00'],
'ln_entry': [3232.899902, 3315.000000, 3381.699951, 3476.149902, 3364.699951]
}
df = pd.DataFrame(data)
df:
Buy Date Buy Exit Date ln_entry
0 2022/11/07 15:00:00 2022/11/11 09:00:00 3232.899902
1 2022/11/11 12:00:00 2022/11/24 12:00:00 3315.000000
2 2022/11/24 15:00:00 2022/12/01 09:00:00 3381.699951
3 2022/12/01 12:00:00 2022/12/06 09:00:00 3476.149902
4 2022/12/14 09:00:00 2022/12/16 09:00:00 3364.699951
Expected Output:
Buy Date Buy Exit Date ln_entry
0 2022/11/07 15:00:00 2022/11/11 09:00:00 3232.899902
1 2022/12/14 09:00:00 2022/12/16 09:00:00 3364.699951
My Code:
df['Buy Date'] = pd.to_datetime(df['Buy Date'])
df['Buy Exit Date'] = pd.to_datetime(df['Buy Exit Date'])
df['Buy Date Only'] = df['Buy Date'].dt.date
df['Buy Exit Date Only'] = df['Buy Exit Date'].dt.date
df['Next_Buy_Date_Only'] = df['Buy Date Only'].shift(-1)
to_remove = df['Buy Exit Date Only'] == df['Next_Buy_Date_Only']
df_cleaned = df[~to_remove]
df_cleaned = df_cleaned.drop(columns=['Buy Date Only', 'Buy Exit Date Only', 'Next_Buy_Date_Only'])
print(df_cleaned)
My Output:
Buy Date Buy Exit Date ln_entry
3 2022-12-01 12:00:00 2022-12-06 09:00:00 3476.149902
4 2022-12-14 09:00:00 2022-12-16 09:00:00 3364.699951
With your current code you are shifting 'Buy Date'
one row back, then checking for equality with 'Buy Exit Date'
and keeping rows where condition is False
. I.e.:
df[['Buy Exit Date Only', 'Next_Buy_Date_Only']]
Buy Exit Date Only Next_Buy_Date_Only
0 2022-11-11 2022-11-11 # equal: delete (via `~`)
1 2022-11-24 2022-11-24 # equal: delete
2 2022-12-01 2022-12-01 # equal: delete
3 2022-12-06 2022-12-14 # unequal: keep
4 2022-12-16 None # unequal: keep
But this is not the requirement. You want to delete rows where the 'Buy Exit Date'
from the previous row equals 'Buy Date'
. Try:
df['Buy Date'] = pd.to_datetime(df['Buy Date'])
df['Buy Exit Date'] = pd.to_datetime(df['Buy Exit Date'])
out = df[df['Buy Date'].dt.date != df['Buy Exit Date'].dt.date.shift(1)]
out
Output:
Buy Date Buy Exit Date ln_entry
0 2022-11-07 15:00:00 2022-11-11 09:00:00 3232.899902
4 2022-12-14 09:00:00 2022-12-16 09:00:00 3364.699951
The comparison being:
Buy Date Buy Exit Date
0 2022-11-07 None # unequal: keep
1 2022-11-11 2022-11-11 # equal: delete
2 2022-11-24 2022-11-24 # equal: delete
3 2022-12-01 2022-12-01 # equal: delete
4 2022-12-14 2022-12-06 # unequal: keep