I have two DataFrames with the following layouts:
QUOTES DataFrame
DATE PRICE SMA
2008-06-25 107.505122 106.480321
2008-06-26 107.138449 103.531552
2008-06-27 106.737588 102.608466
2008-06-30 106.360232 101.309296
2008-07-01 105.993987 101.941783
2008-07-02 105.632870 101.796483
SIGNALS DataFrame
DATE LONG ENTRY LONG EXIT
2008-06-25 0 0
2008-06-26 0 1
2008-06-27 1 0
2008-06-30 0 1
2008-07-01 0 1
2008-07-02 1 0
The first DataFrame is stock prices and moving averages. The second DataFrame contains signals for when to purchase a stock (entry) and when to sell (exit).
The entry part is already ok. But i'm having problems with the exit part.
It would not make sense to have an exit signals on the 2008-06-26
because no stock has been purchased yet. And it would not make sense to have an exit signal on both 2008-06-30
and 2008-07-01
because we can't sell the same stocks twice.
So I have a way to generate signals in the LONG EXIT
column, but I need to filter them by looking backwards from each date to figure out if the is a LONG ENTRY = 1
earlier and no LONG EXIT = 1
between the LONG ENTRY = 1
and the DATE
I'm looking at.
The DataFrame that I need looks like this, but how can I do that with Pandas?
SIGNALS DataFrame
DATE LONG ENTRY LONG EXIT
2008-06-25 0 0
2008-06-26 0 0
2008-06-27 1 0
2008-06-30 0 1
2008-07-01 0 0
2008-07-02 1 0
Here's a sketch of how you could track the balance of your entry/exit signals so you only signal exit when there's a previous entry not yet canceled by subsequent exit:
Starting with:
date entry exit
0 2008-06-25 0 0
1 2008-06-26 0 1
2 2008-06-27 1 0
3 2008-06-30 0 1
4 2008-07-01 0 1
5 2008-07-02 1 0
Add a column
for your new, filtered signals:
df['filtered_exit'] = np.nan
Iterate through the DataFrame
, calculate the prior balance (this assumes you'd not be signaling entry and exit for the same day), and filter accordingly:
for i, data in df.iterrows():
holding_status = (df.loc[:(i-1), 'entry'] - df.loc[:(i-1), 'filtered_exit']).sum()
if data.exit == 0 or i == 0:
df.loc[i, 'filtered_exit'] = 0
elif data.exit == 1:
if holding_status == 0:
df.loc[i, 'filtered_exit'] = 0
elif holding_status == 1:
df.loc[i, 'filtered_exit'] = 1
to get the desired result given the cases you mentioned:
date entry exit filtered_exit
0 2008-06-25 0 0 0
1 2008-06-26 0 1 0
2 2008-06-27 1 0 0
3 2008-06-30 0 1 1
4 2008-07-01 0 1 0
5 2008-07-02 1 0 0
Hope this helps.