Search code examples
pythonpython-3.xpandasquantitative-finance

How do you index and use current and previous column values to calculate the next column value in a pandas.apply function?


I'm trying to build a function that calculates trailing stop values using Close and ATR values from a pandas dataframe.

The dataframe for reference is as follows:

            High    Low     Open    Close       ATR 
Date                                
2020-06-01  5.88    5.67    5.73    5.87        0.210000    
2020-06-02  6.00    5.83    5.96    5.90        0.207143    
2020-06-03  6.27    5.92    5.99    6.19        0.218776    
2020-06-04  6.58    6.12    6.20    6.57        0.236006    
2020-06-05  7.50    7.02    7.24    7.34        0.285577    
2020-06-08  7.74    7.37    7.53    7.53        0.293750    
2020-06-09  7.44    7.05    7.22    7.24        0.307053    
2020-06-10  7.34    6.77    7.33    6.81        0.325835    
2020-06-11  6.46    6.04    6.07    6.13        0.357561

What I want it to look like:


            High    Low     Open    Close       ATR         ATR_TS
Date                                
2020-06-01  5.88    5.67    5.73    5.87        0.210000    5.135000
2020-06-02  6.00    5.83    5.96    5.90        0.207143    5.175000
2020-06-03  6.27    5.92    5.99    6.19        0.218776    5.424286
2020-06-04  6.58    6.12    6.20    6.57        0.236006    5.743980
2020-06-05  7.50    7.02    7.24    7.34        0.285577    6.340481
2020-06-08  7.74    7.37    7.53    7.53        0.293750    6.501876
2020-06-09  7.44    7.05    7.22    7.24        0.307053    6.501876
2020-06-10  7.34    6.77    7.33    6.81        0.325835    6.501876
2020-06-11  6.46    6.04    6.07    6.13        0.357561    7.381464

My pseudo-function/logic at the moment looks like this:

def atr_ts(close, atr):
    bigatr = atr*3.5
    buysell = 1 
    stop[i-1] = 0  
    
    if buysell > 0:
        stop = close - bigatr
        stop = max(stop, stop[i-1])
        if close < stop:
            stop = close + bigatr
            buysell = -1

    elif buysell < 0:
        stop = close + bigatr
        stop = min(stop, stop[i-1])
        if close > stop:
            stop = close - bigatr
            buysell = 1
        
            
    return stop

df['ATR_TS'] = df.apply(lambda col: atr_ts(col['Close'], col['ATR']), axis = 1)

So my question is, how do I index the previous stop (ATR_TS) values that are calculated by this function in order to calculate the next stop value, with the first stop value being 0? If anyone sees a better solution to this problem without using pandas.apply, do also share.

I'm new to programming in general so my apologies if clarity is lacking.

Thank you so much.


Solution

  • This won't be the final solution, as I am confused as to why the last value of ATR_TS = 7.381464, although I know how you calculated the value. I created a number of columns to visualize the "pandonic" way of doing some of the calculations using .shift() and .cumsum() for row-wise comparisons. Please take a look at the columns and my screenshot and explain how to get the last value, but in the final solution there is absolutely no need for all of these columns:

    df['bigatr'] = (df['ATR'] * 3.5)
    df['Stop1a'] = df['Close'] - (df['ATR'] * 3.5)
    df['Stop2a'] = df.shift()['Close'] - (df.shift()['ATR'] * 3.5)
    df['Stop3a'] = df[['Stop1a','Stop2a']].max(axis=1)
    df['Stop1b'] = df['Close'] + (df['ATR'] * 3.5)
    df['Stop2b'] = df.shift()['Close'] + (df.shift()['ATR'] * 3.5)
    df['Stop3b'] = df[['Stop1b','Stop2b']].min(axis=1)
    df['cuma'] = (df['Stop1a'] > df.shift()['Stop1a']).cumsum()
    df['cumb'] = (df['Stop1b'] < df.shift()['Stop1b']).cumsum()
    df['ATR_TSa'] = df.groupby((df['Stop1a'] > df.shift()['Stop1a']).cumsum())['Stop1a'].transform('first')
    df['ATR_TSb'] = df.groupby((df['Stop1b'] < df.shift()['Stop1b']).cumsum())['Stop1b'].transform('first')
    df
    

    enter image description here

    As you can see, the final solution is all of the ['ATR_TSa'] values circled in red and the df['ATR_TSb'] value circled in blue in the last row.


    EDIT #1 - Per OP's comment, the final logic to solve the above is to add:

     df['ATR_TS'] = np.where((df['Close'] < df['ATR_TSa']), df['ATR_TSb'], df['ATR_TSa'])
    

    Now, below here, I will provide a more concise solution:

    df['Stop1a'] = df['Close'] - (df['ATR'] * 3.5)
    df['Stop1b'] = df['Close'] + (df['ATR'] * 3.5)
    a = df.groupby((df['Stop1a'] > df.shift()['Stop1a']).cumsum())['Stop1a'].transform('first')
    b = df.groupby((df['Stop1b'] <= df.shift()['Stop1b']).cumsum())['Stop1b'].transform('first')
    df['ATR_TS'] = np.where((df['Close'] < a), b, a)
    df = df.drop(['Stop1a','Stop1b'], axis=1)
    df
    
    
    Out[1]: 
             Date  High   Low  Open  Close       ATR    ATR_TS
    0  2020-06-01  5.88  5.67  5.73   5.87  0.210000  5.135000
    1  2020-06-02  6.00  5.83  5.96   5.90  0.207143  5.175000
    2  2020-06-03  6.27  5.92  5.99   6.19  0.218776  5.424284
    3  2020-06-04  6.58  6.12  6.20   6.57  0.236006  5.743979
    4  2020-06-05  7.50  7.02  7.24   7.34  0.285577  6.340480
    5  2020-06-08  7.74  7.37  7.53   7.53  0.293750  6.501875
    6  2020-06-09  7.44  7.05  7.22   7.24  0.307053  6.501875
    7  2020-06-10  7.34  6.77  7.33   6.81  0.325835  6.501875
    8  2020-06-11  6.46  6.04  6.07   6.13  0.357561  7.381463