Search code examples
pythonpandascsvquantitative-financealgorithmic-trading

Manipulating a CSV File using Pandas: Identifying a value in a column and using data from other columns of the same row


Using Pandas, I have created two columns in addition to the OHLC columns, in a CSV file. This is how the data frame looks like:

              Date   Close      ...          Return  bullishSignal
Symbol                          ...                               
SPY     2010-01-01  111.44      ...             NaN          False
SPY     2010-01-04  113.33      ...             NaN          False
SPY     2010-01-25  109.77      ...        0.017582          False
SPY     2010-01-26  109.31      ...       -0.000915          False
SPY     2010-01-27  109.83      ...       -0.000546           True
SPY     2010-01-28  108.57      ...        0.006816          False
SPY     2010-01-29  107.39      ...        0.022721          False

My objective is to calculate the 2 day return from the date ( Day 0 ) that the bullish signal occurs: ( Close (Day 2) - Open (Day 1) ) / Open ( Day 1 ). Is there a way I can do this?


Solution

  • Move to pandas. Then you can transpose row-1 with the command .shift(1) and row+1 with the command .shift(-1).

       x1                   # Original df
    0   0
    1   1 
    2   2
    3   3
    4   4 
    
    df.x2 = df.x1.shift(1)  # Shift down
    
       x1  x2
    0   0 NaN
    1   1   0
    2   2   1
    3   3   2
    4   4   3
    
    
    df.x2 = df.x1.shift(-1)  # Shift up
    
       x1  x2
    0   0   1
    1   1   2
    2   2   3
    3   3   4
    4   4   NaN
    

    Applying shift(), you get all relevant values to one row and then just can apply column-wise operations:

    from __future__ import division
    df['Close_Next_Day']  = df['Close'].shift(-1)
    df = df[df.bullishSignal == True] # Filter to bullish 
    df['2dayReturn_bullish'] = (df['Close_Next_Day'] - df['Open']) / df['Open'])
    

    You can do the same by using pandas percentage change function.