Search code examples
pythonpandastime-seriestimeserieschart

How to compare multiple rows from same column in DataFrame


I have a DataFrame of stock prices and I want to create a column of bool values in a separate DataFrame

The DataFrame with price data has a uses a DateTime index.

I am trying to check if the price of the stock has increased 3 days in a row, if so then I want the other DataFrame to show True on the third day.

Here are all of the things I have tried

df_signal['3DayIncrease'] = df_prices[symbol_name].shift(3) < df_prices[symbol_name].shift(2) < df_prices[symbol_name].shift(1) < df_prices[symbol_name]


df_signal['3DayIncrease'] = df_prices[symbol_name].shift(3) < df_prices[symbol_name].shift(2) & \
                            df_prices[symbol_name].shift(2) < df_prices[symbol_name].shift(1) & \
                            df_prices[symbol_name].shift(1) < df_prices[symbol_name]

df_signal['3DayIncrease'] = df_prices[symbol_name].shift(3) < df_prices[symbol_name].shift(2) and \
                            df_prices[symbol_name].shift(2) < df_prices[symbol_name].shift(1) and \
                            df_prices[symbol_name].shift(1) < df_prices[symbol_name]

I was grasping for straws on this one

df_signals['3DayIncrease'] = ((df_prices[symbol_name].shift(3) < df_prices[symbol_name].shift(2)) == True) and \
                             ((df_prices[symbol_name].shift(2) < df_prices[symbol_name].shift(1)) == True) and \
                             ((df_prices[symbol_name].shift(1) < df_prices[symbol_name]) == True)

If I compare just two rows at a time it works fine. For example

df_signal['3DayIncrease'] = df_prices[symbol_name].shift(3) < df_prices[symbol_name].shift(2)

works without any issues but obviously isn't doing what I want it to.


Solution

  • Your approach works. Just simplify your code a bit

    import numpy as np
    import pandas as pd
    s = pd.DataFrame({"p":[2, 3, 5, 6, 3, 4, 5, 6, 7, 4, 5, 6, 7, 8]})
    s['inc3days'] = ((s['p'].shift(3) < s['p'].shift(2)) &  
                     (s['p'].shift(2) < s['p'].shift(1)) &  
                     (s['p'].shift(1) < s['p']) )
    print s
    

    Here is what I got

        p inc3days
    0   2    False
    1   3    False
    2   5    False
    3   6     True
    4   3    False
    5   4    False
    6   5    False
    7   6     True
    8   7     True
    9   4    False
    10  5    False
    11  6    False
    12  7     True
    13  8     True