Search code examples
pythonpandasfinance

Checking for subset in a column?


I'm trying to flag some price data as "stale" if the quoted price of the security hasn't changed over lets say 3 trading days. I'm currently trying it with:

firm["dev"] = np.std(firm["Price"],firm["Price"].shift(1),firm["Price"].shift(2))
firm["flag"] == np.where(firm["dev"] = 0, 1, 0)

But I'm getting nowhere with it. This is what my dataframe would look like.

Index Price Flag
1 10 0
2 11 0
3 12 0
4 12 0
5 12 1
6 11 0
7 13 0

Any help is appreciated!


Solution

  • If you are okay with other conditions, you can first check if series.diff equals 0 and take cumsum to check if you have a cumsum of 2 (n-1). Also check if the next row is equal to current, when both these conditions suffice, assign a flag of 1 else 0.

    n=3
    firm['Flag'] = (firm['Price'].diff().eq(0).cumsum().eq(n-1) & 
                    firm['Price'].eq(firm['Price'].shift())).astype(int)
    

    EDIT, to make it a generalized function with consecutive n, use this:

    def fun(df,col,n):
        c = df[col].diff().eq(0)
        return (c|c.shift(-1)).cumsum().ge(n) & df[col].eq(df[col].shift())
    

    firm['flag_2'] = fun(firm,'Price',2).astype(int)
    firm['flag_3'] = fun(firm,'Price',3).astype(int)
    
    print(firm)
    
           Price  Flag  flag_2  flag_3
    Index                             
    1         10     0       0       0
    2         11     0       0       0
    3         12     0       0       0
    4         12     0       1       0
    5         12     1       1       1
    6         11     0       0       0
    7         13     0       0       0