Search code examples
pandasisin

Pandas: check a sequence in one column for each unique value in another column


I have a table that looks like this:

Date Unique id Indicator
2018 1 1
2019 1 0
2020 1 0
2020 2 1
2018 2 0
2019 2 1
2020 2 1
2021 2 1

For each value in "Unique id" I want to check whether "Indicator" match a special sequence of values (say, [1,1,1])

I have tried using df.groupby(['Unique id'])['Indicator].isin([1,1,1]), but the groupby method has no isin method.

The desired output would be:

Unique id Sequence match
1 False
2 True

Solution

  • You can do rolling().apply() like this:

    def check_pattern(s, pattern):
        return s.eq(pattern).all()
    
    (df.groupby('Unique id', group_keys=False)['Indicator'].rolling(3)
       .apply(check_pattern, kwargs={'pattern':(1,1,1)})
       .groupby('Unique id').max().eq(1)
    )
    

    Output:

    Unique id
    1    False
    2     True
    Name: Indicator, dtype: bool