Check whether each list in a pandas column is consecutive


I have a DataFrame column labelled a in which each row holds a list of int values. For each row, I want to return whether the values are consecutive.

It works when I pass in a single list, but I want to apply it to every row.

import pandas as pd

df = pd.DataFrame({'a': [[0,2], [9,11,12], [0,1,2], [10,11,13]]})

def cons(L):
    # True when every element equals the first element plus its position
    return all(n-i == L[0] for i,n in enumerate(L))

print(cons(df['a'][0])) # works, prints False

df['cons'] = df['a'].apply(cons, axis=1) # error

Intended output:

              a    cons
0        [0, 2]   False
1   [9, 11, 12]   False 
2     [0, 1, 2]    True
3  [10, 11, 13]   False

Solution

  • Use a list comprehension that tests whether the sorted values equal the full range generated from each list's minimum to its maximum:

    df['cons'] = [sorted(l) == list(range(min(l), max(l)+1)) for l in df['a']]
    # alternative: the same test with Series.apply
    df['cons'] = df['a'].apply(lambda l: sorted(l) == list(range(min(l), max(l)+1)))
    

    Another option is np.diff: test whether every difference between adjacent sorted values is 1:

    import numpy as np

    df['cons'] = [np.all(np.diff(sorted(l)) == 1) for l in df['a']]
    # alternative: the same test with Series.apply
    df['cons'] = df['a'].apply(lambda l: np.all(np.diff(sorted(l)) == 1))
    

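    If you want to keep your own function, drop axis=1. Series.apply already calls the function once per element, and extra keyword arguments such as axis are passed through to cons, which is what raises the error:

    def cons(L):
        return all(n-i==L[0] for i,n in enumerate(L))

    df['cons'] = df['a'].apply(cons)
    # alternative: the same test as a list comprehension
    df['cons'] = [all(n-i==L[0] for i,n in enumerate(L)) for L in df['a']]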
    

    print(df)
                  a   cons
    0        [0, 2]  False
    1   [9, 11, 12]  False
    2     [0, 1, 2]   True
    3  [10, 11, 13]  False
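
The three checks above agree on the sample data, but they are not equivalent in general. The sketch below puts them side by side. The function names cons_positional, cons_range and cons_diff, and the two extra rows ([2, 1, 0] and [5]), are made up here for illustration and are not part of the question.

```python
import numpy as np
import pandas as pd


def cons_positional(values):
    """Question's check: each element equals the first element plus its index."""
    return all(n - i == values[0] for i, n in enumerate(values))


def cons_range(values):
    """Compare the sorted values with the full range from min to max."""
    return sorted(values) == list(range(min(values), max(values) + 1))


def cons_diff(values):
    """Check that adjacent sorted values differ by exactly 1."""
    # bool() converts NumPy's boolean scalar to a plain Python bool
    return bool(np.all(np.diff(sorted(values)) == 1))


# The question's four rows plus two hypothetical rows:
# an unsorted run and a single-element list.
df = pd.DataFrame(
    {'a': [[0, 2], [9, 11, 12], [0, 1, 2], [10, 11, 13], [2, 1, 0], [5]]}
)

df['positional'] = df['a'].apply(cons_positional)
df['range'] = df['a'].apply(cons_range)
df['diff'] = df['a'].apply(cons_diff)

print(df)
```

For [2, 1, 0] the positional check returns False because it requires the list to already be in ascending order, while the two sorted-based checks return True. Which behaviour is right depends on whether order matters for your data. Also note that the min/max version raises a ValueError on an empty list, since min() cannot be called on an empty sequence, so guard for that if empty rows are possible.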