Search code examples
pythonpandasinvoicesupplier

Determinate Consecutive Values (Invoices) Pandas


I have a dataset with supplier and its invoices and I need to determinate which of the invoices are consecutives marking it with a 1 or a 0.

For example:

df1 = pd.DataFrame()
df1['Supplier'] = ['ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC']
df1['Invoice'] = ['101','102','105','109','110','114','120','121','122','130','131']
   Supplier Invoice
0       ABC     101
1       ABC     102
2       ABC     105
3       ABC     109
4       ABC     110
5       ABC     114
6       ABC     120
7       ABC     121
8       ABC     122
9       ABC     130
10      ABC     131

And what I want is a third column like this:

   Supplier Invoice Consecutive
0       ABC     101           0
1       ABC     102           1
2       ABC     105           0
3       ABC     109           0
4       ABC     110           1
5       ABC     114           0
6       ABC     120           0
7       ABC     121           1
8       ABC     122           1
9       ABC     130           0
10      ABC     131           1

EDIT

Thanks for your answers, this options works great, but when I tried it in a real database I realized that I need to compare the suppliers, if the next row is a new supplier the consecutive must be 0.

For example:

   Supplier Invoice Consecutive
0       ABC     101           0
1       ABC     102           1
2       ABC     105           0
3       ABC     109           0
4       ABC     110           1
5       ABC     114           0
6       ABC     120           0
7       ABC     121           1
8       ABC     122           1
9       ABC     130           0
10      ABC     131           1
11      DEF     132           0
12      DEF     133           1
13      GHI     134           0

Thanks in advance!


Solution

  • using npwhere and diff

    # take difference b/w consecutive rows using diff
    # update to 0 when difference is not 1
    
    df1['Consecutive']=(df1.groupby('Supplier')['Invoice']
                        .transform(lambda x: np.where(
                                               x.astype(int).diff() !=1, 0, 1)) )
    df1
    df1
    
        Supplier    Invoice     Consecutive
    0   ABC     101     0
    1   ABC     102     1
    2   ABC     105     0
    3   ABC     109     0
    4   ABC     110     1
    5   ABC     114     0
    6   ABC     120     0
    7   ABC     121     1
    8   ABC     122     1
    9   ABC     130     0
    10  ABC     131     1
    11  DEF     132     0
    12  DEF     133     1
    13  GHI     134     0