I have a dataset with supplier and its invoices and I need to determinate which of the invoices are consecutives marking it with a 1 or a 0.
For example:
df1 = pd.DataFrame()
df1['Supplier'] = ['ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC','ABC']
df1['Invoice'] = ['101','102','105','109','110','114','120','121','122','130','131']
Supplier Invoice
0 ABC 101
1 ABC 102
2 ABC 105
3 ABC 109
4 ABC 110
5 ABC 114
6 ABC 120
7 ABC 121
8 ABC 122
9 ABC 130
10 ABC 131
And what I want is a third column like this:
Supplier Invoice Consecutive
0 ABC 101 0
1 ABC 102 1
2 ABC 105 0
3 ABC 109 0
4 ABC 110 1
5 ABC 114 0
6 ABC 120 0
7 ABC 121 1
8 ABC 122 1
9 ABC 130 0
10 ABC 131 1
EDIT
Thanks for your answers, this options works great, but when I tried it in a real database I realized that I need to compare the suppliers, if the next row is a new supplier the consecutive must be 0.
For example:
Supplier Invoice Consecutive
0 ABC 101 0
1 ABC 102 1
2 ABC 105 0
3 ABC 109 0
4 ABC 110 1
5 ABC 114 0
6 ABC 120 0
7 ABC 121 1
8 ABC 122 1
9 ABC 130 0
10 ABC 131 1
11 DEF 132 0
12 DEF 133 1
13 GHI 134 0
Thanks in advance!
using npwhere and diff
# take difference b/w consecutive rows using diff
# update to 0 when difference is not 1
df1['Consecutive']=(df1.groupby('Supplier')['Invoice']
.transform(lambda x: np.where(
x.astype(int).diff() !=1, 0, 1)) )
df1
df1
Supplier Invoice Consecutive
0 ABC 101 0
1 ABC 102 1
2 ABC 105 0
3 ABC 109 0
4 ABC 110 1
5 ABC 114 0
6 ABC 120 0
7 ABC 121 1
8 ABC 122 1
9 ABC 130 0
10 ABC 131 1
11 DEF 132 0
12 DEF 133 1
13 GHI 134 0