I would like to number the rows in my data frame (and add this as a column), where the counting starts at 1 again for every distinct Number. I have tried using df['Row number'] = np.arange(len(df)), but this gives a continuous numbering of the rows.
Example of the data frame I have:
Number Value
1234 a
1234 b
1234 x
5678 t
5678 y
5678 p
Example of the data frame I want:
Number Value Row number
1234 a 1
1234 b 2
1234 x 3
5678 t 1
5678 y 2
5678 p 3
Does anyone know how I can do this or what function I should use? Thanks!
I believe you're looking for groupby and cumcount(), with a +1 because the default is to start from 0:
df['Row number'] = df.groupby('Number').cumcount() + 1
print(df)
Number Value Row number
0 1234 a 1
1 1234 b 2
2 1234 x 3
3 5678 t 1
4 5678 y 2
5 5678 p 3
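For anyone who wants to try this without an existing data frame, here is a minimal self-contained sketch. It rebuilds the example data from the question by hand. The second frame, named interleaved, is a hypothetical addition that is not in the original post; it is only there to show that cumcount() counts rows in order of appearance within each group, so the groups should not need to be contiguous or sorted.

```python
import pandas as pd

# Rebuild the example data from the question.
df = pd.DataFrame({
    "Number": [1234, 1234, 1234, 5678, 5678, 5678],
    "Value": ["a", "b", "x", "t", "y", "p"],
})

# cumcount() numbers the rows of each group starting at 0, so add 1.
df["Row number"] = df.groupby("Number").cumcount() + 1
print(df)

# Hypothetical extra case: the same Number values, but interleaved.
interleaved = pd.DataFrame({
    "Number": [1234, 5678, 1234, 5678],
    "Value": ["a", "t", "b", "y"],
})
interleaved["Row number"] = interleaved.groupby("Number").cumcount() + 1
print(interleaved)
```

In the interleaved case the counter for each Number continues where it left off, regardless of the rows in between. If the numbering should follow some other order (for example by Value), sorting the frame first with sort_values is one option.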