python pandas dataframe row-number

How to add a row number series per unique column value in a pandas dataframe?


I would like to number the rows in my data frame (and add this as a column), where the counting starts at 1 again for every distinct Number. I have tried using df['Row number'] = np.arange(len(df)), but this gives a continuous numbering of the rows.

Example of the data frame I have:

Number Value
1234   a
1234   b
1234   x
5678   t
5678   y
5678   p

Example of the data frame I want:

Number Value   Row number
1234   a       1
1234   b       2
1234   x       3
5678   t       1
5678   y       2
5678   p       3

Does anyone know how I can do this or what function I should use? Thanks!


Solution

  • I believe you're looking for groupby together with cumcount(). Add 1 to the result, since cumcount() starts counting from 0:

    df['Row number'] = df.groupby('Number').cumcount() + 1
    
    print(df)
    
       Number Value  Row number
    0    1234     a           1
    1    1234     b           2
    2    1234     x           3
    3    5678     t           1
    4    5678     y           2
    5    5678     p           3
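
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The data frame is rebuilt from the sample data in the question; the second frame, named mixed, is a hypothetical addition used only to illustrate that the grouped rows do not need to be adjacent.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Number": [1234, 1234, 1234, 5678, 5678, 5678],
    "Value": ["a", "b", "x", "t", "y", "p"],
})

# cumcount() numbers the rows within each group starting at 0, so add 1.
df["Row number"] = df.groupby("Number").cumcount() + 1
print(df)

# Hypothetical extra frame (not from the question): the groups are interleaved.
# The count follows the order in which rows appear within each group.
mixed = pd.DataFrame({"Number": [1234, 5678, 1234, 5678]})
mixed["Row number"] = mixed.groupby("Number").cumcount() + 1
print(mixed)
```

The np.arange(len(df)) attempt from the question numbers all rows in one sequence because it never looks at the Number column; grouping first is what makes the count restart for each value.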