How to create new column in Data Frame to sort values for each id based on date column in Python Pandas?

I have Data Frame in Python Pandas like below:

Column my_date is in datetime format.
In my real DataFrame I have many more columns.

id my_date col1

111 2023-05-15 1

111 2023-05-14 11

111 2023-05-13 2

222 2023-10-11 3

222 2023-10-12 55

id	my_date	col1
111	2023-05-15	1
111	2023-05-14	11
111	2023-05-13	2
222	2023-10-11	3
222	2023-10-12	55

And I need to create new column col_x where for each id will be value 1,2,3 and so on based on date in column my_date. So, for each id in column col_x has to be 1 in the early date from my_date and so on.

So, as a result I need something like below:

id	my_date	col1	col_x
111	2023-05-15	1	3
111	2023-05-14	11	2
111	2023-05-13	2	1
222	2023-10-11	3	1
222	2023-12-12	55	2

How can I do that in Python Pandas ?

Solution

Looks like you could use a groupby.rank:

df['col_x'] = df.groupby('id')['my_date'].rank('dense').astype(int)

Which is more or less equivalent to finding the sorting order (e.g. using numpy.argsort, or rank('first')), except it will rank the duplicate values equally:

df['col_x'] = df.groupby('id')['my_date'].transform(np.argsort).add(1)

Output:

    id    my_date  col1  col_x
0  111 2023-05-15     1      3
1  111 2023-05-14    11      2
2  111 2023-05-13     2      1
3  222 2023-10-11     3      1
4  222 2023-10-12    55      2

Example with duplicate values:

    id    my_date  col1  rank('dense')  argsort+1  rank('first')
0  111 2023-05-15     1              3          4              4
1  111 2023-05-14    11              2          2              2
2  111 2023-05-14     9              2          3              3
3  111 2023-05-13     2              1          1              1
4  222 2023-10-11     3              1          1              1
5  222 2023-10-12    55              2          2              2